CloudEvents SQL Reached V1!
Announcing version 1 of CloudEvents SQL!
Calum Murray
July 15, 2024
It’s my pleasure to announce that the CloudEvents community has released version 1.0 of the CloudEvents SQL (CESQL) specification, along with implementations in the SDKs for both Golang and Java. This is the result of 3 years of work in the CloudEvents community. It builds on the already released CloudEvents v1.0 specification and makes it easier for developers to generate and manage events flowing through their infrastructure.
CloudEvents is a CNCF Graduated project that, at its core, defines a common set of metadata that most events already have, as well as how that metadata should be represented in the serialization of those events. CloudEvents aims to augment existing eventing flows by leveraging preexisting transport-level extensibility points. This is a really powerful concept, as it leads to strong interoperability between different eventing systems, whether they are providing, consuming, routing, transforming or otherwise interacting with the events, and in many cases even if they are not aware of CloudEvents.
For a CloudEvents specification to be approved, it must pass a majority vote from the eligible voters in the project. Voters become eligible by attending 3 of the previous 4 weekly meetings, and each company only gets a single vote. The CESQL v1 vote passed with 100% approval, and one eligible voter abstaining. Additionally, there were 4 non-eligible voters who voted yes and one who abstained.
A primary goal of the core CloudEvents specification is to free event middleware from worrying about the various formats used by the events it processes. Building on that concept, CESQL allows middleware to programmatically examine events based on their CloudEvents metadata without worrying about how to access that metadata, so it can focus instead on what it wants to do with the events. In this way, CESQL is a natural extension of the original CloudEvents specification.
This is best explained by Doug Davis (Chair CNCF Serverless WG/CloudEvents; Microsoft):
The CloudEvents specification was the first step in our journey to address some of the pain-points the community was experiencing with managing the vast amount of events in today's distributed systems. CloudEvents focuses on the high-level concept of making key eventing metadata easily accessible without the need to understand, or even look at, the actual event payload itself. Thus enabling middleware to operate in a much more loosely-coupled manner. With that core concept in place, one of the next logical areas to focus on was the mechanism by which developers could express their event routing decisions based on that metadata. CESQL exactly fits that need. With CESQL, developers can express, in an interoperable SQL-like syntax, the criteria for when events meet certain requirements such that further actions should be taken. Freeing middleware managers (devs) from needing to understand the format, and even content, of each type of event flowing through the system. I'm very excited to see how the broader eventing and cloud native communities leverage this new specification, as the Knative community has already done.
To answer why an SQL based language was chosen instead of some other type of language, Francesco Guardiani (slinkydeveloper on GitHub; Restate) said that:
we were looking for a language that:
- is vendor neutral
- is easy enough to implement with a parser generator and a simple recursive expression evaluator
- not turing complete (for security reasons)
- could work out of the box with databases (that's mainly the reason for it to be a subset of sql expressions)
He went on to say:
The language went through a couple of iterations/ideas, initially it looked closer to a subset of C-like expressions (more javascript-ish), but this couldn't clearly work for database folks, and that's why we switched it to SQL-like.
The result of this initial work from Francesco Guardiani was CESQL: an SQL-like language that provides a simple and consistent way for middleware and event processors to interact with CloudEvents.
With this basic understanding of why CESQL exists and what it is trying to do, let’s take a look at some sample expressions!
The simplest expression in CESQL is to address an attribute from a CloudEvent. For example, if you wanted to retrieve the id of an event all you would need to write is:
id
However, given that many of the CloudEvent attributes are optional (in particular, all of the “extension” attributes) you may want to use CESQL to check if an attribute exists on the event before you process it. For example, if you want to check if the subject attribute exists, this can be expressed really simply as:
EXISTS subject
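To make the meaning of these two expressions concrete, here is a minimal Python sketch. It is not how the CloudEvents SDKs are implemented; it assumes, purely for illustration, that an event's context attributes are modeled as a flat dictionary, and the `exists` helper and the sample attribute values are hypothetical.

```python
# Assumption for illustration: a CloudEvent's context attributes as a flat dict.
event = {
    "specversion": "1.0",
    "id": "1234-5678",
    "source": "/example/source",
    "type": "github.pull_request.success",
}


def exists(event: dict, name: str) -> bool:
    """Rough analogue of the CESQL expression `EXISTS <name>`."""
    return name in event


# Analogue of the expression `id`: read the attribute's value.
event_id = event["id"]

# Analogue of `EXISTS subject`: check before touching an optional attribute.
subject = event["subject"] if exists(event, "subject") else None
```

In this sample event the required `id` attribute is present while the optional `subject` is not, so the guard prevents a lookup on a missing attribute.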
You can also compare attribute values and combine comparisons with the basic Boolean operators (AND, OR, XOR). Using these, you can construct more complex expressions to filter events. For example, to match events where firstname or subject is “Jane”, you could write:
firstname = 'Jane' OR subject = 'Jane'
Similar to normal SQL, you can also do wildcard string comparisons. For example, if you wanted to check if the type of the event begins with github and ends with success you could express this as:
type LIKE 'github%success'
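The wildcard matching can be sketched in a few lines of Python by translating the pattern into a regular expression. This is an approximation rather than the SDKs' implementation: it assumes the usual SQL conventions that `%` matches any run of characters (including none) and `_` matches exactly one, and it deliberately ignores escape sequences, so consult the specification for the exact rules. The function name `like` is hypothetical.

```python
import re


def like(value: str, pattern: str) -> bool:
    """Approximate SQL-style LIKE matching (no escape handling)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # any run of characters, possibly empty
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else matches literally
    # fullmatch anchors the pattern at both ends of the value.
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None


matches = like("github.pull_request.success", "github%success")
rejects = like("gitlab.pipeline.success", "github%success")
```

Because the whole value must match, a type such as `github.pull_request.success.retry` would not pass the `github%success` pattern in this sketch.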
CESQL also supports function calls, and allows users to provide their own functions to the engine. As an example, if you wanted to check that firstname and subject had the same value without worrying about the case sensitivity, you could use the LOWER function to compare the lowercase values of each attribute:
LOWER(firstname) = LOWER(subject)
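One way to picture user-provided functions is as entries in a lookup table that the engine consults when it meets a function call. The Python sketch below is only a model of that idea under assumed names (`functions`, `register`, `call`, and the made-up `REVERSE` function); the actual SDKs expose their own registration APIs, which are not shown here.

```python
from typing import Callable, Dict

# Hypothetical registry mapping function names to Python callables.
functions: Dict[str, Callable] = {
    "LOWER": str.lower,
    "UPPER": str.upper,
}


def register(name: str, fn: Callable) -> None:
    """Add a user-defined function to the registry."""
    functions[name] = fn


def call(name: str, *args):
    """Look up a function by name and apply it to already-evaluated arguments."""
    return functions[name](*args)


event = {"firstname": "Jane", "subject": "JANE"}

# Analogue of `LOWER(firstname) = LOWER(subject)`.
same_name = call("LOWER", event["firstname"]) == call("LOWER", event["subject"])

# A made-up user function, not part of the specification.
register("REVERSE", lambda s: s[::-1])
reversed_name = call("REVERSE", event["firstname"])
```

Keeping functions in a table like this means new ones can be added without changing how expressions are parsed.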
There are many more operations and functions supported by CESQL than we can mention here, so we recommend that you read the specification and try out the Golang or Java SDK implementations of CESQL!
As the above examples illustrate, one of the primary use cases for CESQL is event filtering. It is currently used for this within Knative Eventing, a CNCF Incubating project that provides Event Driven Architecture (EDA) primitives through Kubernetes-native interfaces. As Pierangelo Di Pilato (Principal Software Engineer; Red Hat), the WG lead for Knative Eventing, says:
As Knative Eventing's user base grew, the need for more expressive event filtering became apparent. CESQL addresses this challenge by introducing efficient SQL-like expressions, leveraging developers’ existing knowledge of SQL for powerful event routing and filtering.
To use a CESQL expression to filter an event in Knative, all you need to do is create a Trigger resource that defines a CESQL filter:
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: my-trigger
spec:
  broker: default-broker
  filters:
    - cesql: "type LIKE 'github%success'"
  subscriber:
    ref:
      apiVersion: apps/v1
      kind: Deployment
      name: my-deployment
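Conceptually, a filter like the one on this Trigger acts as a predicate that decides which events are forwarded to the subscriber. The Python sketch below models only that idea and is not Knative's implementation: the `dispatch` function, the hand-written predicate standing in for the CESQL expression, and the sample events are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, str]


def matches_github_success(event: Event) -> bool:
    """Hand-written stand-in for `type LIKE 'github%success'`."""
    event_type = event.get("type", "")
    return event_type.startswith("github") and event_type.endswith("success")


def dispatch(
    events: Iterable[Event],
    predicate: Callable[[Event], bool],
    subscriber: Callable[[Event], None],
) -> int:
    """Forward only the events accepted by the predicate; return how many."""
    delivered = 0
    for event in events:
        if predicate(event):
            subscriber(event)
            delivered += 1
    return delivered


received: List[Event] = []
incoming = [
    {"id": "1", "type": "github.pull_request.success"},
    {"id": "2", "type": "github.pull_request.failure"},
    {"id": "3", "type": "gitlab.pipeline.success"},
]
count = dispatch(incoming, matches_github_success, received.append)
```

Only the first sample event reaches the subscriber here; the other two are dropped because their type fails either the prefix or the suffix check.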
To try out the CESQL filters in Knative, you can use the upcoming Knative 1.15 release which will include the CESQL v1 implementation!
We hope that you give CESQL (and hence CloudEvents) a try! If there are any built-in functions you feel we should add, please let us know through GitHub issues. Additionally, please open a GitHub issue or PR if you are interested in implementing the engine for CESQL (based on the ANTLR4 files) in any of the other SDKs that the CloudEvents community maintains.