From ksqlDB with ♥ LOVE ♥ Detecting 007 with a dash of machine learning

Hans-Peter Grahsl
‣ technical trainer
‣ independent engineer & consultant
‣ associate lecturer
‣ occasional conference speaker

@hpgrahsl | #ConfluentVUG #ksqlDB #ApacheKafka | 2020-07-16

Kafka’s streaming SQL engine

declarative stream processing language

KSQL in a Nutshell
‣ ANSI SQL inspired
‣ familiar syntax & semantics
‣ concise & expressive
‣ built on top of Kafka Streams
‣ NO(!) coding skills required
‣ entry barrier? “none”
‣ usual suspects OOTB:
  ‣ projections, filters
  ‣ joins, aggregations
  ‣ windowing
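
The "usual suspects" above (a filter, a projection and a windowed aggregation) can be illustrated in a few lines of plain Python. This is only a sketch of the semantics, not ksqlDB syntax and not how the engine works internally; the event fields (`ts`, `user`, `amount`) and the 60-second tumbling window are made up for illustration.

```python
from collections import defaultdict

# Hypothetical events; field names and values are assumptions for illustration.
events = [
    {"ts": 5, "user": "alice", "amount": 10},
    {"ts": 42, "user": "bob", "amount": 3},
    {"ts": 61, "user": "alice", "amount": 7},
    {"ts": 75, "user": "alice", "amount": 0},
    {"ts": 119, "user": "bob", "amount": 4},
]

WINDOW_SIZE = 60  # tumbling window length in seconds (assumed)


def tumbling_sum(events, window_size):
    """Filter, project and sum amounts per (window start, user)."""
    totals = defaultdict(int)
    for event in events:
        # Filter: ignore events without a positive amount.
        if event["amount"] <= 0:
            continue
        # Projection: keep only the fields the aggregation needs.
        user, amount = event["user"], event["amount"]
        # Windowing: each event falls into exactly one fixed-size window.
        window_start = (event["ts"] // window_size) * window_size
        # Aggregation: running sum per window and key.
        totals[(window_start, user)] += amount
    return dict(totals)


result = tumbling_sum(events, WINDOW_SIZE)
```

In ksqlDB the same intent would be expressed declaratively in a single SQL statement; the loop above only spells out what that statement has to do per event.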

Query Examples

Criteria 1: Learning Curve
‣ SQL == widespread & successful 4GL
‣ algorithms & data structures handled for us
‣ KSQL
  ‣ very easy to learn
  ‣ quick implementation cycles
  ‣ productivity-wise hard to beat

“Lingua Franca” effect

Streaming Architecture

NOT the ETL of our ancestors

Complex Streaming Architecture

KSQL ➔ originally only the [T] (transform) of ETL

What if I told you it’s as easy to build a STREAMING app as it is to build a CRUD app?

Unified Streaming Architecture

Connecting with Sources
HINT: connector examples shown in demo later

Connecting with Sinks
HINT: connector examples shown in demo later

Instead, only try to realize the truth… ksqlDB is NO DATABASE as we know it.

Data Concepts in ksqlDB: STREAM
‣ immutable append-only sequence
‣ captures events representing a series of facts

Data Concepts in ksqlDB: TABLE
‣ mutable collection of events
‣ holds the last known value for each key
‣ also result from stateful operations
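
The relationship between the two concepts can be sketched in plain Python: replaying an append-only stream and keeping the last value per key yields a table. This is only a model of the idea, not ksqlDB code; the keys and city values are invented for illustration.

```python
# Hypothetical stream of (key, value) facts; the data is made up.
# The stream itself is never modified, new facts are only appended.
stream = [
    ("alice", "Vienna"),
    ("bob", "London"),
    ("alice", "Graz"),
]


def materialize(stream):
    """Build a table holding the last known value for each key."""
    table = {}
    for key, value in stream:
        table[key] = value  # a later fact overwrites the earlier one
    return table


table = materialize(stream)
```

The stream still contains all three facts, while the table only answers "what is the latest value for this key?".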

ksqlDB: PUSH Queries
‣ act as subscription to query results
‣ fit asynchronous & reactive data flows
‣ run indefinitely
‣ new data causes continuous updates

ksqlDB: PUSH Query

ksqlDB: PULL Queries
‣ fetch point-in-time results
‣ fit request / response data flows
‣ terminate immediately
‣ lookup current state of materialized views

ksqlDB: PULL Query
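
The difference between the two query styles can be modelled with a toy materialized view in plain Python. This is a sketch under stated assumptions, not the ksqlDB client API: the class `MaterializedView` and its methods `subscribe`, `pull` and `apply` are hypothetical names chosen for this example.

```python
class MaterializedView:
    """Toy view: subscribers get every update (push), lookups read state (pull)."""

    def __init__(self):
        self._state = {}
        self._subscribers = []

    def subscribe(self, callback):
        # Push style: register once, then receive every future change.
        self._subscribers.append(callback)

    def pull(self, key):
        # Pull style: return the current value for a key and finish.
        return self._state.get(key)

    def apply(self, key, value):
        # New data updates the state and notifies all subscribers.
        self._state[key] = value
        for callback in self._subscribers:
            callback(key, value)


view = MaterializedView()
pushed = []
view.subscribe(lambda key, value: pushed.append((key, value)))

view.apply("alice", 1)
view.apply("alice", 2)

current = view.pull("alice")
```

The subscriber has seen both updates in order, whereas the pull lookup only returns the latest state at the moment it is asked.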

ksqlDB Functions
‣ choose from three categories
  ‣ scalar ➔ UDF
  ‣ aggregation ➔ UDAF
  ‣ table ➔ UDTF
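
The three categories differ in how many rows go in and how many come out. The plain-Python sketch below only illustrates those shapes; real ksqlDB functions are written in Java, and the function names `mask`, `total` and `explode` here are hypothetical stand-ins rather than references to ksqlDB built-ins.

```python
def mask(value):
    """Scalar shape: one value in, one value out."""
    if not value:
        return value
    return value[0] + "*" * (len(value) - 1)


def total(values):
    """Aggregation shape: many values in, one value out."""
    return sum(values)


def explode(items):
    """Table shape: one value in, zero or more rows out."""
    for item in items:
        yield item


masked = mask("bond")
summed = total([1, 2, 3])
rows = list(explode(["a", "b"]))
```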

Criteria 2: Extensibility Options
‣ custom functions (UDFs, UDAFs & UDTFs)
‣ enable flexible & powerful capabilities
‣ but Java code needed
HINT: custom UDF example shown in demo later

Fictional Use Case

ksqlDB Use Case

Data Architecture

Criteria 3: ML Integration Paths
‣ call fully-managed ML services (external)
‣ run your own model server (co-located)
‣ package home-brewed model into UDF (embedded)
‣ completely separated: integrate ML results via Connectors

Everywhere… ksqlDB runs everywhere

Criteria 4: Deployment Options
‣ run however you want
  ‣ bare metal, VMs, containers
‣ deploy wherever you need
  ‣ on-premises, private / public cloud, hybrid
‣ something fully-managed?

Unfortunately, no one can be told what ksqlDB is… You have to TRY IT for yourselves!
https://ksqldb.io

reach out to me @hpgrahsl