Taming IoT Data: Making Sense of Sensors with SQL Streaming

$ whoami
• Hans-Peter Grahsl
• working & living in Graz
• technical trainer at
• independent consultant & engineer
• associate lecturer
• irregular conference speaker

@hpgrahsl | #VDZ19 @VoxxedZurich, 19th March 2019, Switzerland

WHAT IS STREAMING?

“a type of data processing that is designed with infinite data sets in mind” — Tyler Akidau

Streaming == BIG DEAL
1. unbounded data sets are prevalent ➡ never-ending data streams need purpose-built systems
2. people crave timely information ➡ stream processing technology helps achieve lower latency
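
To make the "unbounded data set" idea concrete, here is a minimal Python sketch (not KSQL, and not from the talk). The `sensor_readings` generator is a hypothetical stand-in for a never-ending sensor feed, and `running_average` emits an updated result after every record instead of waiting for the input to end, which it never does.

```python
import itertools
import random
from typing import Dict, Iterator, Tuple

Reading = Tuple[str, float]  # (sensor_id, temperature), an assumed record shape


def sensor_readings(seed: int = 42) -> Iterator[Reading]:
    """Hypothetical never-ending source: yields readings forever."""
    rng = random.Random(seed)
    while True:
        yield rng.choice(["s1", "s2", "s3"]), round(rng.uniform(15.0, 30.0), 1)


def running_average(stream: Iterator[Reading]) -> Iterator[Reading]:
    """Emit an updated per-sensor average after every single record."""
    count: Dict[str, int] = {}
    total: Dict[str, float] = {}
    for sensor_id, value in stream:
        count[sensor_id] = count.get(sensor_id, 0) + 1
        total[sensor_id] = total.get(sensor_id, 0.0) + value
        yield sensor_id, total[sensor_id] / count[sensor_id]


# The source never ends, so we can only ever inspect a finite prefix of the results.
first_five = list(itertools.islice(running_average(sensor_readings()), 5))
```

A batch job would need the complete input before producing anything; here each incoming record immediately yields a fresh answer, which is where the latency benefit comes from.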

BIGGEST Challenge?

These and many many more…

Today the choice is mine

Apache Kafka

STREAMING PLATFORM

Kafka’s streaming SQL engine

declarative stream processing language

skyrocketing developer productivity

unlocks streaming for the masses

KSQL’s Nature
• built on top of Kafka Streams
• SQL only (not embedded)
• NO(!) coding skills required
• extremely low entry barrier
• familiar syntax and semantics
• concise and expressive
• joins, aggregations, windowing
• UD(A)Fs and UDTFs coming soon…
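
As an illustration of what "aggregations" plus "windowing" mean semantically, the following Python sketch groups events into fixed, non-overlapping (tumbling) windows and keeps the maximum per sensor. It is a plain simulation rather than KSQL itself; the event shape `(timestamp_ms, sensor_id, value)` and the function name are assumptions made for this example. In KSQL the same intent would be stated declaratively with a `WINDOW TUMBLING` clause and a `GROUP BY`.

```python
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str, float]  # (timestamp_ms, sensor_id, value), assumed shape


def tumbling_window_max(events: Iterable[Event], size_ms: int) -> Dict[Tuple[int, str], float]:
    """Assign each event to a fixed-size window and keep the max value per sensor."""
    result: Dict[Tuple[int, str], float] = {}
    for ts, sensor_id, value in events:
        window_start = ts - (ts % size_ms)  # align the timestamp to its window boundary
        key = (window_start, sensor_id)
        result[key] = max(result.get(key, value), value)
    return result


events = [
    (1_000, "s1", 20.5),
    (30_000, "s1", 22.0),
    (59_999, "s2", 18.0),
    (60_000, "s1", 19.0),  # first event of the next one-minute window
    (61_000, "s2", 25.5),
]
windows = tumbling_window_max(events, size_ms=60_000)
```

Each result is keyed by window start and sensor, so the two one-minute windows in the sample data produce four independent aggregates.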

KSQL Queries
• per-record streaming with millisecond latency
• compiled into Kafka Streams applications
• follow same execution model
• distributed over multiple KSQL servers
• two operation modes / deployment options:
  • interactive vs. headless

KSQL interactive mode
• KSQL servers accessed via REST API
• offers ad-hoc stream analytics
• share streams & tables across users
• used for exploration and during development
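
To give a feel for the REST access mentioned above, this Python sketch prepares (but does not send) a request to a KSQL server's `/ksql` endpoint. The server address `http://localhost:8088` is an assumed local default, and the helper name `build_ksql_request` is made up for this example; check the KSQL REST API documentation of your version for the exact payload and headers.

```python
import json
import urllib.request


def build_ksql_request(server_url: str, statement: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST carrying one KSQL statement."""
    payload = {"ksql": statement, "streamsProperties": {}}
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/ksql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


# Assumed local server address; adjust to your own deployment.
request = build_ksql_request("http://localhost:8088", "LIST STREAMS;")
# Sending it would be: urllib.request.urlopen(request)
```

The KSQL CLI talks to the server in the same way, which is why several users can share the streams and tables registered on one server.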

KSQL headless mode
• streaming queries given by a SQL file
• KSQL servers process SQL file
• use case specific isolation
• “locked-down” ➡ NO REST API access
• used for production deployments

step 1 ingest sensor data

step 2 KSQL streaming

“You think that’s a database table you are querying now?” — Morpheus

“Instead, only try to realize the truth… there is no database table.” — Spoon Boy
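
One way to read the two quotes above is the stream-table duality: what looks like a table is just the latest value per key of an underlying changelog stream. The Python sketch below illustrates that idea under simple assumptions (string keys, float values, `None` as a delete marker); it is an illustration of the concept, not how KSQL is implemented.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Change = Tuple[str, Optional[float]]  # (key, new value); None marks a delete


def materialize(changelog: Iterable[Change]) -> Dict[str, float]:
    """Replay a changelog stream into the latest value per key (a 'table')."""
    table: Dict[str, float] = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)  # delete marker: remove the key if present
        else:
            table[key] = value    # upsert: later records overwrite earlier ones
    return table


changelog: List[Change] = [
    ("s1", 20.5),
    ("s2", 18.0),
    ("s1", 22.0),  # update for s1
    ("s2", None),  # delete s2
    ("s3", 25.5),
]
table = materialize(changelog)
```

Replaying the same changelog always yields the same table, so the table never has to be the source of truth; the stream is.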

step 3 connecting NoSQL

step 4 reactive notifications

step 5 live dashboards

MISSION accomplished

KSQL wrap-up
• streaming with SQL … and nothing but SQL
• scalable & fault-tolerant
• deployable anywhere: cloud or on prem
• viable for use cases of any size (XS … XXXL)
• exactly-once delivery guarantee semantics

“If I have been faster it’s by streaming on the shoulders of Apache Kafka.” — my other self

Your obsession tells you to do batching. I tell you to walk away and stream with KSQL. The choice is yours, folks!

THANK YOU Q&A? https://bit.ly/2FaLr7w