🤖Building a Telegram bot with Apache Kafka and ksqlDB

A presentation at NDC Sydney in October 2020 by Robin Moffatt

Slide 1

🤖Building a Telegram bot with Apache Kafka and ksqlDB @rmoff #NDCSydney Robin Moffatt

Slide 2

Slide 3

Where’s my nearest carpark with available spaces?

Slide 4

How many spaces are available in this car park?

Slide 5

💡Tell me when a car park with spaces is available

Slide 6

📈How does occupancy vary over time?

Slide 7

$ whoami
> Robin Moffatt (@rmoff)
> Senior Developer Advocate at Confluent (Apache Kafka, not Wikis 😉)
> Working in data & analytics since 2001
> Oracle ACE Director (Alumnus)
http://rmoff.dev/talks · http://rmoff.dev/blog · http://rmoff.dev/youtube

Slide 8

Slide 9

Telegram

Slide 10

Don’t just tell me… show me! Demo code: https://rmoff.dev/carparks

Slide 11

carparks HTTP Kafka

Slide 12

What are the key pieces of the design?

Slide 13

Event Driven Alerts
carparks HTTP Kafka
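
A minimal sketch of the event-driven alert idea, in plain Python rather than Kafka or the Telegram API: each event is checked as it arrives, and a message is pushed only when a car park has spaces. The function name `alert_on_spaces`, the `min_spaces` threshold, and the message wording are assumptions made for illustration; the sample values come from the rows shown later in the deck.

```python
from typing import Callable, Iterable, List, Tuple


def alert_on_spaces(
    events: Iterable[Tuple[str, int]],
    notify: Callable[[str], None],
    min_spaces: int = 1,  # hypothetical threshold, not from the talk
) -> None:
    """Push a notification for every event that reports free spaces."""
    for name, empty in events:
        if empty >= min_spaces:
            notify(f"{name} has {empty} spaces free")


# Collect the alerts in a list; a real bot would send a chat message instead.
sent: List[str] = []
alert_on_spaces([("Broadway", 0), ("Sharpe Street", 63)], sent.append)
print(sent)
```

The point of the design is that nothing polls: the alert is a side effect of the event arriving.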

Slide 14

K/V Lookups (materialised views)
“How many spaces are free at Westgate carpark right now?”
ksqlDB: SELECT SPACES_AVAILABLE FROM CARPARK WHERE NAME='WESTGATE';
Kafka: CARPARK_EVENTS
CREATE TABLE CARPARK AS SELECT LATEST(… GROUP BY NAME
42
“There are 42 spaces free”
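
As a rough illustration of what a materialised view does, here is a plain-Python sketch that folds a stream of events into a table holding only the latest value per key, and then answers a key lookup from it. The helper `materialise_latest` and the sample events are hypothetical; ksqlDB maintains this state continuously and durably, which a dictionary does not.

```python
from typing import Dict, Iterable, Tuple


def materialise_latest(events: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Fold (car park name, spaces available) events into a latest-value table."""
    table: Dict[str, int] = {}
    for name, spaces_available in events:
        table[name] = spaces_available  # a later event overwrites an earlier one
    return table


# Illustrative events only, in arrival order.
carpark_events = [("WESTGATE", 57), ("BROADWAY", 921), ("WESTGATE", 42)]
carpark = materialise_latest(carpark_events)

# Equivalent in spirit to: SELECT SPACES_AVAILABLE FROM CARPARK WHERE NAME='WESTGATE';
print(carpark["WESTGATE"])  # prints 42
```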

Slide 15

A schema… carparks HTTP

Slide 16

A schema…
2020-10-14,12:28,Broadway,1132,921
2020-10-14,12:28,Kirkgate Centre,611,474
2020-10-14,12:28,Sharpe Street,98,63
?!

Slide 17

My kingdom for a schema!
2020-10-14,12:28,Broadway,1132,921
2020-10-14,12:28,Kirkgate Centre,611,474
2020-10-14,12:28,Sharpe Street,98,63
😍
{ "ts": "2020-10-14T12:28 UTC+1", "name": "Broadway", "capacity": 1132, "empty": 921 } …
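
To make the difference concrete, here is a small Python sketch that applies a schema to one of the delimited rows above and emits a typed record. The field names `name`, `capacity` and `empty` follow the JSON on the slide; the column order, the separate `date` and `time` fields, and the helper `apply_schema` are assumptions read off the sample rows.

```python
import json
from typing import Dict, Union

# (field name, type) pairs, in the assumed column order of the raw line.
SCHEMA = (
    ("date", str),
    ("time", str),
    ("name", str),
    ("capacity", int),
    ("empty", int),
)


def apply_schema(line: str) -> Dict[str, Union[str, int]]:
    """Turn one comma-delimited line into a dict with named, typed fields."""
    values = line.strip().split(",")
    if len(values) != len(SCHEMA):
        raise ValueError(f"expected {len(SCHEMA)} fields, got {len(values)}")
    return {field: cast(value) for (field, cast), value in zip(SCHEMA, values)}


record = apply_schema("2020-10-14,12:28,Broadway,1132,921")
print(json.dumps(record))
```

Without the schema, a consumer has to guess that `1132` is a capacity and not, say, an identifier; with it, the meaning and type travel with the data.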

Slide 18

Applying a schema to streams of data
Kafka: source_topic
ksqlDB:
CREATE STREAM mySource (date VARCHAR, time VARCHAR, name VARCHAR, capacity INT)
  WITH (KAFKA_TOPIC='source_topic', VALUE_FORMAT='DELIMITED');

Slide 19

Applying a schema to streams of data
Kafka: source_topic, derived_topic
ksqlDB:
CREATE STREAM mySource (date VARCHAR, time VARCHAR, name VARCHAR, capacity INT)
  WITH (KAFKA_TOPIC='source_topic', VALUE_FORMAT='DELIMITED');
CREATE STREAM myTargetStream
  WITH (VALUE_FORMAT='PROTOBUF', KAFKA_TOPIC='derived_topic')
  AS SELECT * FROM mySource;

Slide 20

Integration
carparks HTTP Kafka

Slide 21

Streaming Integration with Kafka Connect
Sources: syslog
Kafka Connect: Tasks, Workers
Kafka Brokers

Slide 22

Streaming Integration with Kafka Connect
Sinks: Amazon S3, Google BigQuery
Kafka Connect: Tasks, Workers
Kafka Brokers

Slide 23

Streaming Integration with Kafka Connect
Amazon S3, syslog, Google BigQuery
Kafka Connect: Tasks, Workers
Kafka Brokers

Slide 24

Streaming Analytics

Slide 25

Why build it this way?

Slide 26

Events

Slide 27

Streams of Events

Slide 28

We want to react to them as they happen

Slide 29

We want to build state from a stream of events

Slide 30

We want to provide the latest data in our analytics

Slide 31

Apache Kafka - an Event Streaming Platform
Producer, Connectors, Consumer, The Log, Connectors, Streaming Engine

Slide 32

Why Kafka?

Slide 33

Distributed, Immutable, Event Log
Events are added at the end of the log
Diagram labels: New, Old

Slide 34

Consumers can seek to any point
Read to offset & scan
Diagram labels: New, Old

Slide 35

Data is not deleted once read
Diagram labels: New, Old; Sally is here (Scan)

Slide 36

Consumers are independent of each other
Diagram labels: New, Old; Fred is here (Scan); Sally is here (Scan)

Slide 37

Consumers can be added later
Diagram labels: New, Old; Rick is here (Scan); Fred is here (Scan); Sally is here (Scan)
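
The log properties from the last few slides (append at the end, seek to any offset, nothing deleted on read, independent consumers, consumers added later) can be sketched with a toy in-memory model. This is not the Kafka client API: the `EventLog` class and its `append`, `seek` and `poll` methods are hypothetical names chosen for illustration, and real Kafka adds partitions, replication and retention on top.

```python
from typing import Dict, List


class EventLog:
    """Toy append-only log with one read position per consumer."""

    def __init__(self) -> None:
        self._events: List[str] = []
        self._offsets: Dict[str, int] = {}

    def append(self, event: str) -> int:
        """Add an event at the end of the log and return its offset."""
        self._events.append(event)
        return len(self._events) - 1

    def seek(self, consumer: str, offset: int) -> None:
        """Move one consumer to an arbitrary offset."""
        self._offsets[consumer] = offset

    def poll(self, consumer: str) -> List[str]:
        """Return everything after the consumer's position; the log is untouched."""
        start = self._offsets.get(consumer, 0)
        batch = self._events[start:]
        self._offsets[consumer] = len(self._events)
        return batch


log = EventLog()
for event in ["e0", "e1", "e2"]:
    log.append(event)

sally_first = log.poll("sally")  # Sally reads from the start
log.append("e3")
fred_all = log.poll("fred")      # Fred arrives later and still sees every event
sally_next = log.poll("sally")   # Sally only sees what is new to her
log.seek("rick", 2)              # Rick is added later and starts at offset 2
rick_batch = log.poll("rick")
```

Because reading only moves a per-consumer offset, one consumer's progress never affects what another can read.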

Slide 38

Stream Processing with ksqlDB
Source stream

Slide 39

Stream Processing with ksqlDB
Source stream

Slide 40

Stream Processing with ksqlDB
Source stream

Slide 41

Stream Processing with ksqlDB
Source stream
Analytics

Slide 42

Stream Processing with ksqlDB
Source stream
Applications / Microservices

Slide 43

Stream Processing with ksqlDB
…SUM(TXN_AMT) GROUP BY AC_ID
AC_ID=42
AC_ID=42 BALANCE=94.00
Source stream
Applications / Microservices
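
A stateful aggregation like the `SUM(TXN_AMT) GROUP BY AC_ID` fragment above keeps a running total per key and emits an updated value each time an event arrives. The sketch below shows that behaviour in plain Python; the function `running_balance` and the sample transactions are assumptions chosen so that account 42 ends at the 94.00 shown on the slide.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def running_balance(
    txns: Iterable[Tuple[int, float]],
) -> Iterator[Tuple[int, float]]:
    """Yield (account id, balance so far) after every transaction."""
    balances: Dict[int, float] = defaultdict(float)
    for ac_id, txn_amt in txns:
        balances[ac_id] += txn_amt
        yield ac_id, balances[ac_id]


# Illustrative transactions as (AC_ID, TXN_AMT) pairs.
updates = list(running_balance([(42, 100.0), (7, 20.0), (42, -6.0)]))
print(updates[-1])  # prints (42, 94.0)
```

An application that needs the balance for account 42 reads the latest emitted value rather than re-scanning every transaction.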

Slide 44

Under the covers of ksqlDB

Slide 45

Kafka cluster, consume, produce, ksqlDB

Slide 46

JVM, Kafka cluster, consume, produce, ksqlDB, Kafka Streams, RocksDB

Slide 47

Slide 48

Fully Managed Kafka & ksqlDB as a Service

Slide 49

Running ksqlDB - self-managed
ksqlDB Server (JVM process)
DEB, RPM, ZIP, TAR downloads: http://confluent.io/download
Docker images: confluentinc/ksqldb-server
…and many more…

Slide 50

Why Kafka?

Slide 51

Stream Store Process Integrate

Slide 52

Stream Store Process Integrate

Slide 53

Stream Store Process Integrate

Slide 54

Stream Store Process Integrate

Slide 55

Stream Store Process Integrate

Slide 56

Flexible, event-driven applications
Event-driven alerts
Key/Value lookups
Streaming ETL

Slide 57

Want to learn more?
CTAs, not CATs (sorry, not sorry)

Slide 58

Try it out for yourself https://rmoff.dev/carparks

Slide 59

60 DE VA DV
Fully Managed Kafka as a Service
$200 USD off your bill each calendar month for the first three months when you sign up
https://rmoff.dev/ccloud
Free money! (additional $60 towards your bill 😄)
* T&C: https://www.confluent.io/confluent-cloud-promo-disclaimer

Slide 60

Learn Kafka. Start building with Apache Kafka at Confluent Developer. developer.confluent.io

Slide 61

Confluent Community Slack group
cnfl.io/slack

Slide 62

Further reading / watching
https://rmoff.dev/kafka-talks

Slide 63

#EOF https://talks.rmoff.net @rmoff