Kafka as a Platform: the Ecosystem from the Ground Up

A presentation by Robin Moffatt at Kafka Summit London 2022 (April 2022, London, UK)

Slide 1

Kafka as a Platform: the Ecosystem from the Ground Up Robin Moffatt @rmoff #kafkasummit

Slide 2

EVENTS @rmoff

Slide 3

EVENTS @rmoff

Slide 4

EVENTS • Something happened • What happened

Slide 5

Human generated events A Sale A Stock movement @rmoff

Slide 6

Machine generated events Networking IoT Applications @rmoff

Slide 7

EVENTS are EVERYWHERE @rmoff

Slide 8

EVENTS are very POWERFUL @rmoff

Slide 9

Slide 10

Slide 11

K V

Slide 12

LOG @rmoff

Slide 13

K V

Slide 14

K V

Slide 15

K V

Slide 16

K V

Slide 17

K V

Slide 18

K V

Slide 19

K V

Slide 20

Immutable Event Log Old New Events are added at the end of the log @rmoff

Slide 21

TOPICS @rmoff

Slide 22

Topics Clicks Orders Customers Topics are similar in concept to tables in a database @rmoff

Slide 23

PARTITIONS @rmoff

Slide 24

Partitions Clicks P0 P1 P2 Messages are guaranteed to be strictly ordered within a partition @rmoff

Slide 25

PUB / SUB @rmoff

Slide 26

PUB / SUB @rmoff

Slide 27

Producing data Old New Messages are added at the end of the log @rmoff

Slide 28

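package main

import (
    "gopkg.in/confluentinc/confluent-kafka-go.v1/kafka"
)

func main() {
    topic := "test_topic"

    // Create a producer that connects to a local broker.
    p, _ := kafka.NewProducer(&kafka.ConfigMap{
        "bootstrap.servers": "localhost:9092"})
    defer p.Close()

    // Produce is asynchronous: it queues the message for delivery.
    p.Produce(&kafka.Message{
        TopicPartition: kafka.TopicPartition{Topic: &topic, Partition: 0},
        Value:          []byte("Hello world")}, nil)

    // Wait up to 15 seconds for queued messages to be delivered before exiting.
    p.Flush(15 * 1000)
}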

Slide 29

Producing to Kafka - No Key Time Partition 1 Partition 2 Partition 3 Partition 4 Messages will be batched and randomly distributed across the partitions @rmoff

Slide 30

Producing to Kafka - With Key Time Partition 1 A Partition 2 B hash(key) % numPartitions = N Partition 3 C Partition 4 D @rmoff
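The rule on this slide, hash(key) % numPartitions = N, can be sketched in a few lines of Go. This is only an illustration: the function name partitionFor is made up for the sketch, and FNV-1a is used as a stand-in hash because it ships with the Go standard library. Kafka clients use their own hash functions, so the partition numbers printed here will not match what a real producer chooses.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a message key to a partition index using
// hash(key) % numPartitions. FNV-1a is an assumption of this sketch,
// not the hash that Kafka clients use.
func partitionFor(key string, numPartitions uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on an FNV hash never returns an error
	return h.Sum32() % numPartitions
}

func main() {
	const numPartitions = 4
	// The key "A" appears twice and lands on the same partition both times.
	for _, key := range []string{"A", "B", "C", "D", "A"} {
		fmt.Printf("key %q -> partition %d\n", key, partitionFor(key, numPartitions))
	}
}
```

The property that matters is that the same key always maps to the same partition, which is what keeps all messages for one key in order.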

Slide 31

Producers
• A client application
• Puts messages into topics
• Handles partitioning, network protocol
• Java, Go, .NET, C/C++, Python
• Also every other language
Plus REST proxy if not
@rmoff

Slide 32

PUB / SUB @rmoff

Slide 33

Consuming data - access is only sequential Read to offset & scan Old New @rmoff

Slide 34

Consumers have a position of their own Old Victoria is here New Scan @rmoff

Slide 35

Consumers have a position of their own Old Victoria is here New Scan Tim is here Scan @rmoff

Slide 36

Consumers have a position of their own Rick is here Scan Old Victoria is here New Scan Tim is here Scan @rmoff
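The idea behind these slides is an append-only log that several consumers read sequentially, each from its own offset. It can be modelled with a small in-memory sketch. The Log and Reader types below are hypothetical and exist only for illustration; they are not part of any Kafka client library, and a real broker persists the log to disk and tracks offsets per partition.

```go
package main

import "fmt"

// Log is an in-memory, append-only sequence of events (hypothetical type).
type Log struct {
	events []string
}

// Append adds an event at the end of the log and returns its offset.
func (l *Log) Append(event string) int {
	l.events = append(l.events, event)
	return len(l.events) - 1
}

// Reader tracks one consumer's position in a shared log (hypothetical type).
type Reader struct {
	log    *Log
	offset int
}

// Next returns the event at the reader's offset and advances the offset.
// ok is false once the reader has caught up with the end of the log.
func (r *Reader) Next() (event string, ok bool) {
	if r.offset >= len(r.log.events) {
		return "", false
	}
	event = r.log.events[r.offset]
	r.offset++
	return event, true
}

func main() {
	eventLog := &Log{}
	for _, e := range []string{"e0", "e1", "e2", "e3"} {
		eventLog.Append(e)
	}

	// Two consumers read the same log from different positions.
	victoria := &Reader{log: eventLog, offset: 2}
	tim := &Reader{log: eventLog, offset: 0}

	v, _ := victoria.Next()
	t, _ := tim.Next()
	fmt.Println("Victoria read", v, "and Tim read", t)
}
```

Reading never removes anything from the log, so adding a third reader such as Rick does not affect the other two.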

Slide 37

c, _ := kafka.NewConsumer(&cm)
defer c.Close()
c.Subscribe(topic, nil)
for {
    select {
    case ev := <-c.Events():
        switch ev.(type) {
        case *kafka.Message:
            km := ev.(*kafka.Message)
            fmt.Printf("✅ Message '%v' received from topic '%v'\n",
                string(km.Value), string(*km.TopicPartition.Topic))
        }
    }
}

Slide 38

Consuming From Kafka - Single Consumer Partition 1 App1 Partition 2 Partition 3 Partition 4 @rmoff

Slide 39

Consuming From Kafka - Multiple Consumers Partition 1 App1 Partition 2 Partition 3 App2 Partition 4 @rmoff

Slide 40

Consuming From Kafka - Grouped Consumers Partition 1 A App 1 App11 Partition 2 Partition 3 App2 Partition 4 @rmoff

Slide 41

Consuming From Kafka - Grouped Consumers Partition 1 Partition 2 Partition 3 C1 C2 App1 Partition 4 @rmoff

Slide 42

Consuming From Kafka - Grouped Consumers Partition 1 Partition 2 Partition 3 C1 C2 App1 Partition 4 @rmoff

Slide 43

Consuming From Kafka - Grouped Consumers Partition 1 Partition 2 Partition 3 C1 3 App1 Partition 4 @rmoff

Slide 44

Consuming From Kafka - Grouped Consumers Partition 1 Partition 2 Partition 3 C1 3 App1 Partition 4 @rmoff
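The grouped-consumer slides show the partitions of a topic being shared out among the members of one consumer group, and shared out again when membership changes. A minimal sketch of that idea follows, assuming a plain round-robin strategy. The assign function is hypothetical. Real Kafka clients negotiate assignments through a group coordinator and support several assignment strategies, none of which is reproduced here.

```go
package main

import "fmt"

// assign spreads partitions across the consumers of one group in
// round-robin order, so that each partition has exactly one owner.
// This is a simplified stand-in for Kafka's assignment strategies.
func assign(partitions []int, consumers []string) map[string][]int {
	owned := make(map[string][]int)
	if len(consumers) == 0 {
		return owned
	}
	for i, p := range partitions {
		c := consumers[i%len(consumers)]
		owned[c] = append(owned[c], p)
	}
	return owned
}

func main() {
	partitions := []int{1, 2, 3, 4}

	// Two consumers in the group share the four partitions.
	fmt.Println(assign(partitions, []string{"C1", "C2"}))

	// C2 leaves the group: its partitions are handed to C1.
	fmt.Println(assign(partitions, []string{"C1"}))
}
```

With more consumers than partitions, the extra consumers receive nothing in this sketch, which is why the partition count limits how far a single group can scale out.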

Slide 45

Consumers
• A client application
• Reads messages from topics
• Horizontally, elastically scalable (if stateless)
• Java, Go, .NET, C/C++, Python, everything else
Plus REST proxy if not
@rmoff

Slide 46

BROKERS and REPLICATION @rmoff

Slide 47

Leader Partition Leadership and Replication Follower Partition 1 Partition 2 Partition 3 Partition 4 Broker 1 Broker 2 Broker 3 @rmoff

Slide 48

Leader Partition Leadership and Replication Follower Partition 1 Partition 1 Partition 1 Partition 2 Partition 2 Partition 2 Partition 3 Partition 3 Partition 3 Partition 4 Partition 4 Partition 4 Broker 1 Broker 2 Broker 3 @rmoff

Slide 49

Leader Partition Leadership and Replication Follower Partition 1 Partition 1 Partition 1 Partition 2 Partition 2 Partition 2 Partition 3 Partition 3 Partition 3 Partition 4 Partition 4 Partition 4 Broker 1 Broker 2 Broker 3 @rmoff
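The leader and follower layout on these slides can be illustrated with a toy model: every partition has a replica on each of the three brokers, one replica acts as leader, and a follower takes over if the leader's broker goes away. The Partition type and the place and leader functions are hypothetical names for this sketch. Real Kafka elects leaders from the in-sync replicas via the cluster controller, which this model ignores.

```go
package main

import "fmt"

// Partition lists the brokers that hold a replica of it. In this
// sketch the first live broker in Replicas acts as the leader.
type Partition struct {
	ID       int
	Replicas []int // broker IDs
}

// leader returns the first replica whose broker is alive, or -1 if
// no replica is available.
func (p Partition) leader(alive map[int]bool) int {
	for _, b := range p.Replicas {
		if alive[b] {
			return b
		}
	}
	return -1
}

// place puts a replica of every partition on every broker, rotating
// the starting broker so that leadership is spread across the cluster.
func place(numPartitions int, brokers []int) []Partition {
	parts := make([]Partition, 0, numPartitions)
	for p := 0; p < numPartitions; p++ {
		replicas := make([]int, len(brokers))
		for i := range brokers {
			replicas[i] = brokers[(p+i)%len(brokers)]
		}
		parts = append(parts, Partition{ID: p + 1, Replicas: replicas})
	}
	return parts
}

func main() {
	parts := place(4, []int{1, 2, 3})
	alive := map[int]bool{1: true, 2: true, 3: true}
	for _, p := range parts {
		fmt.Printf("partition %d: replicas %v, leader broker %d\n", p.ID, p.Replicas, p.leader(alive))
	}

	// Broker 1 fails: partitions it led move to a surviving follower.
	alive[1] = false
	for _, p := range parts {
		fmt.Printf("partition %d: leader broker %d\n", p.ID, p.leader(alive))
	}
}
```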

Slide 50

So far, this is Pretty good @rmoff

Slide 51

So far, this is Pretty good but I’ve not finished yet… @rmoff

Slide 52

Streaming Pipelines Amazon S3 RDBMS HDFS @rmoff

Slide 53

Slide 54

Streaming Integration with Kafka Connect syslog Sources Kafka Connect Kafka Brokers @rmoff

Slide 55

Streaming Integration with Kafka Connect Amazon Sinks Google Kafka Connect Kafka Brokers @rmoff

Slide 56

Streaming Integration with Kafka Connect Amazon syslog Google Kafka Connect Kafka Brokers @rmoff

Slide 57

Look Ma, No Code!
{
  "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
  "connection.url": "jdbc:mysql://asgard:3306/demo",
  "table.whitelist": "sales,orders,customers"
}
@rmoff

Slide 58

Extensible Connector Transform(s) Converter @rmoff
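The three extension points named on this slide form a chain: a connector supplies records, zero or more transforms modify each record, and a converter serialises the result. The Go sketch below models that chain with plain functions. It is not the Kafka Connect API, which is a Java plugin framework. The Record, Transform, and Converter types and the runPipeline, upperCaseName, and jsonConverter functions are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Record is one row or event flowing through the pipeline.
type Record map[string]interface{}

// Transform changes a single record, in the spirit of a Single Message Transform.
type Transform func(Record) Record

// Converter serialises a record into the bytes written to a topic.
type Converter func(Record) ([]byte, error)

// runPipeline applies every transform to each source record in order,
// then hands the result to the converter.
func runPipeline(source []Record, transforms []Transform, convert Converter) ([][]byte, error) {
	var out [][]byte
	for _, rec := range source {
		for _, t := range transforms {
			rec = t(rec)
		}
		b, err := convert(rec)
		if err != nil {
			return nil, err
		}
		out = append(out, b)
	}
	return out, nil
}

// upperCaseName is an example transform that upper-cases the "name" field.
func upperCaseName(r Record) Record {
	if s, ok := r["name"].(string); ok {
		r["name"] = strings.ToUpper(s)
	}
	return r
}

// jsonConverter is an example converter that emits JSON.
func jsonConverter(r Record) ([]byte, error) {
	return json.Marshal(r)
}

func main() {
	rows := []Record{{"id": 1, "name": "widget"}}
	msgs, err := runPipeline(rows, []Transform{upperCaseName}, jsonConverter)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(msgs[0]))
}
```

Keeping the three stages separate is what lets a converter be swapped, say from JSON to Avro, without touching the connector or the transforms.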

Slide 59

hub.confluent.io @rmoff

Slide 60

K V

Slide 61

K V

Slide 62

K V

Slide 63

K V

Slide 64

K V Wait… what’s this?

Slide 65

Lack of schemas – Coupling teams and services 2001 2001 Citrus Heights-Sunrise Blvd Citrus_Hghts 60670001 3400293 34 SAC Sacramento SV Sacramento Valley SAC Sacramento County APCD SMA8 Sacramento Metropolitan Area CA 6920 Sacramento 28 6920 13588 7400 Sunrise Blvd 95610 38 41 56 38.6988889 121 16 15.98999977 -121.271111 10 4284781 650345 52 @rmoff

Slide 66

Serialisation & Schemas JSON Avro Protobuf Schema JSON CSV @rmoff

Slide 67

Serialisation & Schemas JSON Avro Protobuf Schema JSON CSV 👍 👍 👍 😬 https://rmoff.dev/qcon-schemas @rmoff

Slide 68

Schemas Schema Registry Topic producer … consumer @rmoff

Slide 69

A Consumer 1 Consumer @rmoff

Slide 70

A1 Consumer Consumer @rmoff

Slide 71

@rmoff

Slide 72

Slide 73

Slide 74

.stream("widgets", Consumed.with(stringSerde, widgetsSerde))
.filter( (key, widget) -> widget.getColour().equals("RED") )
.to("widgets_red", Produced.with(stringSerde, widgetsSerde));

Slide 75

A1 Consumer Consumer @rmoff

Slide 76

Streams A1 App Streams App @rmoff

Slide 77

Stream Processing with ksqlDB
Stream: widgets
ksqlDB
CREATE STREAM widgets_red AS
  SELECT * FROM widgets WHERE colour='RED';
Stream: widgets_red
@rmoff
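The filter expressed here in ksqlDB, and a few slides earlier in Kafka Streams, is a stateless stream operation: each widget is examined on its own and either forwarded or dropped. As a rough analogy only, the same shape can be sketched in Go with channels standing in for the widgets and widgets_red topics. The Widget type and the filterRed function are hypothetical, and nothing here talks to Kafka.

```go
package main

import "fmt"

// Widget is a hypothetical event type for this sketch.
type Widget struct {
	ID     int
	Colour string
}

// filterRed reads widgets from in and forwards only the red ones,
// closing the output channel once the input is exhausted.
func filterRed(in <-chan Widget) <-chan Widget {
	out := make(chan Widget)
	go func() {
		defer close(out)
		for w := range in {
			if w.Colour == "RED" {
				out <- w
			}
		}
	}()
	return out
}

func main() {
	// The widgets channel plays the role of the source stream.
	widgets := make(chan Widget)
	go func() {
		defer close(widgets)
		for i, c := range []string{"RED", "BLUE", "RED"} {
			widgets <- Widget{ID: i, Colour: c}
		}
	}()

	// The returned channel plays the role of widgets_red.
	for w := range filterRed(widgets) {
		fmt.Println("red widget:", w.ID)
	}
}
```

Because the filter keeps no state between events, several copies of it could run side by side, which matches the earlier point about stateless consumers scaling horizontally.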

Slide 78

@rmoff

Slide 79

… FROM WIDGETS WHERE WEIGHT_G > 120
SELECT COUNT(*) FROM WIDGETS GROUP BY PRODUCTION_LINE
SELECT AVG(TEMP_CELCIUS) AS TEMP FROM WIDGETS GROUP BY SENSOR_ID HAVING TEMP>20
… 'connector.class' = 'S3Connector', 'topics' = 'widgets' …);
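Unlike a filter, the COUNT and AVG queries on this slide are stateful: the result for each group depends on every event seen so far for that group. The sketch below shows the bookkeeping involved, using a slice of events in place of a stream. The Widget type, its Go field names, and the countByLine and avgTempAbove functions are hypothetical. ksqlDB maintains such aggregates continuously as events arrive, whereas this sketch computes them once over a fixed batch.

```go
package main

import "fmt"

// Widget mirrors the columns used in the slide's queries; the Go
// field names are this sketch's own.
type Widget struct {
	ProductionLine string
	SensorID       string
	TempCelsius    float64
}

// countByLine is a stand-in for COUNT(*) ... GROUP BY PRODUCTION_LINE.
func countByLine(ws []Widget) map[string]int {
	counts := make(map[string]int)
	for _, w := range ws {
		counts[w.ProductionLine]++
	}
	return counts
}

// avgTempAbove is a stand-in for AVG(...) ... GROUP BY SENSOR_ID
// HAVING TEMP > threshold: it averages per sensor, then keeps only
// the sensors whose average exceeds the threshold.
func avgTempAbove(ws []Widget, threshold float64) map[string]float64 {
	sum := make(map[string]float64)
	n := make(map[string]int)
	for _, w := range ws {
		sum[w.SensorID] += w.TempCelsius
		n[w.SensorID]++
	}
	out := make(map[string]float64)
	for id, s := range sum {
		if avg := s / float64(n[id]); avg > threshold {
			out[id] = avg
		}
	}
	return out
}

func main() {
	ws := []Widget{
		{ProductionLine: "L1", SensorID: "s1", TempCelsius: 25},
		{ProductionLine: "L1", SensorID: "s2", TempCelsius: 10},
		{ProductionLine: "L2", SensorID: "s1", TempCelsius: 19},
	}
	fmt.Println(countByLine(ws))
	fmt.Println(avgTempAbove(ws, 20))
}
```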

Slide 80

ksqlDB or Kafka Streams? @rmoff Photo by Ramiz Dedaković on Unsplash

Slide 81

Standing on the Shoulders of Streaming Giants ksqlDB Powered by Ease of use ksqlDB UDFs Kafka Streams Powered by Producer, Consumer APIs Flexibility @rmoff

Slide 82

Summary @rmoff

Slide 83

@rmoff

Slide 84

K V @rmoff

Slide 85

K V @rmoff

Slide 86

The Log @rmoff

Slide 87

Producer Consumer The Log @rmoff

Slide 88

Producer Consumer The Log Connectors @rmoff

Slide 89

Producer Consumer The Log Connectors Streaming Engine @rmoff

Slide 90

Apache Kafka Producer Consumer The Log Connectors Streaming Engine @rmoff

Slide 91

Producer Security Schema Registry Consumer The Log Streaming Engine ksqlDB REST Proxy Connectors Confluent Control Center

Slide 92

Fully Managed Kafka as a Service
RMOFF200
Free money! (additional $200 towards your bill 😄)
• T&C: https://www.confluent.io/confluent-cloud-promo-disclaimer

Slide 93

Deep-dive articles
• What is Apache Kafka?
• Event Streaming vs. Related Trends
• KRaft: Kafka without ZooKeeper
• Transactions & Guarantees in Kafka
• Processing & Storage Fundamentals
• Kafka Performance
• Cloud-native Kafka
• Streaming Database Systems
• Testing Apache Kafka
• Explore Kafka’s Internals
Over 10 hours of free courses
• Apache Kafka 101
• Kafka Connect 101
• Kafka Streams 101
• ksqlDB 101
• Inside ksqlDB
• Spring Framework and Kafka
• Building Data Pipelines with Kafka
• Event Sourcing with Kafka
• Data Mesh 101
Plus: Hands-on Quick Starts and Client Language Guides + Event Streaming Patterns + More
developer.confluent.io

Slide 94

#EOF @rmoff rmoff.dev/talks youtube.com/rmoff