4 Different Ways of Working with Kafka on Azure

A presentation at Global Azure 2021 in April 2021 by Hans-Peter Grahsl

Slide 1

FOUR Different Ways of Working with Kafka on Azure
@hpgrahsl | @Azure #GlobalAzure, 16th April 2021, Linz - Austria

Slide 2


Slide 3


Slide 4


Slide 5


Slide 6

Diminishing Value of Data

Slide 7

Diminishing Value of Data

Slide 8

Diminishing Value of Data

Slide 9

Diminishing Value of Data

Slide 10

Hans-Peter Grahsl
• based in Graz, Austria
• technical trainer at NETCONOMY
• independent engineer & consultant
• Confluent Community Catalyst
• MongoDB Champion

Slide 11

Stream Processing

Slide 12

“… data processing that is designed with infinite data sets in mind.” (Tyler Akidau)

Slide 13

☞ messaging
☞ integration
☞ processing
plus storage

Slide 14


Slide 15


Slide 16


Slide 17


Slide 18

central nervous system

Slide 19

Kafka with Azure HDInsight

Slide 20


Slide 21

HDInsight Services “Family”
• large-scale parallel batch processing
• general purpose data warehousing
• stream processing for IoT
• data science & machine learning

Slide 22

HDInsight Services “Family”

Slide 23

HDInsight Services “Family”

Slide 24

HDInsight Apache Kafka®
• broker + zookeeper nodes
• managed disks / storage
• flexible provisioning
• 99.9% SLA uptime

Slide 25

Client Access?

Slide 26

Client Access?
YES:
• when run in same VNet
• with VNet peering + IP advertising
• from on-premises with VPN gateway
• by using Kafka REST proxy

Slide 27

Apache Kafka® HDInsight ✅
main benefits:
• easy provisioning with flexible pricing
• open-source Kafka components only
• supported by Microsoft SLAs

Slide 28

Apache Kafka® HDInsight ⛔
main drawbacks:
• outdated version (Kafka 2.1.1)
• only “core” Kafka components per default
• no external broker access

Slide 29

Azure Event Hubs for Kafka

Slide 30

Azure Event Hubs
• fully-managed PaaS
• distributed event ingestion service
• supports auto-scaling capabilities
• well-integrated with complementary services

Slide 31

The Big Picture

Slide 32

Look-alikes
“Conceptually, Kafka and Event Hubs are very similar: they’re both partitioned logs built for streaming data, whereby the client controls which part of the retained log it wants to read.”
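The "partitioned log" idea in the quote can be illustrated with a small toy model. This is only a sketch: `PartitionedLog`, its methods, and the CRC32-based key hashing are hypothetical names and choices made for illustration, not how Kafka or Event Hubs are implemented.

```python
import zlib
from collections import defaultdict
from typing import List, Tuple


class PartitionedLog:
    """Toy in-memory partitioned log (illustration only, not a real client)."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        self._partitions = defaultdict(list)  # partition id -> list of records

    def append(self, key: str, value: str) -> Tuple[int, int]:
        """Append a record; records with the same key land in the same partition."""
        partition = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        records = self._partitions[partition]
        records.append(value)
        return partition, len(records) - 1  # (partition, offset)

    def read(self, partition: int, offset: int, max_records: int = 10) -> List[str]:
        """Return records starting at the offset the client asks for.

        Reading does not remove anything; the client decides where to start.
        """
        return self._partitions[partition][offset:offset + max_records]


log = PartitionedLog(num_partitions=4)
partition, _ = log.append("sensor-1", "t=20.5")
log.append("sensor-1", "t=20.7")
log.append("sensor-1", "t=21.0")
print(log.read(partition, offset=1))  # re-read from a chosen offset
print(log.read(partition, offset=0))  # or replay everything that is retained
```

The point of the sketch is the read path: the log keeps the records, and each reader tracks its own position instead of the broker deleting messages on delivery.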

Slide 33

Event Hubs for Kafka
• overlay on top of Event Hubs
• protocol compatible with Kafka 1.0+
• transparent re-use (code + tools)
• migration benefits in both ways

Slide 34

same same but different

Slide 35

The Virtual Promise…
“Update the connection string in configurations to point to the Kafka endpoint exposed by your event hub instead of pointing to your Kafka cluster. Then, you can start streaming events from your applications that use the Kafka protocol into Event Hubs.”
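As a rough sketch of what "update the connection string" tends to mean in practice, the helper below builds client properties in the style used by the Java Kafka client: the namespace's Kafka endpoint on port 9093, SASL_SSL with the PLAIN mechanism, and the literal user name `$ConnectionString`. The function name, the namespace `my-namespace`, and the connection string value are hypothetical placeholders; clients that are not Java-based typically use different property names for the credentials.

```python
from typing import Dict


def event_hubs_kafka_config(namespace: str, connection_string: str) -> Dict[str, str]:
    """Build Java-client-style Kafka properties that point at an Event Hubs namespace."""
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="$ConnectionString" password="{connection_string}";'
    )
    return {
        # Kafka endpoint of the namespace, instead of the Kafka brokers
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.jaas.config": jaas,
    }


def to_properties(config: Dict[str, str]) -> str:
    """Render the settings in .properties file syntax."""
    return "\n".join(f"{key}={value}" for key, value in config.items())


# Placeholder values, for illustration only.
config = event_hubs_kafka_config(
    namespace="my-namespace",
    connection_string="Endpoint=sb://my-namespace.servicebus.windows.net/;SharedAccessKeyName=...",
)
print(to_properties(config))
```

Application code that produces or consumes stays unchanged in this picture; only these connection-level settings are swapped.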

Slide 36

The devil is in the details

Slide 37

Unsupported Kafka Features
• idempotent producers & transactions
• compression of messages
• size-based retention or log compaction
• HTTP access via Kafka REST proxy
• Kafka Streams & ksqlDB connections
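Before pointing an existing application at Event Hubs, its configuration could be screened against the list above. The sketch below does that for a few standard Kafka producer and topic settings. The function names are hypothetical, the checks mirror the slide as of April 2021 (the service may have changed since), and a setting is only flagged when it is set explicitly, since client defaults vary between Kafka versions.

```python
from typing import Dict, List


def find_unsupported_producer_settings(producer_config: Dict[str, str]) -> List[str]:
    """Flag explicitly set producer options that rely on features the slide lists as unsupported."""
    findings = []
    if producer_config.get("enable.idempotence", "false").lower() == "true":
        findings.append("enable.idempotence=true (idempotent producer)")
    if "transactional.id" in producer_config:
        findings.append("transactional.id is set (transactions)")
    compression = producer_config.get("compression.type", "none").lower()
    if compression != "none":
        findings.append(f"compression.type={compression} (message compression)")
    return findings


def find_unsupported_topic_settings(topic_config: Dict[str, str]) -> List[str]:
    """Flag topic options for log compaction or size-based retention."""
    findings = []
    if "compact" in topic_config.get("cleanup.policy", "delete"):
        findings.append("cleanup.policy includes compact (log compaction)")
    if topic_config.get("retention.bytes", "-1") != "-1":
        findings.append("retention.bytes is set (size-based retention)")
    return findings


producer = {"acks": "all", "enable.idempotence": "true", "compression.type": "lz4"}
topic = {"cleanup.policy": "compact"}
for finding in find_unsupported_producer_settings(producer) + find_unsupported_topic_settings(topic):
    print("check before migrating:", finding)
```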

Slide 38

Customer Feedback
• 10 hubs (=topics) per namespace https://bit.ly/3dvQCA1
• 1 MB message size limit https://bit.ly/3sQDlIN
• no Kafka Streams / ksqlDB connections https://bit.ly/3mi4hyu https://bit.ly/39Huu4s
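The two numeric limits can be turned into a quick pre-migration check. This is a sketch under assumptions: the constants are taken from the slide (April 2021) and may differ by pricing tier or have changed since, reading "1 MB" as 1024 * 1024 bytes is an assumption, and the function names and sample topics are hypothetical.

```python
from typing import List

MAX_HUBS_PER_NAMESPACE = 10          # "10 hubs (=topics) per namespace" from the slide
MAX_MESSAGE_BYTES = 1 * 1024 * 1024  # "1 MB" from the slide; exact byte count is assumed


def topics_fit_one_namespace(topics: List[str]) -> bool:
    """True if the distinct topics fit within the per-namespace hub limit."""
    return len(set(topics)) <= MAX_HUBS_PER_NAMESPACE


def oversized_payloads(payloads: List[bytes]) -> List[int]:
    """Return the indexes of payloads larger than the message size limit."""
    return [i for i, payload in enumerate(payloads) if len(payload) > MAX_MESSAGE_BYTES]


topics = [f"orders-{region}" for region in ("eu", "us", "apac")]
samples = [b"x" * 512, b"x" * (2 * 1024 * 1024)]
print("topics fit:", topics_fit_one_namespace(topics))
print("oversized sample indexes:", oversized_payloads(samples))
```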

Slide 39

Event Hubs for Kafka ✅
main benefits:
• hybrid messaging scenarios OOTB
• auto-inflate for elastic scaling
• “Azure-native & Kafka-like” experience

Slide 40

Event Hubs for Kafka ⛔
main drawbacks:
• fundamental Kafka (protocol) features missing
• selected quotas & limits ➜ show-stoppers?
• Kafka Streams / ksqlDB clients unsupported

Slide 41

Confluent Cloud on Azure

Slide 42

Confluent Cloud
• most complete and versatile service
• cloud-native with elastic scalability
• ready for hybrid & multi-cloud

Slide 43


Slide 44

Confluent Cloud hosts fully-managed:
• Kafka Connect
• 100+ Connectors
• ksqlDB
• Schema Registry

Slide 45

Tiered Storage
• currently unique to Confluent Cloud
• infinite data growth
• retention time unlimited
BUT NO Azure Blob Storage yet

Slide 46

provisioning options

Slide 47


Slide 48

Confluent Cloud on Azure ✅
main benefits:
• fully-managed Kafka by its original creators
• ready for hybrid- / multi-cloud
• widest & smoothest ecosystem integration

Slide 49

Confluent Cloud on Azure ⛔
main drawbacks:
• compare pricing ➜ not cheap
• underlying infra not customizable
• higher degree of vendor dependence

Slide 50

Kafka on Kubernetes

Slide 51

Kubernetes
• open-source container orchestration
• deploying / managing / scaling
• CNCF graduate project

Slide 52


Slide 53

AKS (Azure Kubernetes Service)

Slide 54

remaining challenges:
• Network
• Storage
• Security

Slide 55


Slide 56


Slide 57


Slide 58

• Operators (cluster / topic / user)
• Kafka Connect + managed Connectors
• replication with MirrorMaker
• HTTP Bridge for Kafka
• Cruise Control cluster balancing
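To make the operator model a little more concrete: with Strimzi, a cluster is described declaratively as a `Kafka` custom resource, which the cluster operator reconciles, while the `entityOperator` section switches on the topic and user operators. The sketch below builds such a resource as a Python dict and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The field layout follows Strimzi's published examples for the `v1beta2` API from around the time of the talk and may differ in newer releases; the cluster name, replica counts, and storage sizes are placeholder assumptions.

```python
import json
from typing import Any, Dict


def strimzi_kafka_manifest(name: str, kafka_replicas: int = 3) -> Dict[str, Any]:
    """Build a minimal Strimzi `Kafka` custom resource (sketch, placeholder sizes)."""
    return {
        "apiVersion": "kafka.strimzi.io/v1beta2",
        "kind": "Kafka",
        "metadata": {"name": name},
        "spec": {
            "kafka": {
                "replicas": kafka_replicas,
                "listeners": [
                    # cluster-internal, unencrypted listener; illustration only
                    {"name": "plain", "port": 9092, "type": "internal", "tls": False},
                ],
                "storage": {"type": "persistent-claim", "size": "100Gi"},
            },
            "zookeeper": {
                "replicas": 3,
                "storage": {"type": "persistent-claim", "size": "10Gi"},
            },
            # empty objects enable the topic and user operators with defaults
            "entityOperator": {"topicOperator": {}, "userOperator": {}},
        },
    }


manifest = strimzi_kafka_manifest("my-cluster")
print(json.dumps(manifest, indent=2))
```

Generating the resource from code is just one option; the same structure is usually kept as a YAML file under version control.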

Slide 59

Kafka on AKS with Strimzi ✅
main benefits:
• k8s-native experience with built-in security
• tweakable / customizable in various ways
• ease of use for “non-ops-savvy folks” ➜ ME

Slide 60

Kafka on AKS with Strimzi ⛔
main drawbacks:
• Kafka is OUR OWN responsibility
• k8s knowledge despite “operator magic”
• no Microsoft support offering

Slide 61

don’t just roll the dice…

Slide 62

dig deeper & navigate further!

Slide 63

Thanks! Q&A
http://bit.ly/kafka-ga21