FOUR Different Ways of Working with Kafka on Azure
@hpgrahsl | @Azure #GlobalAzure, 16th April 2021, Linz - Austria
Slide 2
Slide 3
Slide 4
Slide 5
Slide 6
Diminishing Value of Data
Slide 7
Diminishing Value of Data
Slide 8
Diminishing Value of Data
Slide 9
Diminishing Value of Data
Slide 10
Hans-Peter Grahsl
• based in Graz, Austria
• technical trainer at NETCONOMY
• independent engineer & consultant
• Confluent Community Catalyst
• MongoDB Champion
Slide 11
Stream Processing
Slide 12
“… data processing that is designed with infinite data sets in mind.” (Tyler Akidau)
Slide 13
☞ messaging ☞ integration ☞ processing
plus storage
Slide 14
Slide 15
Slide 16
Slide 17
Slide 18
central nervous system
Slide 19
Kafka with Azure HDInsight
Slide 20
Slide 21
HDInsight Services “Family”
• large-scale parallel batch processing
• general purpose data warehousing
• stream processing for IoT
• data science & machine learning
Slide 22
HDInsight Services “Family”
Slide 23
HDInsight Services “Family”
? Client Access ?
Slide 26
? Client Access ?
YES: !
• when run in same VNet
• with VNet peering + IP advertising
• from on-premises with VPN gateway
• by using Kafka REST proxy
Slide 27
Apache Kafka® HDInsight ✅
• main benefits:
  easy provisioning with flexible pricing
  open-source Kafka components only
  supported by Microsoft SLAs
Slide 28
Apache Kafka® HDInsight ⛔
• main drawbacks:
  outdated version (Kafka 2.1.1)
  only “core” Kafka components by default
  no external broker access
Slide 29
Azure Event Hubs for Kafka
Slide 30
Azure Event Hubs
• fully-managed PaaS
• distributed event ingestion service
• supports auto-scaling capabilities
• well-integrated with complementary services
Slide 31
The Big Picture
Slide 32
Look-alikes
“Conceptually, Kafka and Event Hubs are very similar: they’re both partitioned logs built for streaming data, whereby the client controls which part of the retained log it wants to read.”
Slide 33
Event Hubs for Kafka
• overlay on top of Event Hubs
• protocol compatible with Kafka 1.0+
• transparent re-use (code + tools)
• migration benefits in both directions
Slide 34
same same but different
Slide 35
The Virtual Promise…
“Update the connection string in configurations to point to the Kafka endpoint exposed by your event hub instead of pointing to your Kafka cluster. Then, you can start streaming events from your applications that use the Kafka protocol into Event Hubs.”
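As a rough illustration of what "updating the connection string" tends to mean in practice, the sketch below builds client settings that point at the Kafka endpoint of an Event Hubs namespace. The helper name `event_hubs_kafka_config` and the namespace and connection string values are hypothetical, and the keys follow the librdkafka naming used by clients such as confluent-kafka; other clients spell these settings differently.

```python
def event_hubs_kafka_config(namespace: str, connection_string: str) -> dict:
    """Build Kafka client settings targeting the Kafka endpoint of Event Hubs.

    Hypothetical helper for illustration; keys use librdkafka-style names.
    """
    return {
        # The Kafka endpoint of an Event Hubs namespace listens on port 9093.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        # Connections are authenticated with SASL PLAIN over TLS.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The user name is the literal string "$ConnectionString";
        # the password is the namespace connection string itself.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }


if __name__ == "__main__":
    # Placeholder values only; a real connection string comes from the Azure portal.
    config = event_hubs_kafka_config("my-namespace", "Endpoint=sb://placeholder/")
    print(config["bootstrap.servers"])
```

The resulting dictionary could be handed to a Kafka producer or consumer in place of the settings for a self-hosted cluster, which is the swap the quote describes.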
Slide 36
The devil is in the details
Slide 37
Unsupported Kafka Features !
• idempotent producers & transactions
• compression of messages
• size-based retention or log compaction
• HTTP access via Kafka REST proxy
• Kafka Streams & ksqlDB connections
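One way to catch some of these gaps before a migration is to scan an existing producer configuration for settings that rely on them. The sketch below is only an illustration: the function name `find_unsupported_settings` is made up, it covers just the first two items on the slide (idempotence, transactions, compression) via the standard Kafka producer keys, and it reflects the feature list as presented in this talk rather than the current state of the service.

```python
# Producer settings that depend on features listed as unsupported on the slide.
# Each entry maps a standard Kafka producer key to a check on its value.
UNSUPPORTED_SETTINGS = {
    "enable.idempotence": lambda value: str(value).lower() == "true",
    "transactional.id": lambda value: bool(value),
    "compression.type": lambda value: str(value).lower() not in ("", "none"),
}


def find_unsupported_settings(producer_config: dict) -> list:
    """Return the explicitly set keys that rely on an unsupported feature."""
    return sorted(
        key
        for key, is_used in UNSUPPORTED_SETTINGS.items()
        if key in producer_config and is_used(producer_config[key])
    )


if __name__ == "__main__":
    existing = {"acks": "all", "enable.idempotence": True, "compression.type": "gzip"}
    print(find_unsupported_settings(existing))
```

Only explicitly configured keys are inspected, so client-side defaults would need a separate look.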
Slide 38
Customer Feedback
! 10 hubs (=topics) per namespace https://bit.ly/3dvQCA1
! 1 MB message size limit https://bit.ly/3sQDlIN
! no Kafka Streams / ksqlDB connections https://bit.ly/3mi4hyu https://bit.ly/39Huu4s
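A quick pre-flight check against the two numeric limits above might look like the following sketch. The function `check_limits` is hypothetical, the constants simply restate the figures from the slide (which depend on the service tier and may have changed since), and reading "1 MB" as 1024 * 1024 bytes is an assumption.

```python
# Limits as quoted on the slide; assumed values, not authoritative quotas.
MAX_HUBS_PER_NAMESPACE = 10
MAX_MESSAGE_BYTES = 1024 * 1024  # assumption: "1 MB" interpreted as 1 MiB


def check_limits(topic_names: list, largest_payload: bytes) -> list:
    """Return human-readable problems for a planned workload, if any."""
    problems = []
    topic_count = len(set(topic_names))
    if topic_count > MAX_HUBS_PER_NAMESPACE:
        problems.append(
            f"{topic_count} topics exceed {MAX_HUBS_PER_NAMESPACE} hubs per namespace"
        )
    if len(largest_payload) > MAX_MESSAGE_BYTES:
        problems.append(
            f"payload of {len(largest_payload)} bytes exceeds {MAX_MESSAGE_BYTES} bytes"
        )
    return problems


if __name__ == "__main__":
    topics = [f"topic-{i}" for i in range(12)]
    for problem in check_limits(topics, b"x" * 100):
        print(problem)
```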
Slide 39
Event Hubs for Kafka ✅
• main benefits:
  hybrid messaging scenarios OOTB
  auto-inflate for elastic scaling
  “Azure-native & Kafka-like” experience
Slide 40
Event Hubs for Kafka ⛔
• main drawbacks:
  fundamental Kafka (protocol) features missing
  selected quotas & limits ➜ show-stoppers ?
  Kafka Streams / ksqlDB clients unsupported
Slide 41
Confluent Cloud on Azure
Slide 42
Confluent Cloud
• most complete and versatile service
• cloud-native with elastic scalability
• ready for hybrid & multi-cloud
Slide 43
Tiered Storage
• currently unique to Confluent Cloud
• infinite data growth
• retention time unlimited
! BUT NO Azure Blob Storage yet
Slide 46
provisioning options
Slide 47
Slide 48
Confluent Cloud on Azure ✅
• main benefits:
  fully-managed Kafka by its original creators
  ready for hybrid- / multi-cloud
  widest & smoothest ecosystem integration
Slide 49
Confluent Cloud on Azure ⛔
• main drawbacks:
  compare pricing ➜ not cheap
  underlying infra not customizable
  higher degree of vendor dependence
Slide 50
Kafka on Kubernetes
Slide 53
AKS Azure Kubernetes Service
Slide 54
remaining challenges:
Network, Storage, Security
Slide 55
Slide 56
Slide 57
Slide 58
• Operators (cluster / topic / user)
• Kafka Connect + managed Connectors
• replication with MirrorMaker
• HTTP Bridge for Kafka
• Cruise Control cluster balancing
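To give a feel for how the topic operator from the list above is driven, the sketch below generates a `KafkaTopic` custom resource as JSON, which `kubectl apply -f` accepts just like YAML. The helper `kafka_topic_manifest` and the names `orders` and `my-cluster` are hypothetical, and the `apiVersion` shown is an assumption that depends on the Strimzi release installed in the cluster.

```python
import json


def kafka_topic_manifest(name: str, cluster: str, partitions: int = 3, replicas: int = 3) -> dict:
    """Build a Strimzi KafkaTopic custom resource as a plain dictionary."""
    return {
        # Assumption: the API version varies with the installed Strimzi release.
        "apiVersion": "kafka.strimzi.io/v1beta2",
        "kind": "KafkaTopic",
        "metadata": {
            "name": name,
            # This label tells the topic operator which Kafka cluster owns the topic.
            "labels": {"strimzi.io/cluster": cluster},
        },
        "spec": {"partitions": partitions, "replicas": replicas},
    }


if __name__ == "__main__":
    # Hypothetical topic and cluster names for illustration.
    print(json.dumps(kafka_topic_manifest("orders", "my-cluster"), indent=2))
```

Declaring topics as resources like this keeps them in version control alongside the rest of the Kubernetes manifests, instead of creating them imperatively with CLI tools.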
Slide 59
Kafka on AKS with Strimzi ✅
• main benefits:
  k8s-native experience with built-in security
  tweakable / customizable in various ways
  ease of use for “non-ops-savvy folks” ➜ ME
Slide 60
Kafka on AKS with Strimzi ⛔
• main drawbacks:
  Kafka is OUR OWN responsibility
  k8s knowledge required despite “operator magic”
  no Microsoft support offering
Slide 61
don’t just roll the dice…
Slide 62
dig deeper & navigate further!
Slide 63
Thanks!
Q&A http://bit.ly/kafka-ga21