Embrace the Anarchy: Apache Kafka's Role in Modern Data Architectures
Robin Moffatt (@rmoff) / Confluent

$ whoami
• Developer Advocate @ Confluent
• Working in data & analytics since 2001
• Oracle Developer Champion
• Blogging: http://rmoff.net & http://cnfl.io/rmoff
• Twitter: @rmoff
• Geek stuff
• Beer & Fried Breakfasts
https://speakerdeck.com/rmoff/

Apache Kafka is a Streaming Platform

Why do we need a streaming platform?

One of the reasons: Decoupling

A case in point… Analytics

Analytics: In the beginning…
[Diagram: Sales, DWH]

And then there were more data sources…
[Diagram: Sales, Inventory, DWH]

Batch Transformations… (ETL / ELT)
[Diagram: Sales, Inventory, DWH]

Add a Data Lake…
[Diagram: Sales, Inventory, DWH, Data Lake]

…or Replace the Data Warehouse
[Diagram: Sales, Inventory, Data Lake]

Still need to do Batch transformations…
[Diagram: Sales, Inventory, Data Lake]

Want your data anytime!? Batch is latency built in by design.

The World has Changed
Microservices, Mobile, Machine Learning, Internet of Things

Lots of new technologies (whether you like it or not)

[Diagram: four Apps, search, Hadoop, DWH, monitoring, security, two MQs, two caches]

[Diagram: KAFKA with DWH, Hadoop, and eight Apps. Labels: request-response, messaging OR stream processing, streaming data pipelines, changelogs]

Apache Kafka is a Streaming Platform

Three Lenses

What is Apache Kafka?
01 Messaging Done Right
02 Scalable Streaming Data Pipelines
03 Foundation for Stream Processing

Lens 1: Messaging Done Right
Scalability, True Storage, Real-Time Processing

Lens 2: Scalable Streaming Data Pipelines

Lens 3: Foundation for Stream Processing
KSQL is the Streaming SQL Engine for Apache Kafka

The Streaming Platform

The Streaming Platform: Event-Driven, Scalable, Decoupled

Bold claim: all your data is event streams

A Customer Experience

A Sale

A Sensor Reading

An Application Log Entry

Databases

Do you think that’s a table you are querying?

The Table Stream Duality (rows in time order)

Stream                    Table
Account ID   Amount       Account ID   Balance
12345        +€50         12345        €50
12345        +€25         12345        €75
12345        -€60         12345        €15
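One way to read the duality slide is as a fold: replaying the stream of changes in order produces the table, and each intermediate state is a snapshot of the table at that moment. The sketch below is a minimal illustration using the account and amounts from the slide; the function name `materialize` and the in-memory dictionary are assumptions made for the example, not anything Kafka provides.

```python
from collections import defaultdict

# Stream: immutable (account_id, amount) change events, in arrival order.
stream = [("12345", 50), ("12345", 25), ("12345", -60)]


def materialize(events):
    """Fold a stream of changes into a table.

    Returns the final table plus a snapshot of it after every event.
    """
    table = defaultdict(int)
    snapshots = []
    for account_id, amount in events:
        table[account_id] += amount      # apply the change to current state
        snapshots.append(dict(table))    # the table as of this point in time
    return dict(table), snapshots


table, snapshots = materialize(stream)
print(snapshots)
print(table)
```

Replaying the same stream always rebuilds the same table, which is why the log can be treated as the source of truth and the table as a derived view.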

"The truth is the log. The database is a cache of a subset of the log."
Pat Helland, Immutability Changes Everything
http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf

A Brief Look at Kafka's Technology

Apache Kafka

Kafka: A Distributed Commit Log. Publish and subscribe to streams of records. Highly scalable, high throughput. Supports transactions. Persisted data. Stream processing.

Writes are append only. Reads are a single seek & scan.

Producer & Consumer APIs: Open-source client libraries for numerous languages, to directly integrate with your applications.
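The "append only" and "seek & scan" phrases can be pictured with a toy in-memory log. This is only a sketch of the idea: the `CommitLog` class and its method names are hypothetical and are not Kafka's client API, and a real broker persists partitioned, replicated segments on disk.

```python
class CommitLog:
    """Toy in-memory append-only log (hypothetical; not Kafka's API)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Write path: always add at the end and return the record's offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=100):
        """Read path: seek to an offset, then scan forward sequentially."""
        return self._records[offset:offset + max_records]


log = CommitLog()
for event in ["order-1", "order-2", "order-3"]:
    log.append(event)

first_two = log.read(0, max_records=2)   # seek to offset 0, scan two records
rest = log.read(2)                       # seek to offset 2, scan to the end
print(first_two, rest)
```

Because records are never updated in place, a reader only needs to remember one number, its offset, to resume where it left off.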

Apache Kafka

Kafka Connect API: Reliable and scalable integration of Kafka with other systems, no coding required.

Kafka Streams API: Write standard Java applications & microservices to process your data in real-time.

[Diagram: Orders, Customers, Table, Kafka Streams API]

KSQL is a Declarative Stream Processing Language

KSQL is the Streaming SQL Engine for Apache Kafka

KSQL in Development and Production

Interactive KSQL for development and testing (REST): "Hmm, let me try out this idea..."

Headless KSQL for Production: Desired KSQL queries have been identified

KSQL for Real-Time Monitoring

• Log data monitoring, tracking and alerting
• syslog data
• Sensor / IoT data

CREATE STREAM SYSLOG_INVALID_USERS AS
  SELECT HOST, MESSAGE
  FROM SYSLOG
  WHERE MESSAGE LIKE '%Invalid user%';

http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting

KSQL for Anomaly Detection

Identifying patterns or anomalies in real-time data, surfaced in milliseconds

CREATE TABLE possible_fraud AS
  SELECT card_number, count(*)
  FROM authorization_attempts
  WINDOW TUMBLING (SIZE 5 SECONDS)
  GROUP BY card_number
  HAVING count(*) > 3;
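To make the tumbling-window logic of the fraud query concrete, here is a rough Python equivalent that works over a finite list rather than a live stream. It assumes each attempt carries a millisecond timestamp, and the sample data and the name `possible_fraud` are made up for illustration. KSQL evaluates this continuously as events arrive, whereas this sketch computes the result in one pass.

```python
from collections import Counter


def possible_fraud(attempts, window_ms=5000, threshold=3):
    """Count attempts per card in fixed, non-overlapping 5-second windows.

    attempts: iterable of (timestamp_ms, card_number) pairs.
    Returns {(window_start_ms, card_number): count} for counts > threshold.
    """
    counts = Counter()
    for ts, card in attempts:
        window_start = ts - (ts % window_ms)   # align to the window boundary
        counts[(window_start, card)] += 1
    return {key: n for key, n in counts.items() if n > threshold}


# Hypothetical sample data: card-A has four attempts inside [0, 5000).
attempts = [
    (1000, "card-A"), (1500, "card-A"), (2000, "card-A"), (4900, "card-A"),
    (5100, "card-A"),                      # falls into the next window
    (1200, "card-B"), (3000, "card-B"),    # only two attempts, not flagged
]
flagged = possible_fraud(attempts)
print(flagged)
```

Tumbling windows never overlap, so each event contributes to exactly one window, which is what keeps the per-window count simple.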

KSQL for Streaming ETL

Joining, filtering, and aggregating streams of event data

CREATE STREAM vip_actions AS
  SELECT userid, page, action
  FROM clickstream c
  LEFT JOIN users u ON c.userid = u.user_id
  WHERE u.level = 'Platinum';
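The join in the streaming ETL query can be thought of as a per-event lookup against the current state of the users table, followed by a filter. The sketch below mimics that with a dictionary standing in for the table; the sample users and click events are invented for the example, and it processes a finite list rather than an unbounded stream.

```python
# Table side: latest known state per user, keyed by user_id (sample data).
users = {
    "u1": {"level": "Platinum"},
    "u2": {"level": "Gold"},
}

# Stream side: click events in arrival order (sample data).
clickstream = [
    {"userid": "u1", "page": "/home", "action": "view"},
    {"userid": "u2", "page": "/home", "action": "view"},
    {"userid": "u3", "page": "/help", "action": "view"},   # no matching user
    {"userid": "u1", "page": "/cart", "action": "add"},
]


def vip_actions(events, user_table):
    """Enrich each event with its user row, then keep Platinum users only."""
    for event in events:
        user = user_table.get(event["userid"])   # LEFT JOIN: None if no match
        if user is not None and user["level"] == "Platinum":
            yield {key: event[key] for key in ("userid", "page", "action")}


result = list(vip_actions(clickstream, users))
print(result)
```

Events with no matching user survive the left join but are then dropped by the level filter, which matches how a WHERE clause on the right-hand side behaves in SQL.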

What Problems does Kafka Solve?

Event-Centric Thinking
[Diagram: Web app, Streaming Platform, Hadoop. Event: "A product was viewed"]

Event-Centric Thinking
[Diagram: Web app, mobile app, APIs, Streaming Platform, Hadoop. Event: "A product was viewed"]

Event-Centric Thinking
[Diagram: mobile app, web app, APIs, Streaming Platform, Hadoop, Security, Monitoring, Rec engine. Event: "A product was viewed"]

System Availability and Event Buffering
[Diagram, shown over two slides: Producer, Consumer]

Varying Latency Requirements / Batch vs Stream
[Diagram, built up over six slides: a Producer and Consumer A with a 24hr batch extract; Consumer B is added; a Realtime feed is added; finally two Realtime feeds appear alongside the 24hr batch extract]
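The batch-versus-stream slides suggest that consumers with different latency needs can read the same data at their own pace. A minimal way to picture this, assuming each consumer simply tracks its own offset into a shared log, is sketched below. The `Consumer` class and the polling schedule are hypothetical and only illustrate the decoupling, not Kafka's consumer API.

```python
class Consumer:
    """Toy consumer that tracks its own position in a shared log."""

    def __init__(self, name):
        self.name = name
        self.offset = 0      # next position to read from
        self.seen = []

    def poll(self, log):
        """Return everything appended since this consumer's last poll."""
        new_records = log[self.offset:]
        self.offset = len(log)
        self.seen.extend(new_records)
        return new_records


log = []                                  # shared append-only list of events
realtime = Consumer("realtime")
batch = Consumer("batch")

realtime_poll_sizes = []
for i in range(6):
    log.append(f"event-{i}")              # producer writes one event
    realtime_poll_sizes.append(len(realtime.poll(log)))   # read immediately

batch_poll = batch.poll(log)              # batch consumer reads once, later
print(realtime_poll_sizes, len(batch_poll))
```

Both consumers end up with the same events; the only difference is when they asked for them, and the producer never needs to know about either.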

Technology & Code/Algo Version Changes
[Diagram, built up over three slides: Producer with Consumer (v1); Consumer (v2) is added alongside; then only Consumer (v2) remains]

Architectural Patterns with Apache Kafka

Building for the Future

Tightly-coupled = Inflexible

Analytics - Database Offload
[Diagram: RDBMS, CDC, HDFS / S3 / BigQuery etc]

Stream Processing with Apache Kafka and KSQL
[Diagram: RDBMS, CDC, order events, customer, Stream Processing, customer orders]

Real-time Event Stream Enrichment
[Diagram: RDBMS, CDC, order events, customer, Stream Processing, customer orders, <y>]

Transform Once, Use Many
[Diagram: RDBMS, CDC, order events, customer, Stream Processing, customer orders, <y>, New App <x>]

Transform Once, Use Many
[Diagram: RDBMS, CDC, order events, customer, Stream Processing, customer orders, <y>, HDFS / S3 / etc, New App <x>]

Evolve processing from old systems to new
[Diagram: RDBMS, CDC, Existing App, Stream Processing, New App <x>]

Evolve processing from old systems to new
[Diagram: RDBMS, CDC, Existing App, Stream Processing, New App <x>, New App <y>]

Want your data anytime!? Batch is latency built in by design.
You say that like "latency" is a synonym for "evil"

It's all about the Events!

So… Analytics and Kafka

The Vision! "One version of the truth"

The Reality…

Pragmatism is… "One version of the truth"

[Diagram: Streaming Platform, Stream Processing, "One version of the truth"]

[Diagram: Streaming Platform, Stream Processing, "One version of the truth", ML, App <y>, NoSQL, Search, Graph]

Confluent Open Source: Apache Kafka with a bunch of cool stuff! For free!

Confluent Platform
Diagram labels: Database Changes, Log Events, IoT Data, Web Events, …; CRM, Data Warehouse, Database, Hadoop, Data Integration, …; Monitoring, Analytics, Custom Apps, Transformations, Real-time Applications, …

Apache Open Source
  Apache Kafka®: Core | Connect API | Streams API

Confluent Open Source
  Data Compatibility: Schema Registry
  Development and Connectivity: Clients | Connectors | REST Proxy | CLI
  SQL Stream Processing: KSQL

Confluent Enterprise
  Monitoring & Administration: Confluent Control Center | Security
  Operations: Replicator | Auto Data Balancing

Free Books!
https://www.confluent.io/apache-kafka-stream-processing-book-bundle

Confluent Streaming Event, Munich
http://cnfl.io/streaming-event-munich

@rmoff
robin@confluent.io
https://www.confluent.io/download/
http://cnfl.io/slack

Resources
• CDC Spreadsheet
• Blog: No More Silos: How to Integrate your Databases with Apache Kafka and CDC
• #partner-engineering on Slack for questions
• BD team (#partners / partners@confluent.io) can help with introductions on a given sales op