FLANK Stack with Flink for Streaming Use Cases

A presentation at ApacheCon Asia in August 2021 by Tim Spann

Slide 1

STREAMING FLANK STACK WITH FLINK FOR STREAMING USE CASES
Timothy Spann, Developer Advocate

Slide 2

FLaNK and FLiP Stacks

FLaNK:
● Apache Flink
● Apache NiFi
● Apache Kafka

FLiP:
● Apache Flink
● Apache Pulsar
● StreamNative’s Flink Connector for Pulsar
● Apache +++

Apache projects are the way for all streaming use cases.

Slide 3

FLiP Stack (FLink-integrate-Pulsar)

StreamNative’s Flink Connector for Pulsar is the bridge to FLiP your streams at speed.

https://hub.streamnative.io/data-processing/pulsar-flink/2.7.0/

Slide 4

WHAT IS APACHE PULSAR?

Apache Pulsar is an open source, cloud-native distributed messaging and streaming platform.

Slide 5

APACHE PULSAR

Enable Geo-Replicated Messaging

● Pub-Sub
● Geo-Replication
● Pulsar Functions
● Horizontal Scalability
● Multi-tenancy
● Tiered Persistent Storage
● Pulsar Connectors
● REST API
● CLI
● Many clients available
● Four Different Subscription Types
● Multi-Protocol Support
  ○ MQTT
  ○ AMQP
  ○ JMS
  ○ Kafka
  ○ …

https://hub.streamnative.io/
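
The four subscription types mentioned above are Exclusive, Failover, Shared, and Key_Shared. As a rough illustration only, the sketch below simulates in plain Python how each type might route messages to consumers. It does not use the Pulsar client; the function name `dispatch`, the consumer names, and the toy key hash are all assumptions made for this example.

```python
from collections import defaultdict


def dispatch(messages, consumers, mode):
    """Assign (key, payload) messages to consumer names under one subscription mode.

    Hypothetical helper for illustration; not part of any Pulsar API.
    """
    if not consumers:
        raise ValueError("at least one consumer is required")
    delivered = defaultdict(list)
    if mode == "exclusive":
        # Only a single consumer may attach to an exclusive subscription.
        if len(consumers) > 1:
            raise ValueError("exclusive subscription allows one consumer")
        delivered[consumers[0]] = list(messages)
    elif mode == "failover":
        # One active consumer receives everything; the others are standbys.
        delivered[consumers[0]] = list(messages)
    elif mode == "shared":
        # Messages are spread round-robin across all attached consumers.
        for i, msg in enumerate(messages):
            delivered[consumers[i % len(consumers)]].append(msg)
    elif mode == "key_shared":
        # All messages with the same key go to the same consumer.
        # Assumption: a toy byte-sum hash stands in for the broker's key hashing.
        for key, payload in messages:
            slot = sum(key.encode()) % len(consumers)
            delivered[consumers[slot]].append((key, payload))
    else:
        raise ValueError(f"unknown mode: {mode}")
    return dict(delivered)
```

With two consumers, "shared" alternates messages between them regardless of key, while "key_shared" keeps every message for a given key on one consumer, which preserves per-key ordering.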

Slide 6

WHAT IS APACHE FLINK?

3B+ data points daily streaming in from 25 million customers running real-time machine learning prediction.

USE CASE
Streaming real-time data pipelines that need to handle complex stream or batch data event processing, analytics, and/or support event-driven applications. Event-time window job with state and connectors for basic writes to HDFS, Pulsar, Kafka. Need event-at-a-time/microbatch, stateful/stateless operations, and exactly-once or at-least-once processing.

TECHNOLOGY
● Flink performs compute at in-memory speed at any scale
● Flink parses SQL using Apache Calcite, which supports standard ANSI SQL
● Flink runs standalone, on YARN and Kubernetes
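
To make "event-time window job with state" more concrete, here is a minimal plain-Python sketch of a tumbling event-time window count. It is not Flink code: the function name and the event shape are assumptions for illustration, and it leaves out watermarks, late-data handling, and checkpointing, which a real Flink job would rely on.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_ms):
    """Count events per (window, key) using each event's own timestamp.

    events: iterable of (event_time_ms, key) pairs, possibly out of order.
    Returns {(window_start_ms, key): count}.
    Hypothetical helper for illustration; not a Flink API.
    """
    state = defaultdict(int)  # keyed state: one counter per (window, key)
    for event_time_ms, key in events:
        # Align the event time to the start of its window.
        window_start = event_time_ms - (event_time_ms % window_ms)
        state[(window_start, key)] += 1
    return dict(state)
```

Because the window is derived from the timestamp carried by the event rather than from arrival time, an event that arrives out of order still lands in the window it belongs to.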

Slide 7

Flink SQL

● Streaming Analytics
● Continuous SQL
● Continuous ETL
● Kafka, Pulsar and More…
● Complex Event Processing
● Standard SQL Powered by Apache Calcite
● Deployed Apache Flink Apps on YARN
● Scalable Stream Processing

https://www.datainmotion.dev/2021/04/cloudera-sql-stream-builder-ssb-updated.html
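
"Continuous SQL" means the query never finishes: its result is updated as each new row arrives. The plain-Python sketch below imitates that behaviour for a query shaped like `SELECT key, COUNT(*) FROM rows GROUP BY key`, emitting a changelog with the row-kind labels Flink uses (`+I` insert, `-U` update-before, `+U` update-after). The function name and input shape are assumptions for this example; it is not Flink code.

```python
def continuous_count(rows):
    """Yield changelog entries for a continuously updated COUNT(*) per key.

    rows: iterable of keys arriving over time.
    Hypothetical helper for illustration; not a Flink API.
    """
    counts = {}
    for key in rows:
        previous = counts.get(key)
        counts[key] = (previous or 0) + 1
        if previous is None:
            # First result row for this key.
            yield ("+I", key, 1)
        else:
            # Retract the old result, then emit the updated one.
            yield ("-U", key, previous)
            yield ("+U", key, counts[key])
```

A downstream consumer can rebuild the current result table at any moment by applying these entries in order.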

Slide 8

ALL DATA - ANYTIME - ANYWHERE - MULTI-CLOUD - MULTI-PROTOCOL

[Diagram labels: Multi-ingest, Merge, Priority]

Slide 9

DEEPER CONTENT

● https://www.datainmotion.dev/2020/10/running-flink-sql-against-kafka-using.html
● https://www.datainmotion.dev/2020/10/top-25-use-cases-of-cloudera-flow.html

● https://github.com/tspannhw/EverythingApacheNiFi
● https://github.com/tspannhw/CloudDemo2021
● https://github.com/tspannhw/StreamingSQLExamples
● https://github.com/tspannhw/SpeakerProfile/blob/main/2021/talks/UsingFLaNKStackEdgetspann2021.pdf
● https://www.linkedin.com/pulse/2021-schedule-tim-spann/

Slide 10

THANK YOU

QUESTIONS?

@PaasDev
timothyspann
https://www.pulsardeveloper.com/