7 Reasons to use Apache Flink for your IoT Project: How We Built a Real-time Asset Tracking System

A presentation at ApacheCon Europe 2019 in October 2019 in Berlin, Germany by Marta Paes

Slide 1

Slide 1

7 REASONS TO USE APACHE FLINK for your IoT Project How We Built a Real-time Asset Tracking System Marta Paes Moreira & Jakub Piasecki | ApacheCon Europe | October 2019

Slide 2

Slide 2

Marta Paes Moreira Jakub Piasecki @morsapaes (Twitter) morsapaes (LinkedIn) @jakubpiasecki (Twitter) jakubpiasecki (LinkedIn) Product Marketing Manager Ververica (formerly data Artisans) Director of Technology / Software Architect Freeport Metrics

Slide 3

Slide 3

Agenda What is special about IoT data and applications? Why is Apache Flink a good fit for your IoT use case? Building an Asset Tracking System with Flink More IoT use cases for Flink

Slide 4

Slide 4

What is so special about IoT data and applications?

Slide 5

Slide 5

How much IoT data have you generated this month?

Slide 6

Slide 6

Slide 7

Slide 7

How much IoT data have you generated this week?

Slide 8

Slide 8

Slide 9

Slide 9

How much IoT data have you generated today?

Slide 10

Slide 10

Slide 11

Slide 11

IoT Data Everyday life Smart home devices, wearables, smart vehicles, mobile phone sensors Industrial use cases Measuring solar farm data generation, tracking airport luggage, sensors at production line More and more IoT use cases, sensors and devices Predictive maintenance Smart buildings Smart cities, utilities Cashier-less stores Healthcare wearables Machine sensors, GPS, RFID tags, beacons, cameras, microphones

Slide 12

Slide 12

IoT Data Properties Machine generated data can be huge 500ZB of data produced in 2019 driven by IOT [Cisco’s research] (1ZB = 10^21 B) More data produced than can be stored in data centers 1 GPS sensor reporting location every 5s produces ~6 mln events/year Transmission latencies / out-of-order data Mobile network connectivity Gateway devices Buffering Failures Continuous flow

Slide 13

Slide 13

Applications need to react quickly User don’t want and sometimes cannot wait IoT data is gathered in the real world Applications need to react quickly to trigger reactions in the real world Send an alert when a valuable asset is leaving a building Stop a machine when sensor data indicates an expensive failure in the near future Steering city traffic in real time

Slide 14

Slide 14

Apache Flink The Perfect Fit for IoT Data

Slide 15

Slide 15

What is Apache Flink?

Slide 16

Slide 16

Flink is a distributed system for stateful stream processing. Data can be processed in parallel with low latency.

Slide 17

Slide 17

Reasons to Use It

Slide 18

Slide 18

1 Events are processed with low latency* *Latency depends on many factors

Slide 19

Slide 19

State is always locally maintained and accessed ● In-memory state backend Flink’s network stack is optimized for low latency (and high throughput) ● Credit-based flow control mechanism Support for asynchronous and incremental checkpoints ● Applications can be scaled to many machines to reduce machine load

Slide 20

Slide 20

2 Applications scale to huge data volumes

Slide 21

Slide 21

Streams are partitioned to distribute data and computations Flink applications can run on 10000+ of cores ● Flink applications can process 5 trillion events per day (~57 million events/ second) Application state is partitioned (similar to key-value stores) ● ● Flink application can consistently maintain 20+TB of state Application state can be stored on disk (if necessary) Applications can be scaled in and out

Slide 22

Slide 22

3 Low quality IoT data is handled very well

Slide 23

Slide 23

Delayed or out-of-order data is correctly handled due to event-time processing ● ● ● Watermarks control the logical time of an application Applications can choose to wait for, redirect, or discard delayed data Trade-off result completeness and latency Inaccurate or imprecise sensor data (GPS, temperature, …) is easily smoothed ● Built-in window functions make smoothing really simple

Slide 24

Slide 24

4 Failure recovery and highly-available setup

Slide 25

Slide 25

Checkpointing guarantees state consistency in case of failures ● ● Applications are automatically recovered with exactly-once state guarantees Output consistency (aka end-to-end exactly-once) can be provided as well* Resource managers restart failed Flink processes ● Kubernetes, YARN, Mesos *Available guarantees depend on sink system

Slide 26

Slide 26

5 Define and match patterns on event streams

Slide 27

Slide 27

Define REGEX-like patterns and match them on event streams ● ● ● “Identify orders that are not completed in time” Java & Scala Complex Event Processing (CEP) library based on DataStream API Implementation of SQL:2016’s MATCH_RECOGNIZE clause Combine pattern matching with data analytics ● ● “Count how often a pattern matched within 5 minutes” “Match pattern if an two events occurred more than 10 times within 5 minutes” Perfect match for many IoT use cases

Slide 28

Slide 28

6 Well-connected with external systems

Slide 29

Slide 29

Large library of community maintained connectors ● ● ● Messaging systems: Apache Kafka, AWS Kinesis, Apache Pulsar, RabbitMQ, … Databases & KV Stores: Cassandra, Elasticsearch, JDBC Files: HDFS, S3, Parquet, ORC, Avro

Slide 30

Slide 30

7 Data streaming is (conceptually) simple

Slide 31

Slide 31

Stream processing is the natural way of handling IoT events ● Events are processed one by one Flink’s DataStream API makes stream processing easy ● ● ● Gives access to core ingredients: State & Time Offers a lot of built-in operators (Windows, Joins, CEP) Applications can be parallelized and scaled to any size

Slide 32

Slide 32

ItemAware An Asset Tracking System Built with Flink

Slide 33

Slide 33

Freeport Metrics | Our Background B2B Digital products development Strong data processing and data analytics roots We have worked with ● ● ● ● Solar and wind farms data Inventory and warehouse systems Automated retail kiosks Sustainability reporting (e.g. carbon footprint) Before Flink we relied mostly on a mix of traditional ETL tools and our own custom solutions

Slide 34

Slide 34

Asset Tracking System: The Challenge

Slide 35

Slide 35

Example use cases ● ● ● Inventory management Tracking shipments in warehouses Hospital dashboards for families in waiting rooms Multiple data sources ● ● ● RFID tags & antennas Hand-held barcode and RFID scanners User mobile and web applications 100s to 100s of thousands assets tracked in real time

Slide 36

Slide 36

Event Time Ordering data from multiple sources by event time ● ● ● RFID Antennas Handheld RFID scanners Web and Mobile UI Windowing - cleaning data e.g. overlapping antennas

Slide 37

Slide 37

State Partitioning - all events for a single tag can be grouped together Parallelizable by default - critical for performance Complex event processing and business process modeling ● If a patient ○ entered the waiting room ○ and then entered the doctor’s office ○ and then entered the recovery area and 10 minutes passed ○ => the procedure is over

Slide 38

Slide 38

Dev Experience We achieved our goals - functionality and performance We could focus on logic Flexibility - high level abstractions to low level functions Good integrations with external tools - Kafka Challenges ● ● ● New ‘way of thinking’ - not just another framework Adding new features requires careful planning for all affected parts of the system 3 years ago, learning materials were limited - not the case anymore

Slide 39

Slide 39

More IoT Use Cases for Flink

Slide 40

Slide 40

Data-driven agriculture @John Deere Photo by David Wright (https://flic.kr/p/ojF3Ai), (CC BY 2.0)

Slide 41

Slide 41

John Deere ● ● Manufacturer of machines for agriculture, construction, and forestry (Fortune 500) Runs a data platform to provide data services for farmers Data-driven agriculture ● ● The data platform receives and processes 4 Billion geospatial events per day ○ A single planting machine produces 2400 sensor measurements per second ○ Low connectivity in rural areas, spiky data reception Data is rasterized, time-windowed, and stored in a data lake for later analysis & visualization Setup ● ● Kinesis -> Flink -> S3 / DynamoDB Flink on AWS EMR https://sf-2019.flink-forward.org/conference-program#how-john-deere-uses-flink-to-process-millions-of-sensor-measurements-per-second

Slide 42

Slide 42

Living Maps @Here Photo by Jorge Franganillo (https://flic.kr/p/mpnTqF), (CC BY 2.0)

Slide 43

Slide 43

Here ● ● Provides mapping and location data and related services Open location platform with a data marketplace and Flink as a service Building living maps ● ● ● Enhance maps with live data: slippery roads, variable signs, accidents, parking spots Process IoT events from car sensors: GPS, slope, wheel, side distance, sign detectors Flink application enriches events with location context https://berlin-2018.flink-forward.org/conference-program#real-time-processing-of-noisy-data-from-connected-vehicles

Slide 44

Slide 44

Fleet Management @Trackunit Photo by MGI Construction Corp. (https://flic.kr/p/EEux6C), (CC BY-ND 2.0)

Slide 45

Slide 45

Trackunit ● ● ● Provides telematics solutions and fleet management systems for the construction industry Offers IoT services to optimize the daily operations of its customers Creates both hardware and software solutions within telematics and industrial IoT Fleet management in the construction industry ● ● Pipeline to process telematic IoT data Track locations, idleness and usage patterns, maintenance intervals of machines Setup ● ● Kinesis -> Flink -> Cassandra Flink on AWS EMR https://www.ververica.com/blog/trackunit-leverages-flink-industrial-iot

Slide 46

Slide 46

Bushfire Detection @AWS blog (demo) Photo by bertknot (https://flic.kr/p/dwVX5b), (CC BY-SA 2.0)

Slide 47

Slide 47

Not a real-world application ● Blog post describes the use case and how to run the demo on AWS Bushfire detection ● ● ● Demo assumes a multi-hop wireless mesh network of long-lived battery powered IoT sensors ○ Sensors propagate the temperature readings to neighbors until a gateway is reached Demo uses Flink’s CEP library to detect bushfire patterns from incoming temperature events A real-time heat-map visualization shows the area under surveillance Setup ● ● Kinesis -> Flink -> Amazon ES / Kibana Flink on AWS EMR https://aws.amazon.com/blogs/big-data/real-time-bushfire-alerting-with-complex-event-processing-in-apache-flink-on-amazon-emr-and -iot-sensor-network/

Slide 48

Slide 48

Conclusion

Slide 49

Slide 49

Flink meets the high demands of IoT applications.

Slide 50

Slide 50

Performance & Failure Handling ● ● Handles latency requirements and huge data volumes Automatically recovers from failures and guarantees consistency Application logic ● ● ● High-level API to ease implementation of complex logic Low-level API to implement any functionality Event-time processing to control timing issues of IoT events Flink is being used for many IoT projects. Try Flink for your next IoT project as well!

Slide 51

Slide 51

THANK YOU!