Streaming ETL in Practice with Oracle, Apache Kafka, and KSQL

A presentation at KScope 19 in Seattle, WA, USA by Robin Moffatt

Have you ever thought that you needed to be a programmer to do stream processing and build streaming data pipelines? Think again!

Companies new and old are all recognizing the importance of a low-latency, scalable, fault-tolerant data backbone in the form of the Apache Kafka streaming platform. With Kafka, developers can integrate multiple sources and systems, which enables low-latency analytics, event-driven architectures, and the population of multiple downstream systems. These data pipelines can be built using configuration alone.

In this talk, we’ll see how easy it is to stream data from a database such as Oracle into Kafka using CDC and Kafka Connect. In addition, we’ll use KSQL to filter, aggregate, and join it to other data, and then stream this from Kafka out into multiple targets such as Elasticsearch and S3. All of this can be accomplished without a single line of code!

Why should Java geeks have all the fun?
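
To make "configuration alone" concrete, here is a minimal sketch of what the pipeline definitions might look like. It is not taken from the talk. The hostnames, table, column, and topic names are all hypothetical, and the source side uses the Confluent JDBC source connector (query-based polling) as a stand-in for whichever CDC tool is actually used. The Python only assembles and prints the JSON payloads and KSQL text; the pipeline itself remains configuration.

```python
import json


def oracle_source_config():
    """Kafka Connect source definition (hypothetical names throughout).

    Assumption: the JDBC source connector polls the table for new rows;
    a log-based CDC connector would use different properties.
    """
    return {
        "name": "oracle-orders-source",
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1",
            "table.whitelist": "ORDERS",
            "mode": "incrementing",
            "incrementing.column.name": "ID",
            "topic.prefix": "oracle-",
        },
    }


def ksql_statements():
    """KSQL that registers the topic as a stream and filters it.

    Assumption: values are Avro, so the schema comes from Schema Registry.
    """
    return [
        "CREATE STREAM ORDERS WITH (KAFKA_TOPIC='oracle-ORDERS', VALUE_FORMAT='AVRO');",
        "CREATE STREAM ORDERS_HIGH_VALUE AS SELECT * FROM ORDERS WHERE ORDER_TOTAL > 100;",
    ]


def elasticsearch_sink_config():
    """Kafka Connect sink definition streaming the filtered topic out."""
    return {
        "name": "orders-elasticsearch-sink",
        "config": {
            "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
            "connection.url": "http://elasticsearch:9200",
            "topics": "ORDERS_HIGH_VALUE",
            "key.ignore": "true",
            "schema.ignore": "true",
        },
    }


if __name__ == "__main__":
    # Print each definition; nothing is sent anywhere by this sketch.
    print(json.dumps(oracle_source_config(), indent=2))
    for statement in ksql_statements():
        print(statement)
    print(json.dumps(elasticsearch_sink_config(), indent=2))
```

In a real deployment, each JSON payload would typically be submitted to the Kafka Connect REST API (a POST to its `/connectors` endpoint), and the KSQL statements would be run from the KSQL CLI.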

