Building an End-to-End Analytics Pipeline with PyFlink

A presentation at Data Science UA by Marta Paes

Stream processing has fundamentally changed the way we build and think about data pipelines — but the technologies that unlock its value haven’t always been friendly to non-Java/Scala developers.

Flink has recently introduced PyFlink, allowing developers to tap into streaming data in real time with the flexibility of Python and its wide ecosystem for data analytics and machine learning. In this talk, we’ll explore the basics of PyFlink and showcase how developers can use familiar tools like interactive notebooks to unleash the full power of an advanced stream processor like Flink.