One Data Pipeline to Rule Them All

A presentation at PyCon in Portland, OR, USA by Sam Kitajima-Kimbrel

There are myriad data storage systems available for every use case imaginable, but letting application teams choose storage engines independently can lead to duplicated efforts and wheel reinvention. This talk will explore how to build a reusable data pipeline based on Kafka to support multiple applications, datasets, and use cases including archival, warehousing and analytics, stream and batch processing, and low-latency "hot" storage.
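The core idea of a log-centric pipeline like this is that one append-only stream feeds many independent consumers, each serving a different use case. The sketch below is a minimal, hypothetical in-memory model of that pattern (it is not Kafka itself): a `Log` class stands in for a topic, and three toy "sinks" stand in for archival, analytics, and hot storage, each reading the same records from its own offset.

```python
class Log:
    """A minimal in-memory stand-in for a Kafka topic: an append-only
    log that many consumers read independently via their own offsets."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def read(self, offset=0):
        # Consumers track their own position; reading never mutates the log.
        return self.records[offset:]


# Hypothetical downstream "sinks" standing in for archival storage,
# a warehouse aggregate, and a low-latency key-value hot store.
archive = []
analytics = {}
hot_store = {}

log = Log()
for event in [{"user": "a", "clicks": 1},
              {"user": "b", "clicks": 2},
              {"user": "a", "clicks": 3}]:
    log.append(event)

# Each consumer reads the same shared log without interfering with the
# others -- the core property that lets one pipeline serve many apps.
for rec in log.read():
    archive.append(rec)                                   # archival: keep everything
for rec in log.read():
    analytics[rec["user"]] = analytics.get(rec["user"], 0) + rec["clicks"]  # aggregate
for rec in log.read():
    hot_store[rec["user"]] = rec                          # hot store: latest per key

print(len(archive))      # 3
print(analytics)         # {'a': 4, 'b': 2}
print(hot_store["a"])    # {'user': 'a', 'clicks': 3}
```

In a real deployment each sink would be a separate Kafka consumer group with its own committed offsets, which is what lets new use cases attach to the stream without disturbing existing ones.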
