Game, Set, Match: Transforming Live Sports with AI-Driven Commentary
Dunith Danushka (Redpanda Data), Mark Needham (ClickHouse)
Slide 2
Dunith Danushka DevRel @ Redpanda
Mark Needham Product @ ClickHouse
Slide 3
Slide 4
Slide 5
Can we create an AI Copilot to help live text writers?
Slide 6
What are we going to build?
Slide 7
The flow of events
Run window queries on ClickHouse and pass the results to OpenAI
Slide 8
What is Redpanda?
Slide 9
Redpanda is a Kafka API compatible streaming data platform
● Not a Kafka fork!
● Kafka rewritten in C++
● Identical read/write interfaces as Kafka
● Designed for modern hardware
rpk
Redpanda’s command line interface (CLI) utility.
Check health of cluster: rpk cluster health
Create a topic: rpk topic create my-topic -p 5
List topics: rpk topic list
Describe a topic: rpk topic describe
Slide 13
Redpanda Demo
Slide 14
What is ClickHouse?
Slide 15
What is ClickHouse?
Open Source
● Developed since 2009
● OSS 2016
● 34k+ GitHub stars
● 1k+ contributors
● 500+ releases
Column-Oriented
● Files per column
● Vectorised query execution
● Optimised for aggregations
● Sorting and indexing
● Background merges
Distributed
● Replication
● Sharding
● Multi-master
● Cross-region
OLAP Database
● Analytics use cases
● Aggregations
● Visualizations
● Mostly immutable data
Vectorised Query Execution
● Row-at-a-time execution: process rows sequentially
● Vectorised execution: process chunks of values
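The contrast between the two execution styles can be illustrated with a small Python sketch. This is only an illustration of the idea, not how ClickHouse implements it internally; the function names and the chunk size are hypothetical.

```python
from typing import Iterator, List


def sum_row_at_a_time(rows: List[dict], column: str) -> int:
    """Row-at-a-time execution: visit one row per step and pick out a single field."""
    total = 0
    for row in rows:
        total += row[column]
    return total


def chunks(values: List[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of a column, each holding at most `size` values."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def sum_vectorised(column_values: List[int], chunk_size: int = 4) -> int:
    """Chunked execution: apply the aggregate to a whole block of values per step."""
    total = 0
    for block in chunks(column_values, chunk_size):
        total += sum(block)
    return total


# Hypothetical sample data: the same values stored as rows and as one column.
rows = [{"player": "A", "speed": s} for s in (180, 175, 190, 185, 170)]
speeds = [row["speed"] for row in rows]

# Both strategies compute the same aggregate; only the unit of work differs.
assert sum_row_at_a_time(rows, "speed") == sum_vectorised(speeds)
```

In a real engine the per-chunk step is where the benefit comes from, because a tight loop over a contiguous block of one column is friendlier to CPU caches and SIMD instructions than hopping between rows.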
Slide 18
Slide 19
Flavours of ClickHouse
● chDB
● ClickHouse Local
● ClickHouse Server
Slide 20
ClickHouse Demo
Slide 21
What is Streamlit?
Slide 22
What is Streamlit?
Streamlit turns data scripts into shareable web apps in minutes. All in pure Python. No front-end experience required.
Slide 23
Streamlit Hello World
Slide 24
Building the AI Copilot
Slide 25
The flow of events
Run window queries on ClickHouse and pass the results to OpenAI
Slide 26
Generated Commentary Demo
Slide 27
How do Large Language Models work?
Prompt
LLM
Slide 28
Ingesting context into the prompt
Prompt: Instructions + Context
Context source: Sports events feed
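A minimal sketch of how instructions and event context might be combined into one prompt string. The instruction text, the event fields, and the `build_prompt` helper are all hypothetical; the slides do not show the prompt that was actually used.

```python
import json
from typing import Dict, List

# Hypothetical instruction text, written for illustration only.
INSTRUCTIONS = (
    "You are helping a live text writer cover a match. "
    "Summarise the events below in one or two sentences."
)


def build_prompt(instructions: str, events: List[Dict]) -> str:
    """Combine fixed instructions with the latest events, one JSON object per line."""
    context = "\n".join(json.dumps(event, sort_keys=True) for event in events)
    return f"{instructions}\n\nEvents:\n{context}"


# Hypothetical events; the real feed's schema may differ.
events = [
    {"point": 1, "server": "Player A", "winner": "Player A"},
    {"point": 2, "server": "Player A", "winner": "Player B"},
]
prompt = build_prompt(INSTRUCTIONS, events)
print(prompt)
```

The resulting string would then be sent to the LLM as the user or system message. Keeping the instructions fixed and swapping only the context makes it easier to compare outputs when the prompt is tweaked.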
Slide 29
Sports events feed
Slide 30
LLM Code
Slide 31
Pulling events from ClickHouse
Slide 32
Generated text events
Slide 33
Retrieval queries: Recent points
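A hedged sketch of what a "recent points" retrieval could look like. The table name `points`, the column `publish_time`, and the 30-second window are assumptions made for illustration and are not taken from the talk. The Python function mirrors the same trailing-window filter in memory so the logic can be tried without a database.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Hypothetical ClickHouse query: table and column names are assumptions.
RECENT_POINTS_SQL = """
SELECT *
FROM points
WHERE publish_time >= now() - INTERVAL 30 SECOND
ORDER BY publish_time
"""


def recent_points(events: List[Dict], now: datetime, window_seconds: int = 30) -> List[Dict]:
    """In-memory analogue of the query: keep events in the trailing window, oldest first."""
    cutoff = now - timedelta(seconds=window_seconds)
    in_window = [e for e in events if cutoff <= e["publish_time"] <= now]
    return sorted(in_window, key=lambda e: e["publish_time"])


# Hypothetical events, timestamped relative to a fixed "now" for repeatability.
now = datetime(2024, 1, 1, 12, 0, 0)
events = [
    {"id": 1, "publish_time": now - timedelta(seconds=45)},
    {"id": 2, "publish_time": now - timedelta(seconds=20)},
    {"id": 3, "publish_time": now - timedelta(seconds=5)},
]
print([e["id"] for e in recent_points(events, now)])
```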
Slide 34
Retrieval queries: Last completed game
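One possible way to pick out the last completed game, sketched in plain Python. It assumes each point event carries an integer `game` field and that the highest game number seen so far is still in progress; both are assumptions for illustration, and the real event schema may differ.

```python
from typing import Dict, List


def last_completed_game(points: List[Dict]) -> List[Dict]:
    """Return the points of the most recent finished game.

    Assumes a hypothetical integer "game" field on every point and treats the
    highest game number as the game currently being played.
    """
    if not points:
        return []
    current_game = max(p["game"] for p in points)
    return [p for p in points if p["game"] == current_game - 1]


# Hypothetical feed: game 1 and game 2 are finished, game 3 has just started.
points = [{"game": g, "point": i} for i, g in enumerate([1, 1, 1, 1, 2, 2, 2, 2, 3])]
print(last_completed_game(points))
```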
Slide 35
Serving the live commentary
● Multiple consumers on the Redpanda topic
● FastAPI server that streams server-sent events (SSE)
● Post messages to Twitter API
Topics: points, livetext
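To show what a server-sent event looks like on the wire, here is a small standard-library sketch that formats commentary messages as `text/event-stream` frames. The helper names and the `livetext` event name are illustrative assumptions, not the code used in the talk.

```python
from typing import Iterable, Iterator, Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Encode one message in the text/event-stream wire format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of a multi-line payload needs its own "data:" field.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


def commentary_stream(messages: Iterable[str]) -> Iterator[str]:
    """Turn commentary messages (for example, consumed from a topic) into SSE frames."""
    for message in messages:
        yield format_sse(message, event="livetext")


# Hypothetical commentary lines standing in for consumed messages.
frames = list(commentary_stream(["Ace down the middle.", "Break point saved."]))
print(frames[0], end="")
```

In a FastAPI application, a generator like `commentary_stream` could be returned through `StreamingResponse` with `media_type="text/event-stream"`, with the messages coming from a consumer on the topic rather than a fixed list.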
Slide 36
Live Commentary Demo
Slide 37
Future Ideas
Slide 38
How can we extend this work?
● Automatic summaries every <x> seconds
● A Copilot that has access to fine-grained statistics
● Text to SQL so that the writer can ask questions of the data
● Can we use more batch data?
● Store the generated commentary for later analysis/tweaking of the prompt
● Compare the generated commentary with what the analyst chooses to publish
Slide 39
Is it only for sports?
We'll be focusing on sports, but you could use it for any of the following problems:
● Live auctions
● Weather updates
● Local traffic reporting
● Current location of food delivery
● …
Any use case where you have events that you want to summarise into a more readable format.
Slide 40
Thanks and Questions
github.com/mneedham/devoxx-ai-sports-commentary
www.linkedin.com/in/dunithd
dunith.medium.com
www.linkedin.com/in/markhneedham
youtube.com/@LearnDataWithMark