Game, Set, Match: Transforming Live Sports with AI-Driven Commentary

A presentation at Devoxx UK in May 2024 in London, UK by Mark Needham

Slide 1

Game, Set, Match: Transforming Live Sports with AI-Driven Commentary
Dunith Danushka (Redpanda Data)
Mark Needham (ClickHouse)

Slide 2

Dunith Danushka, DevRel @ Redpanda
Mark Needham, Product @ ClickHouse

Slide 3

Slide 4

Slide 5

Can we create an AI Copilot to help live text writers?

Slide 6

What are we going to build?

Slide 7

The flow of events
Run window queries on ClickHouse and pass the results to OpenAI

Slide 8

What is Redpanda?

Slide 9

Redpanda is a Kafka API compatible streaming data platform
● Not a Kafka fork!
● Kafka rewritten in C++
● Identical read/write interfaces as Kafka
● Designed for modern hardware

Slide 10

Simple to deploy, use and manage
Single binary
Kafka-compatible APIs
Easy Day 2 Ops
Dev-friendly interface

Slide 11

Slide 12

rpk
Redpanda’s command line interface (CLI) utility.
Check health of cluster
    rpk cluster health
Create a topic
    rpk topic create my-topic -p 5
List topics
    rpk topic list
Describe a topic
    rpk topic describe

Slide 13

Redpanda Demo

Slide 14

What is ClickHouse?

Slide 15

What is ClickHouse?
Open Source
● Developed since 2009
● OSS 2016
● 34k+ GitHub stars
● 1k+ contributors
● 500+ releases
Column-Oriented
● Files per column
● Vectorized query execution
● Optimised for aggregations
● Sorting and indexing
● Background merges
Distributed
● Replication
● Sharding
● Multi-master
● Cross-region
OLAP Database
● Analytics use cases
● Aggregations
● Visualizations
● Mostly immutable data

Slide 16

Row Oriented vs Column Oriented

Row Oriented
location     ts                   temperature  wind_speed  humidity
Aberystwyth  2022-01-01 00:00:00  14           21          79
Blackpool    2022-01-01 00:20:00  13           9           82

Column Oriented
location     ts                   temperature  wind_speed  humidity
Aberystwyth  2022-01-01 00:00:00  14           21          79
Blackpool    2022-01-01 00:20:00  13           9           82
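
The two layouts on this slide hold the same data; what differs is how it is grouped in storage. A minimal in-memory sketch, assuming plain Python lists and dicts stand in for storage (the `to_columns` helper is hypothetical, and ClickHouse's real on-disk format is far more involved):

```python
# The two weather readings from the slide, stored row by row.
rows = [
    {"location": "Aberystwyth", "ts": "2022-01-01 00:00:00",
     "temperature": 14, "wind_speed": 21, "humidity": 79},
    {"location": "Blackpool", "ts": "2022-01-01 00:20:00",
     "temperature": 13, "wind_speed": 9, "humidity": 82},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole record is visited just to read one field.
avg_from_rows = sum(r["temperature"] for r in rows) / len(rows)

# Column layout: only the temperature values are touched.
temperatures = columns["temperature"]
avg_from_columns = sum(temperatures) / len(temperatures)
```

An aggregation over one field only needs that field's values, which is why the column layout suits analytical queries.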

Slide 17

Vectorised Query Execution
Process rows sequentially
Process chunks of values
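
The contrast between the two processing styles can be sketched in a few lines. This is only an illustration of the shape of the work, with made-up values and a hypothetical `chunk_size`; a real engine applies optimised, often SIMD, operations to each chunk, which plain Python does not.

```python
def sum_row_at_a_time(values):
    """Handle one value per loop iteration."""
    total = 0
    for value in values:
        total += value
    return total


def sum_in_chunks(values, chunk_size=4):
    """Handle a whole chunk of values per loop iteration."""
    total = 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        total += sum(chunk)  # one operation covers the entire chunk
    return total


# Illustrative wind speed readings (not from the talk).
wind_speeds = [21, 9, 15, 30, 12, 7, 18, 25, 11]
row_total = sum_row_at_a_time(wind_speeds)
chunk_total = sum_in_chunks(wind_speeds)
```

Both functions return the same total; the chunked version simply runs fewer, larger steps.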

Slide 18

Slide 19

Flavours of ClickHouse
chdb
ClickHouse Local
ClickHouse Server

Slide 20

ClickHouse Demo

Slide 21

What is Streamlit?

Slide 22

What is Streamlit?
Streamlit turns data scripts into shareable web apps in minutes. All in pure Python. No front-end experience required.

Slide 23

Streamlit Hello World

Slide 24

Building the AI Copilot

Slide 25

The flow of events
Run window queries on ClickHouse and pass the results to OpenAI

Slide 26

Generated Commentary Demo

Slide 27

How do Large Language Models work?
Diagram labels: Prompt, LLM

Slide 28

Ingesting context into the prompt
Diagram labels: Instructions, Prompt, Context, Sports events feed
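
One way to picture how the pieces on this slide fit together is a function that joins fixed instructions with the latest events. This is a sketch under assumptions, not the code from the talk: the instruction wording, the event fields (`point_winner`, `score`) and the `build_prompt` helper are all hypothetical.

```python
import json

# Hypothetical instructions; the talk's actual wording is not in the transcript.
INSTRUCTIONS = (
    "You are helping a live text writer cover a tennis match. "
    "Write one short line of commentary based on the events below."
)


def build_prompt(instructions, events):
    """Combine fixed instructions with recent events serialised as context."""
    context = "\n".join(json.dumps(event, sort_keys=True) for event in events)
    return f"{instructions}\n\nContext (most recent events):\n{context}"


# Hypothetical events, shaped as they might arrive from the sports events feed.
events = [
    {"point_winner": "Player A", "score": "15-0"},
    {"point_winner": "Player B", "score": "15-15"},
]

prompt = build_prompt(INSTRUCTIONS, events)
```

The resulting string is what would be sent to the LLM; only the context part changes as new events arrive.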

Slide 29

Sports events feed

Slide 30

LLM Code

Slide 31

Pulling events from ClickHouse

Slide 32

Generated text events

Slide 33

Retrieval queries: Recent points

Slide 34

Retrieval queries: Last completed game
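
The queries themselves are not captured in this transcript. As a rough in-memory stand-in for the two retrieval ideas named on these slides, assuming a hypothetical point schema (`id`, `game`, `winner`, `game_won_by`) rather than the talk's actual ClickHouse tables:

```python
# Hypothetical point events; game_won_by is set only on the point that ends a game.
points = [
    {"id": 1, "game": 1, "winner": "A", "game_won_by": None},
    {"id": 2, "game": 1, "winner": "A", "game_won_by": None},
    {"id": 3, "game": 1, "winner": "A", "game_won_by": None},
    {"id": 4, "game": 1, "winner": "A", "game_won_by": "A"},
    {"id": 5, "game": 2, "winner": "B", "game_won_by": None},
    {"id": 6, "game": 2, "winner": "A", "game_won_by": None},
]


def recent_points(points, n=3):
    """Return the n most recent points, newest first."""
    return sorted(points, key=lambda p: p["id"], reverse=True)[:n]


def last_completed_game(points):
    """Return every point of the latest game that has a recorded winner."""
    finished = [p["game"] for p in points if p["game_won_by"] is not None]
    if not finished:
        return []  # no game has finished yet
    latest = max(finished)
    return [p for p in points if p["game"] == latest]


recent_ids = [p["id"] for p in recent_points(points)]
last_game_ids = [p["id"] for p in last_completed_game(points)]
```

In this sample data game 2 is still in progress, so the last completed game is game 1 while the recent points come from both games.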

Slide 35

Serving the live commentary
Multiple consumers on the Redpanda topic
FastAPI server that serves Server-Sent Events (SSE)
Post messages to Twitter API
Diagram labels: points, livetext
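
Server-Sent Events use a simple text format: optional `event:` and one or more `data:` lines, ended by a blank line. A minimal sketch of encoding one commentary message, independent of any web framework; the `format_sse` helper, the payload shape and the use of `livetext` as an event name are assumptions for illustration:

```python
import json


def format_sse(data, event=None):
    """Encode one message in the text/event-stream wire format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # A payload containing newlines is split across several data: lines.
    for line in data.splitlines() or [""]:
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"  # a blank line terminates the message


# Hypothetical commentary payload.
message = format_sse(json.dumps({"text": "Ace down the middle!"}), event="livetext")
```

A server endpoint would stream strings like `message` with the `text/event-stream` content type, one per consumed commentary event.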

Slide 36

Live Commentary Demo

Slide 37

Future Ideas

Slide 38

How can we extend this work?
● Automatic summaries every <x> seconds
● A Copilot that has access to fine-grained statistics
● Text to SQL so that the writer can ask questions of the data
● Can we use more batch data?
● Store the generated commentary for later analysis/tweaking of the prompt
● Compare the generated commentary with what the analyst chooses to publish

Slide 39

Is it only for sports?
We'll be focusing on sports, but you could use it for any of the following problems:
● Live auctions
● Weather updates
● Local traffic reporting
● Current location of food delivery
● …
Any use case where you have events that you want to summarise into a more readable format.

Slide 40

Thanks and Questions
github.com/mneedham/devoxx-ai-sports-commentary
www.linkedin.com/in/dunithd
dunith.medium.com
www.linkedin.com/in/markhneedham
youtube.com/@LearnDataWithMark