On Data: Sky above, sand below, peace within

A presentation at HackConf 2021 in October 2021 in Sofia, Bulgaria by Elena Georgieva

Slide 1

On Data: Sky above, sand below, peace within
Elena Georgieva (@geoellena)
October 2021

Slide 2

● How did we get here?
● Modern data paradigms
● FT’s approach

Slide 3

1988 2010 2020

Slide 4

Slide 5

Data warehouse
Photo: https://www.pexels.com/@chanaka-318741

Slide 6

Data warehouse

Slide 7

At the same time…
Source: bnr.bg

Slide 8

At the same time…

Slide 9

At the same time…

Slide 10

At the same time…

Slide 11

At the same time…
Source: CoderDojo Bulgaria

Slide 12

1988 2010 2020

Slide 13

Source: wikipedia.org

Slide 14

Source: ft.com

Slide 15

Data Lake

Slide 16

Data Lake
Rila lakes, Bulgaria
Photo: https://www.pexels.com/@bkrustev

Slide 17

Data Lake
Source: https://vdocument.in/making-sense-of-big-data.html

Slide 18

1988 2010 2020

Slide 19

● How did we get here?
● Modern data paradigms
● FT’s approach

Slide 20

Best of both worlds

Slide 21

Photo: https://www.pexels.com/@riciardus

Slide 22

Photo: https://www.pexels.com/@rakicevic-nenad-233369

Slide 23

Data Lakehouse
Breitenwang, Austria
Photo: https://www.pexels.com/@lucasallmann

Slide 24

Source: https://databricks.com/
Article: "What is a Lakehouse?" by Ben Lorica, Michael Armbrust, Ali Ghodsi, Reynold Xin and Matei Zaharia

Slide 25

Criteria (compared across Data Warehouse, Data Lake and Lakehouse):
● Schema enforcement and data governance
● Support for diverse data types
● Support for diverse workloads
● Low latency
● Transaction support
● Separated storage and compute
● Openness
● Easy access to data

Slide 26

Data Democratisation
Photos: https://www.pexels.com/@anna-nekrashevich

Slide 27

How can we achieve Data Democratisation?
● Self-service dashboarding
● Central data stores in the Cloud
● Data federation
● Data virtualization

Slide 28

Data Science
Photos: https://www.pexels.com/@thisisengineering

Slide 29

Why is it not that new as a concept?
Photos: https://www.pexels.com/@pixabay

Slide 30

Source: ft.com

Slide 31

Source: btvnovinite.bg

Slide 32

Hacker, Statistician, Domain expert
Source: Drew Conway

Slide 33

● How did we get here?
● Modern data paradigms
● FT’s approach

Slide 34

Now…

Slide 35

● 2008: The Modest start
● 2014: Warehouse in the Clouds
● 2016: Real-time data
● 2019: Warehouse with a Lake

Slide 36

The FT Data Platform 2019

Slide 37

Serving Layer

Slide 38

The challenges with Data Lake
● Slowing down data delivery
● Management overhead
● Governance and slow backfills

Slide 39

● 2008: The Modest start
● 2014: Warehouse in the Clouds
● 2016: Real-time data
● 2019: Warehouse with a Lake
● 2020+: Lakehouse

Slide 40

Slide 41

Slide 42

Slide 43

Slide 44

Slide 45

Slide 46

Slide 47

Active Projects
● Self-service
● Separating storage from compute
● MLOps and data time travel

Slide 48

Slide 49

Sky above
Photo: https://www.pexels.com/@szaboviktor

Slide 50

Sand Below
Photo: https://www.pexels.com/@negativespace

Slide 51

Peace within
Photo: https://www.pexels.com/@pixabay

Slide 52

THANK YOU
Twitter: @geoellena
bit.ly/ftcareers