Rediscover the known Universe with NASA dataset

A presentation at SnowCamp 2019 in January 2019 in Grenoble, France by Horacio Gonzalez

Slide 1

Rediscover the known Universe with NASA dataset
Horacio Gonzalez @LostInBrittany
Emmanuel Feller @moyowi

Slide 2

Horacio Gonzalez @LostInBrittany
Spaniard lost in Brittany, developer, dreamer and all-around geek
#HelloExoWorld @LostInBrittany @moyowi

Slide 3

Emmanuel Feller @moyowi
Passionate developer

Slide 4

Pierre & Aurélien we ♥ you

Slide 5

HelloExoWorld
Looking for exoplanets in NASA datasets

Slide 6

HelloExoWorld
Once upon a time…

Slide 7

What not to do if you love astronomy
To live in Brest

Slide 8

Looking for solutions
Computer stuff
Astronomy
Mixing passions

Slide 9

Google is your friend…
Let’s find a project

Slide 10

Exoplanets?
Planets orbiting stars far away

Slide 11

How do we find them?
The transit method seems to be the best

Slide 12

Exoplanet detection
From theory to practice

Slide 13

The transit method
Credits: NASA’s Goddard Space Flight Center
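
The transit method looks for small, periodic dips in a star's brightness as a planet passes in front of it. The following is a minimal sketch of that idea on synthetic data, not the detection code used by HelloExoWorld: the function name `find_transit_dips`, the 0.5% depth threshold and the fake light curve are all assumptions made for illustration.

```python
from statistics import median


def find_transit_dips(flux, depth=0.005):
    """Return (start, end) index ranges where flux falls below the baseline.

    The baseline is the median flux; a sample belongs to a dip when it is
    more than `depth` (as a fraction) below that baseline.
    """
    baseline = median(flux)
    threshold = baseline * (1.0 - depth)
    dips = []
    start = None
    for i, value in enumerate(flux):
        if value < threshold and start is None:
            start = i                      # a dip begins
        elif value >= threshold and start is not None:
            dips.append((start, i - 1))    # the dip ended on the previous sample
            start = None
    if start is not None:                  # dip still open at the end of the data
        dips.append((start, len(flux) - 1))
    return dips


# Synthetic light curve (assumption): flat brightness with a 1% dip,
# three samples long, repeating every 50 samples.
flux = [1.0] * 200
for t0 in (20, 70, 120, 170):
    for i in range(t0, t0 + 3):
        flux[i] = 0.99

dips = find_transit_dips(flux)
# Spacing between consecutive dip starts: a constant value hints at an orbit.
periods = [b[0] - a[0] for a, b in zip(dips, dips[1:])]
```

Real Kepler light curves are noisy and drift over time, so a fixed threshold against a global median would need detrending first; the sketch only shows the shape of the problem.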

Slide 14

How do we look for transits?
Kepler and TESS
Image credits: NASA

Slide 16

Watching the sky
By Carter Roberts [Public domain], via Wikimedia Commons

Slide 17

Kepler image
A star: 12 × 12 px

Slide 18

And what kind of data do we get?
Pleiades. By NASA, ESA, AURA/Caltech, Palomar Observatory, via Wikimedia Commons

Slide 19

Well, that’s the problem
Seven stars, seven different profiles

Slide 20

Kinda big data
Over 40 million light curves

Slide 21

Big AND open data
Lots of datasets in #opendata

Slide 22

And we can help with that!
Let’s use our tools to analyse the data

Slide 23

Time Series
To analyse Kepler datasets

Slide 24

Kepler: spatial Time Series
Definition of Time Series: A series of data points indexed in time order

Slide 25

Time Series
● Stock Market Analysis
● Economic Forecasting
● Budgetary Analysis
● Process and Quality Control
● Workload Projections
● Census Analysis
● …

Slide 26

Time Series
Applications:
▪ Understanding the data
▪ Fit a model
  – Monitoring
  – Forecasting

Slide 27

Time Series
Stock market Analytics
Economic Forecasting
$$$
Study & Research

Slide 28

Time Series
Many specific analytical tools:
● Moving average
● ARMA (AutoRegressive Moving Average)
● Multivariate ARMA models
● ARCH (AutoRegressive Conditional Heteroscedasticity)
● Dynamic time warping
● …
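
As an illustration of the simplest tool in that list, here is a small moving-average sketch in plain Python. The function name `moving_average` and the sample values are assumptions for this example; it shows the technique in general, not how Warp 10 or OVH Metrics implement it.

```python
def moving_average(values, window):
    """Simple (unweighted) moving average over a sliding window.

    Returns one averaged value per full window, so the result has
    len(values) - window + 1 entries.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])          # sum of the first window
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the new sample, drop the oldest one.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


smoothed = moving_average([1, 2, 3, 4, 5], window=3)
```

On a light curve, this kind of smoothing reduces sample-to-sample noise, at the cost of flattening dips that are shorter than the window.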

Slide 29

Time Series
Specific application of general tools
● Artificial neural networks
● Hidden Markov model
● Fourier & Wavelets transforms
● Entropy encoding
● …
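
To make the Fourier entry concrete, the sketch below uses a naive discrete Fourier transform to estimate the dominant period of an evenly sampled series. Everything here is an assumption for illustration: the function name `dominant_period`, the pure sine test signal (a stand-in for a periodic brightness variation), and the O(n²) implementation, which is fine for a demo but far too slow for real light curves.

```python
import cmath
import math


def dominant_period(values):
    """Estimate the dominant period (in samples) of an evenly sampled series.

    Computes the power of each DFT frequency bin k = 1..n/2 on the
    mean-centered series and returns n / k for the strongest bin.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]   # remove the constant component
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Test signal (assumption): a sine wave repeating every 20 samples.
signal = [math.sin(2 * math.pi * t / 20) for t in range(200)]
period = dominant_period(signal)
```

Transit dips are box-shaped rather than sinusoidal, so their spectrum spreads over many harmonics; this is one reason dedicated transit searches often rely on other techniques.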

Slide 30

Dealing with Time Series
The 3 ‘v’

Slide 31

Monitoring OVH with Time Series

Slide 32

OVH Metrics
A metrics data platform

Slide 33

Tools to deal with Time Series
Many options

Slide 34

Metrics Data Platform

Slide 35

Metrics’ metrics
● 1.5M datapoints/s, 24/7
● Peaks at ~10M datapoints/s
● 500M unique series

Slide 36

Metrics Data Platform

Slide 37

Why Warp 10?
Warp 10 is a software platform that
● Ingests and stores time series
● Manipulates and analyzes time series

Slide 38

Analytics is the key to success
Fetching data is only the tip of the iceberg

Slide 39

Manipulating Time Series with Warp 10
A true Time Series analysis toolbox
● Hundreds of functions
● Manipulation frameworks
● Analysis workflow

Slide 40

Anatomy of a time series
Each time series is composed of:
org.nasa.kepler.starlight { keplerId: 52163778 }
Metadata
▪ Class name
▪ Labels
Datapoints
▪ Timestamp
▪ Value

Slide 41

Class names and labels
Class names define the kind of measure
▪ Starlight, heart rate, speed…
org.nasa.kepler.starlight { keplerId: 52163778 }
Labels define particular traits of a TS
▪ Device Id, Device Type, Private User Id…
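
The anatomy described on the last two slides (class name and labels as metadata, timestamp/value pairs as datapoints) can be modelled roughly as follows. This is an illustrative Python sketch, not Warp 10's internal representation: the `TimeSeries` class, its methods, and the timestamps and brightness values are hypothetical; only the class name and label come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeries:
    """Metadata (class name + labels) plus time-ordered datapoints."""
    class_name: str
    labels: dict = field(default_factory=dict)
    datapoints: list = field(default_factory=list)  # (timestamp, value) pairs

    def add(self, timestamp, value):
        """Insert a datapoint and keep the series in time order."""
        self.datapoints.append((timestamp, value))
        self.datapoints.sort(key=lambda point: point[0])

    def matches(self, class_name, **labels):
        """True if the class name and every requested label value match."""
        return self.class_name == class_name and all(
            self.labels.get(key) == expected for key, expected in labels.items()
        )


series = TimeSeries("org.nasa.kepler.starlight", {"keplerId": "52163778"})
series.add(2000, 0.99)   # hypothetical timestamps and brightness values
series.add(1000, 1.00)

is_kepler_star = series.matches("org.nasa.kepler.starlight", keplerId="52163778")
```

Keeping the label set separate from the datapoints is what lets a query pick out "all starlight series" or "the series for one star" without touching the values.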

Slide 42

A match made in heaven
Warp 10, OVH Metrics and HelloExoWorld

Slide 43

What we have done
● Downloaded and parsed 40 million FITS files
● Pushed them to OVH Metrics
● Selected a cool subset as a training set
● Verified we could find the same planets as NASA
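
For the "pushed to OVH Metrics" step, each parsed measurement has to be serialized before upload. The sketch below formats one datapoint as a line in Warp 10's text input format as I understand it (`TIMESTAMP/LAT:LON/ELEV CLASS{LABELS} VALUE`, with the location fields left empty). The helper name `to_input_line`, the timestamp and the flux value are assumptions, and the sketch handles numeric values only and stops before the actual HTTP upload.

```python
from urllib.parse import quote


def to_input_line(timestamp_us, class_name, labels, value):
    """Format one numeric datapoint as a Warp 10 style input line.

    Assumes timestamps in microseconds and no geo information, so the
    latitude/longitude and elevation fields are left empty ("//").
    Label names and values are percent-encoded to keep the line parseable.
    """
    label_text = ",".join(
        f"{quote(str(key), safe='')}={quote(str(val), safe='')}"
        for key, val in sorted(labels.items())
    )
    return f"{timestamp_us}// {class_name}{{{label_text}}} {value}"


# Hypothetical datapoint for the star shown on the earlier slides.
line = to_input_line(
    1238630400000000,
    "org.nasa.kepler.starlight",
    {"keplerId": "52163778"},
    10523.5,
)
```

In practice many such lines would be batched into a single request body rather than sent one at a time.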

Slide 44

From Kepler-11 raw data

Slide 45

To (candidate) exoplanets

Slide 46

Your job

Slide 47

Let’s get started!

  1. Connect to https://bit.ly/2H7Z5b3
  2. Enjoy!

Slide 48

What’s next?
Where do we go from here?

Slide 49

Only the beginning
Better detection
New import method
Explorer
Deep learning
satellite/star location
Yours?

Slide 50

A growing team

Slide 51

And you! Join us!
https://helloexo.world
https://xkcd.com/1371/

Slide 52

OVH Platform
Come speak with us about your Kubernetes, time-series or observability projects and OVH Platform

Slide 53

Thank you, dear sponsors!

Slide 54

Thank you!