Rediscover the known Universe with NASA dataset
Pierre Zemb, Aurélien Hébert, Horacio Gonzalez
Slide 2
Pierre Zemb
@PierreZ Infrastructure Engineer working on Metrics / Kubernetes
Slide 3
Aurélien Hébert
@AurrelH95 Software Engineer and data lover 😍
Slide 4
Horacio Gonzalez 
@LostInBrittany Spaniard lost in Brittany, developer, dreamer and all-around geek
Slide 5
HelloExoWorld
Looking for exoplanets in NASA datasets
Slide 6
Once upon a time...
HelloExoWorld
Slide 7
What not to do if you love astronomy
To live in Brest
Slide 8
Looking for solutions
Mixing passions
Slide 9
Google is your friend...
Let's find a project
Slide 10
Exoplanets?
Planets orbiting stars far away
Slide 11
How do we find them?
The transit method seems the best
Slide 12
Exoplanet detection
From theory to practice
Slide 13
The transit method
Credits: NASA’s Goddard Space Flight Center
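To make the transit method concrete, here is a minimal sketch on synthetic data. When a planet crosses its star's disk, the star dims by roughly the square of the planet-to-star radius ratio. The function names, the box-shaped dip and the threshold rule are illustrative assumptions; real Kepler light curves are noisy and need far more careful processing.

```python
import statistics


def transit_depth(planet_radius, star_radius):
    """Fractional dip in brightness while the planet crosses the star's disk."""
    return (planet_radius / star_radius) ** 2


def synthetic_light_curve(n_points, period, duration, depth):
    """Normalized flux of 1.0 with a box-shaped dip every `period` samples."""
    return [1.0 - depth if (i % period) < duration else 1.0
            for i in range(n_points)]


def find_dips(flux, threshold):
    """Indices where the flux falls more than `threshold` below the median."""
    baseline = statistics.median(flux)
    return [i for i, f in enumerate(flux) if baseline - f > threshold]


# A Jupiter-sized planet around a Sun-sized star: radius ratio of roughly 0.1,
# so the star dims by about 1% during each transit.
depth = transit_depth(0.1, 1.0)
flux = synthetic_light_curve(n_points=100, period=25, duration=3, depth=depth)
dips = find_dips(flux, threshold=depth / 2)
```

The regular spacing of the detected dips (every 25 samples here) is what hints at an orbiting body rather than a one-off event.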
Slide 14
How do we look for transits?
Image credits: NASA Kepler
Image credits: NASA TESS
Slide 15
Slide 16
Watching the sky
By Carter Roberts [Public domain], via Wikimedia Commons
Slide 17
Kepler image
A star: 12×12 px
Slide 18
And what kind of data do we get?
Pleiades, by NASA, ESA, AURA/Caltech, Palomar Observatory, via Wikimedia Commons
Slide 19
Well, that's the problem
Seven stars, seven different profiles
Slide 20
Kinda big data
Over 40 million light curves
Slide 21
Big AND open data
Lots of datasets in #opendata
Slide 22
And we can help with that!
Let's use our tools to analyse the data
Slide 23
Time Series
To analyse Kepler datasets
Slide 24
Kepler: spatial Time Series
Definition of Time Series:
A series of data points indexed in time order
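As a minimal illustration of this definition, a time series can be modelled as a list of (timestamp, value) pairs kept sorted by timestamp. The readings below are invented for the example and are not Kepler measurements.

```python
from datetime import datetime, timezone

# Hypothetical brightness readings, deliberately given out of order.
readings = [
    (datetime(2009, 5, 13, 0, 30, tzinfo=timezone.utc), 1.0002),
    (datetime(2009, 5, 13, 0, 0, tzinfo=timezone.utc), 1.0000),
    (datetime(2009, 5, 13, 1, 0, tzinfo=timezone.utc), 0.9991),
]

# "Indexed in time order" simply means the points are sorted by timestamp.
series = sorted(readings, key=lambda point: point[0])
timestamps = [ts for ts, _ in series]
values = [value for _, value in series]
```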
Slide 25
Time Series
● Stock Market Analysis
● Economic Forecasting
● Budgetary Analysis
● Process and Quality Control
● Workload Projections
● Census Analysis
● ...
Slide 26
Time Series
Applications:
● Understanding the data
● Fit a model
  ○ Monitoring
  ○ Forecasting
Slide 27
Time Series Stock market Analytics Economic Forecasting
$$$
Study & Research
Slide 28
Time Series
Many specific analytical tools:
● Moving average
● ARMA (AutoRegressive Moving Average)
● Multivariate ARMA models
● ARCH (AutoRegressive Conditional Heteroscedasticity)
● Dynamic time warping
● ...
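The moving average, the first tool named on this slide, is simple enough to sketch in a few lines. This is a plain-Python illustration with a hypothetical `moving_average` helper, not the implementation used by any particular platform.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    if window <= 0:
        raise ValueError("window must be positive")
    if window > len(values):
        return []  # not enough points to fill a single window

    # Keep a running sum so each step costs O(1) instead of O(window).
    running_sum = sum(values[:window])
    averages = [running_sum / window]
    for i in range(window, len(values)):
        running_sum += values[i] - values[i - window]
        averages.append(running_sum / window)
    return averages


smoothed = moving_average([1, 2, 3, 4, 5], window=3)
```

Smoothing a light curve this way reduces point-to-point noise, at the cost of blurring short events, so the window size is a trade-off.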
Slide 29
Time Series
Specific application of general tools
● Artificial neural networks
● Hidden Markov model
● Fourier & wavelet transforms
● Entropy encoding
● ...
Slide 30
Dealing with Time Series
The 3 'v'
Slide 31
Monitoring OVH with Time Series
Slide 32
OVH Metrics
A metrics data platform
Slide 33
Tools to deal with Time Series
Many options
Slide 34
Metrics Data Platform
Slide 35
Metrics’ metrics
● 1.5M datapoints/s, 24/7
● Peaks at ~10M datapoints/s
● 500M unique series
Slide 36
Metrics Data Platform
Slide 37
Why Warp 10?
Warp 10 is a software platform that
● Ingests and stores time series
● Manipulates and analyzes time series
Slide 38
Analytics is the key to success
Fetching data is only the tip of the iceberg
Slide 39
Manipulating Time Series with Warp 10
A true Time Series analysis toolbox
  ○ Hundreds of functions
  ○ Manipulation frameworks
  ○ Analysis workflow
Slide 40
Anatomy of a time series
Each time series is composed of:
● Metadata
  ○ Class name
  ○ Labels
● Datapoints
  ○ Timestamp
  ○ Value
org.nasa.kepler.starlight { keplerId: 52163778 }
Slide 41
Class names and labels
● Class names define the kind of measure
  ○ Starlight, heart rate, speed…
● Labels define particular traits of a TS
  ○ Device Id, Device Type, Private User Id...
org.nasa.kepler.starlight { keplerId: 52163778 }
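The anatomy described on these slides (a class name and labels as metadata, plus timestamped datapoints) can be sketched as a small data structure. The `TimeSeries` class below is a hypothetical illustration, not the Warp 10 API, and the datapoints are invented.

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeries:
    class_name: str                                  # the kind of measure
    labels: dict                                     # traits of this series
    datapoints: list = field(default_factory=list)   # (timestamp, value) pairs

    def add(self, timestamp, value):
        """Append one datapoint to the series."""
        self.datapoints.append((timestamp, value))

    def describe(self):
        """Render the metadata in the notation used on the slides."""
        labels = ", ".join(f"{k}: {v}" for k, v in sorted(self.labels.items()))
        return f"{self.class_name} {{ {labels} }}"


star = TimeSeries("org.nasa.kepler.starlight", {"keplerId": 52163778})
star.add(0, 1.0000)   # illustrative timestamp and flux, not real data
star.add(1, 0.9991)
summary = star.describe()
```

Keeping the metadata separate from the datapoints is what lets a platform select series by class name or label without scanning the values themselves.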
Slide 42
A match made in heaven
Warp 10, OVH Metrics and HelloExoWorld
Slide 43
What we have done
● Downloaded and parsed 40 million FITS files
● Pushed them to OVH Metrics
● Selected a cool subset as a training set
● Verified we could find the same planets as NASA
Slide 44
From Kepler-11 raw data
Slide 45
To (candidate) exoplanets
Slide 46
Your job
Slide 47
Let's get started!
1. Connect to https://bit.ly/2H7Z5b3, or connect to Wi-Fi HEW-5G (or HEW)
2. Password is helloexoworld
3. Click Cancel on the user password window
4. Open Chrome/Chromium on 192.168.1.2
Reach step 3.2 and enjoy!
Slide 48
What's next?
Where do we go from here?
Slide 49
Only the beginning
Better detection
New import method
Explorer
Deep learning
Satellite/star location
Yours?
Slide 50
A growing team
Slide 51
And you!
Join us!
https://helloexo.world
https://xkcd.com/1371/
Slide 52
OVH Metrics
Come speak with us about your monitoring and Kubernetes projects!