A presentation at SnowCamp 2019 in Grenoble, France by Horacio Gonzalez
Rediscover the known Universe with NASA dataset Horacio Gonzalez @LostInBrittany Emmanuel Feller @moyowi
Horacio Gonzalez @LostInBrittany Spaniard lost in Brittany, developer, dreamer and all-around geek
Emmanuel Feller @moyowi Passionate developer
HelloExoWorld Looking for exoplanets in NASA datasets
HelloExoWorld Once upon a time…
What not to do if you love astronomy To live in Brest
Looking for solutions Computer stuff Astronomy Mixing passions
Google is your friend… Let's find a project
Exoplanets? Planets orbiting stars far away
How do we find them? The transit method seems the best
Exoplanets detection From theory to practice
The transit method Credits: NASA's Goddard Space Flight Center
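As a rough illustration of the idea behind the transit method: when a planet passes in front of its star, the measured brightness drops by about the ratio of the two disc areas, (planet radius / star radius)². A Jupiter-sized planet crossing a Sun-sized star gives a dip of roughly 1%. The sketch below is not the code used in the project; the function names, the synthetic light curve and the threshold are all assumptions made for the example.

```python
from statistics import median


def transit_depth(planet_radius: float, star_radius: float) -> float:
    """Fractional brightness drop while the planet crosses the stellar disc."""
    return (planet_radius / star_radius) ** 2


def find_dips(flux: list[float], threshold: float) -> list[int]:
    """Indices where the flux falls below the median by more than `threshold`
    (expressed as a fraction of the median)."""
    baseline = median(flux)
    return [i for i, f in enumerate(flux) if (baseline - f) / baseline > threshold]


# Synthetic light curve (made-up data): flat brightness with a 1% dip
# lasting five samples, as a planet with a tenth of the star's radius would cause.
flux = [1.0] * 100
for i in range(40, 45):
    flux[i] = 1.0 - transit_depth(planet_radius=1.0, star_radius=10.0)

dips = find_dips(flux, threshold=0.005)
print(dips)
```

Real light curves are noisy and drift over time, so a fixed threshold on the raw signal is only a starting point.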
How do we look for transits? Kepler TESS Image credits: NASA
Watching the sky By Carter Roberts [Public domain], via Wikimedia Commons
Kepler image A star : 12*12px
And what kind of data do we get? Pleiades By NASA, ESA, AURA/Caltech, Palomar Observatory. Via Wikimedia Commons
Well, that's the problem Seven stars, seven different profiles
Kinda big data Over 40 million light curves
Big AND open data Lots of datasets in #opendata
And we can help with that! Let's use our tools to analyse the data
Time Series To analyse Kepler datasets
Kepler: spatial Time Series Definition of Time Series: A series of data points indexed in time order
Time Series ● Stock Market Analysis ● Economic Forecasting ● Budgetary Analysis ● Process and Quality Control ● Workload Projections ● Census Analysis ● …
Time Series Applications: ▪ Understanding the data ▪ Fit a model – Monitoring – Forecasting
Time Series Stock market Analytics Economic Forecasting $$$ Study & Research
Time Series Many specific analytical tools: ● Moving average ● ARMA (AutoRegressive Moving Average) ● Multivariate ARMA models ● ARCH (AutoRegressive Conditional Heteroscedasticity) ● Dynamic time warping ● …
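To make the first tool in that list concrete, here is a minimal sketch of a simple moving average, which smooths a series by replacing each point with the mean of the last few values. The function name and the sample data are assumptions for illustration; a platform such as Warp 10 provides its own functions for this.

```python
def moving_average(values: list[float], window: int) -> list[float]:
    """Simple moving average over a sliding window of `window` points.

    The result has len(values) - window + 1 entries: one per full window.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the point that just left the window.
            running_sum -= values[i - window]
        if i >= window - 1:
            averages.append(running_sum / window)
    return averages


smoothed = moving_average([1, 2, 3, 4, 5], window=3)
print(smoothed)
```

On a light curve, smoothing like this reduces point-to-point noise so that a sustained dip stands out more clearly.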
Time Series Specific application of general tools ● Artificial neural networks ● Hidden Markov model ● Fourier & Wavelets transforms ● Entropy encoding ● …
Dealing with Time Series The 3 'v'
Monitoring OVH with Time Series
OVH Metrics A metrics data platform
Tools to deal with Time Series Many options
Metrics Data Platform
Metrics' metrics ● 1.5M datapoints/s, 24/7 ● Peaks at ~10M datapoints/s ● 500M unique series
Why Warp 10? Warp 10 is a software platform that ● Ingests and stores time series ● Manipulates and analyzes time series
Analytics is the key to success Fetching data is only the tip of the iceberg
Manipulating Time Series with Warp 10 A true Time Series analysis toolbox ● Hundreds of functions ● Manipulation frameworks ● Analysis workflow
Anatomy of a time series Each time series is composed of: Metadata ▪ Class name ▪ Labels Datapoints ▪ Timestamp ▪ Value Example: org.nasa.kepler.starlight { keplerId: 52163778 }
Class names and labels Class names define the kind of measure ▪ Starlight, heart rate, speed… Labels define particular traits of a TS ▪ Device Id, Device Type, Private User Id… Example: org.nasa.kepler.starlight { keplerId: 52163778 }
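The structure described on these two slides (a class name, a set of labels, and a list of timestamped values) can be sketched as a small data type. This is an illustrative model only, not Warp 10's internal representation. The `to_input_lines` method assumes the `TS// class{label=value} VALUE` text layout of the Warp 10 input format with timestamps in microseconds; the datapoint values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeries:
    class_name: str                      # kind of measure, e.g. starlight
    labels: dict[str, str]               # traits identifying this particular series
    datapoints: list[tuple[int, float]] = field(default_factory=list)  # (timestamp, value)

    def selector(self) -> str:
        """Class name followed by its labels, e.g. name{key=value}."""
        labels = ",".join(f"{k}={v}" for k, v in sorted(self.labels.items()))
        return f"{self.class_name}{{{labels}}}"

    def to_input_lines(self) -> list[str]:
        """One text line per datapoint (assumed layout, no location or elevation)."""
        return [f"{ts}// {self.selector()} {value}" for ts, value in self.datapoints]


series = TimeSeries("org.nasa.kepler.starlight", {"keplerId": "52163778"})
series.datapoints.append((1230768000000000, 1.0))   # made-up sample point
series.datapoints.append((1230768060000000, 0.99))  # made-up sample point

for line in series.to_input_lines():
    print(line)
```

Keeping the star identifier in a label rather than in the class name means one class, `org.nasa.kepler.starlight`, covers every star, and a single series is selected by filtering on `keplerId`.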
A match made in heaven Warp 10, OVH Metrics and HelloExoWorld
What we have done ● Downloaded and parsed 40 million FITS files ● Pushed them to OVH Metrics ● Selected a cool subset as a training set ● Verified we could find the same planets as NASA
From kepler-11 raw data
To (candidate) exoplanets
Your job
What's next? Where do we go from here?
Only the beginning ● Better detection ● New import method ● Explorer ● Deep learning ● Satellite/star location ● Yours?
A growing team
And you! Join us! https://helloexo.world https://xkcd.com/1371/
OVH Platform Come speak with us about your kubernetes, time-series or observability projects and OVH Platform
Thank you, dear sponsors!
Thank you!
Humanity has been exploring the sky for years, dreaming of interstellar travel and new planetary colonies. How about you: would you like to spend 3 hours with us discovering the universe?
It turns out that NASA has a wonderful collection of public data, including the dataset used in the search for exoplanets, that is, planets located outside our solar system.
During this hands-on, we will guide you through the steps needed to rediscover exoplanets using Warp10, an open-source platform for processing time series.