Rediscover the known Universe with NASA datasets
Horacio Gonzalez @LostInBrittany
Introduction to Time Series
Slide 2
Horacio Gonzalez @LostInBrittany Spaniard lost in Brittany, developer, dreamer and all-around geek
Slide 3
HelloExoWorld
Looking for exoplanets in NASA datasets
Slide 4
HelloExoWorld
Once upon a time...
Slide 5
An amateur astronomer
Pierre Zemb, DevOps at OVH
Slide 6
What not to do if you love astronomy
Live in Brest
Slide 7
Looking for solutions
Computer stuff
Astronomy
Mixing passions
Slide 8
Google is your friend...
Let's find a project
Slide 9
Exoplanets?
Planets orbiting stars far away
Slide 10
How do we find them?
The transit method seems the best
Slide 11
The transit method
Credits: NASA’s Goddard Space Flight Center
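Why transits are hard to spot can be sketched with a little arithmetic. When a planet crosses its star, it blocks a fraction of the light roughly equal to the ratio of the two disc areas. The radii below are standard approximate reference values, not figures from this talk, and the function name is only illustrative.

```python
# Approximate radii in kilometres (standard reference values, not from the slides).
R_SUN_KM = 695_700.0
R_EARTH_KM = 6_371.0
R_JUPITER_KM = 69_911.0


def transit_depth(planet_radius_km: float, star_radius_km: float) -> float:
    """Fraction of starlight blocked while the planet crosses the stellar disc."""
    return (planet_radius_km / star_radius_km) ** 2


earth_depth = transit_depth(R_EARTH_KM, R_SUN_KM)
jupiter_depth = transit_depth(R_JUPITER_KM, R_SUN_KM)
print(f"Earth-like planet:   {earth_depth * 1e6:.0f} ppm")
print(f"Jupiter-like planet: {jupiter_depth * 100:.2f} %")
```

An Earth-sized planet in front of a Sun-sized star dims it by less than 100 parts per million, while a Jupiter-sized one dims it by about 1%.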
Slide 12
How do we look for transits?
Image credits: NASA
Kepler
Slide 13
Watching the sky
By Carter Roberts [Public domain], via Wikimedia Commons
Slide 14
And what kind of data do we get?
Pleiades. By NASA, ESA, AURA/Caltech, Palomar Observatory, via Wikimedia Commons
Slide 15
Well, that's the problem
Seven stars, seven different profiles
Slide 16
Kinda big data
Over 40 million light curves
Slide 17
Big AND open data
Lots of datasets in #opendata
Slide 18
And we can help with that!
Let's use our tools to analyse the data
Slide 19
A match made in heaven
Warp 10, OVH Metrics and HelloExoWorld
Slide 20
What we have done
● Downloaded and parsed 40 million FITS files
● Pushed them to OVH Metrics
● Selected a cool subset as training set
● Verified we could find the same planets as NASA
Slide 21
Choosing a star: Kepler-11
Image credit: NASA/Tim Pyle
Slide 22
Looking at the raw signal...
SAP_FLUX: The flux in units of electrons per second contained in the optimal aperture pixels collected by the spacecraft.
Slide 23
Looking at the raw signal... ?
SAP_FLUX: The flux in units of electrons per second contained in the optimal aperture pixels collected by the spacecraft.
Slide 24
Looking at one record
Perturbations in dirty signals
Slide 25
Transits are tiny
~40 electrons per second
Slide 26
First step: downsampling
Slide 27
First step: downsampling
You can see the transit candidates… but how can we teach the computer to see them?
Slide 28
If you ♥ signal processing
High-pass filter
Slide 29
Poor person's high-pass filter
Using the trend
Slide 30
Signal - Trend
Now you can see them well
Slide 31
After some tuning
We have our transit candidates
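The slides do not say how the tuning was done, so the following is only one plausible sketch in Python: once the trend has been removed, flag points that fall far below zero and group consecutive ones into candidate events. The function name, the `n_sigmas` knob and the synthetic data are all assumptions made for illustration.

```python
import statistics


def transit_candidates(residual, n_sigmas=3.0):
    """Group consecutive points far below zero into (start, end) index ranges."""
    threshold = -n_sigmas * statistics.pstdev(residual)
    events, start = [], None
    for i, r in enumerate(residual):
        if r < threshold and start is None:
            start = i                      # a dip begins here
        elif r >= threshold and start is not None:
            events.append((start, i - 1))  # the dip ended on the previous point
            start = None
    if start is not None:                  # dip still open at the end of the data
        events.append((start, len(residual) - 1))
    return events


# Synthetic detrended signal: small alternating noise plus two 40-unit dips.
residual = [(-1.0) ** i for i in range(200)]
for dip_start in (50, 150):
    for i in range(dip_start, dip_start + 5):
        residual[i] -= 40.0

candidates = transit_candidates(residual)
print(candidates)
```

Raising `n_sigmas` makes the detector stricter, lowering it lets more (and noisier) candidates through.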
Slide 32
What's next? Where do we go from here?
Slide 33
Only the beginning
Better detection
New import method
Explorer
Deep learning
Satellite/star location
Yours?
Slide 34
A growing team
Slide 35
And you!
Join us!
https://helloexo.world
https://xkcd.com/1371/
Slide 36
Thank you!
Slide 37
Want to know more?
Analysing with WarpScript
Slide 38
WarpScript
Reverse Polish Notation
Slide 39
Variables
'hello, world!'   // Push the string 'hello, world!' onto the stack
'exo' STORE       // Store it in a variable called exo
$exo              // Push the value of exo back onto the stack
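For readers new to Reverse Polish Notation, here is a toy stack machine in Python that mimics the three lines above. It is only a sketch of the idea, not how Warp 10 is implemented: the `run` function is hypothetical, and it cannot tell a string literal from a keyword such as STORE.

```python
def run(tokens):
    """Evaluate a tiny stack language with literals, STORE, $var and +."""
    stack, variables = [], {}
    for token in tokens:
        if token == "STORE":
            name = stack.pop()              # the variable name is on top
            variables[name] = stack.pop()   # the value sits just below it
        elif token == "+":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif isinstance(token, str) and token.startswith("$"):
            stack.append(variables[token[1:]])  # push the variable's value
        else:
            stack.append(token)             # literal: push as-is
    return stack


# Mirrors the slide: push a string, store it as exo, push it back.
final_stack = run(["hello, world!", "exo", "STORE", "$exo"])
# Operands come first, the operator last.
sum_stack = run([1, 2, "+"])
print(final_stack, sum_stack)
```

The point to notice is the order: arguments are pushed first, and the function that consumes them comes last.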
Slide 40
What are the available series?
[
  $readToken   // Application authentication
  '~.*'        // Selector for class name
  {}           // Selector for labels
] FIND
Slide 41
Get raw data
[
  $readToken                      // Application authentication
  'sap.flux'                      // Selector for class name
  { 'KEPLERID' '6541920' }        // Selector for labels
  '2009-05-02T00:56:10.000000Z'   // Start date
  '2013-05-11T12:02:06.000000Z'   // End date
] FETCH
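FIND and FETCH both pick series with a class selector and a label selector. The Python sketch below shows the matching idea under one assumption: a selector starting with `~` is treated as a regular expression, anything else as an exact match. The `matches` helper and the catalogue are hypothetical; only `sap.flux` and KEPLERID 6541920 come from the slides.

```python
import re


def matches(series, class_selector, label_selectors):
    """Return True if a series matches the class selector and all label selectors."""

    def match_one(selector, value):
        if selector.startswith("~"):
            # Assumed convention: '~' introduces a regular expression.
            return re.fullmatch(selector[1:], value) is not None
        return selector == value

    if not match_one(class_selector, series["class"]):
        return False
    return all(
        name in series["labels"] and match_one(sel, series["labels"][name])
        for name, sel in label_selectors.items()
    )


# Hypothetical catalogue of series metadata.
catalogue = [
    {"class": "sap.flux", "labels": {"KEPLERID": "6541920"}},
    {"class": "sap.flux", "labels": {"KEPLERID": "0000001"}},
    {"class": "other.metric", "labels": {"KEPLERID": "6541920"}},
]

everything = [s for s in catalogue if matches(s, "~.*", {})]
kepler_11 = [s for s in catalogue if matches(s, "sap.flux", {"KEPLERID": "6541920"})]
print(len(everything), len(kepler_11))
```

An empty label selector, like the `{}` passed to FIND, places no constraint on labels, so every class that matches is returned.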
Slide 42
Kepler-11: Raw data
Slide 43
Time manipulation
Slide 44
Time-related functions
Slide 45
How to split a time series
$gts        // Singleton (or list of) GTS
6 h         // Minimum duration without data points
100         // Minimum number of data points required
'record'    // New label used to subdivide the result
TIMESPLIT
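The splitting idea can be sketched in plain Python. This is an illustration of the three parameters, not Warp 10's implementation: the `timesplit` function is hypothetical, and both the strict "greater than" gap test and the numbering from 1 are assumptions.

```python
def timesplit(points, quiet_period, min_points):
    """Split (timestamp, value) pairs wherever the gap exceeds quiet_period.

    Chunks with fewer than min_points points are dropped, and each surviving
    chunk is numbered from 1 (assumed to mirror the 'record' label value).
    """
    chunks, current = [], []
    for point in sorted(points):
        if current and point[0] - current[-1][0] > quiet_period:
            chunks.append(current)   # gap found: close the current chunk
            current = []
        current.append(point)
    if current:
        chunks.append(current)
    kept = [c for c in chunks if len(c) >= min_points]
    return {str(n): chunk for n, chunk in enumerate(kept, start=1)}


HOUR = 3600
# Synthetic series: two bursts of hourly points separated by a 10-hour gap,
# followed by a lone point that is too short to keep.
points = [(t * HOUR, 1.0) for t in range(0, 5)]
points += [(t * HOUR, 2.0) for t in range(14, 20)]
points += [(40 * HOUR, 3.0)]

records = timesplit(points, quiet_period=6 * HOUR, min_points=3)
print({name: len(chunk) for name, chunk in records.items()})
```

Each surviving chunk plays the role of one record, which is what the `record` label is later used to select.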
Slide 46
Filtering
[
  $gts                 // Singleton (or list of) GTS
  []                   // Equivalence classes
  { 'record' '5' }     // Labels to select
  filter.bylabels      // Type of filter
] FILTER
Slide 47
Reference record: 5
Slide 48
Downsampling
Slide 49
Bucketize
Slide 50
Syntax: time series parameter
[ $gts bucketizer.min 0 2 h 0 ] BUCKETIZE
$gts: a singleton or a time-series set
Slide 51
Syntax: bucketizer
[ $gts bucketizer.min 0 2 h 0 ] BUCKETIZE
bucketizer.min: type of operator to apply on each bucket (last, max, mean, and, count...)
Slide 52
Syntax: lastbucket
[ $gts bucketizer.min 0 2 h 0 ] BUCKETIZE
lastbucket (the first 0): end timestamp of the most recent bucket
Slide 53
Syntax: bucketspan
[ $gts bucketizer.min 0 2 h 0 ] BUCKETIZE
bucketspan (2 h): width of a bucket
Slide 54
Syntax: bucketcount
[ $gts bucketizer.min 0 2 h 0 ] BUCKETIZE
bucketcount (the final 0): number of buckets to keep
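To make the three numeric parameters concrete, here is a small Python sketch of min-bucketizing. It is not Warp 10's code: the `bucketize_min` function is hypothetical, and treating a lastbucket of 0 as "end at the newest timestamp" and a bucketcount of 0 as "as many buckets as needed" are assumptions about how the zeros on the slides behave.

```python
def bucketize_min(points, lastbucket, bucketspan, bucketcount):
    """Downsample (timestamp, value) pairs, keeping the minimum of each bucket.

    Buckets are bucketspan wide and aligned so the newest one ends at lastbucket.
    """
    if not points:
        return []
    if lastbucket == 0:
        lastbucket = max(t for t, _ in points)
    buckets = {}
    for t, v in points:
        if t > lastbucket:
            continue                              # newer than the last bucket
        index = (lastbucket - t) // bucketspan    # 0 is the newest bucket
        if bucketcount and index >= bucketcount:
            continue                              # older than the buckets we keep
        buckets[index] = min(v, buckets.get(index, v))
    # One point per non-empty bucket, stamped with the bucket's end time.
    return sorted((lastbucket - i * bucketspan, v) for i, v in buckets.items())


HOUR = 3600
# Synthetic half-hourly values (not real Kepler data).
raw = [(i * HOUR // 2, 100 - i) for i in range(1, 9)]

downsampled = bucketize_min(raw, 0, 2 * HOUR, 0)
print(downsampled)
```

Eight half-hourly points collapse into two 2-hour buckets, each carrying the lowest value seen in its window.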
Slide 55
Actual
Slide 56
Trend
Slide 57
Mapper
Slide 58
Syntax: time series parameter
[ $gts mapper.mean 2 2 0 ] MAP
$gts: a singleton or a time-series set
Slide 59
Syntax: mapper
[ $gts mapper.mean 2 2 0 ] MAP
mapper.mean: type of operator to apply on each window (add, gt, rate, and, count...)
Slide 60
Syntax: pre
[ $gts mapper.mean 2 2 0 ] MAP
pre (the first 2): number of data points before
Slide 61
Syntax: post
[ $gts mapper.mean 2 2 0 ] MAP
post (the second 2): number of data points after
Slide 62
Syntax: occurrences
[ $gts mapper.mean 2 2 0 ] MAP
occurrences (the final 0): maximum number of calculations for a data point
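The pre and post parameters describe a sliding window around each point. A minimal Python sketch of a windowed mean follows; the `map_mean` function is hypothetical, it ignores the occurrences parameter, and truncating the window at the edges is an assumption of this sketch.

```python
def map_mean(values, pre, post):
    """Sliding-window mean over a list of values.

    Each output averages the current point, up to `pre` points before it
    and up to `post` points after it; the window is truncated at the edges.
    """
    out = []
    for i in range(len(values)):
        window = values[max(0, i - pre): i + post + 1]
        out.append(sum(window) / len(window))
    return out


# Synthetic values; pre=2 and post=2 mirror the `2 2` on the slides.
trend = map_mean([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], pre=2, post=2)
print(trend)
```

A wider window gives a smoother trend, which is what the detrending step relies on.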
Slide 63
Actual
Slide 64
Trend
Slide 65
Actual - trend
Slide 66
Actual - trend
Slide 67
Time to level up!
Slide 68
Time series operation
[
  $gts0          // First series
  …              // …
  $gtsN          // Nth series
  [ 'record' ]   // Key labels list
  op.add         // Type of operator
] APPLY
Slide 69
Syntax: time series parameter
[ $gts0 … $gtsN [ 'record' ] op.add ] APPLY
$gts0 … $gtsN: each a singleton or a time-series set
Slide 70
Syntax: equivalence class
[ $gts0 … $gtsN [ 'record' ] op.add ] APPLY
[ 'record' ]: groups the records data into equivalence classes (Record 1, Record 2, Record 3)
Slide 71
Syntax: operator
[ $gts0 … $gtsN [ 'record' ] op.add ] APPLY
op.add: type of operator to apply on each class (sub, gt, mask, and, mul...)
(Diagram: Record 1, Record 2, Record 3)
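The "actual minus trend" step can be sketched as a subtraction applied within each equivalence class. The Python below is an illustration, not Warp 10's implementation: the `apply_sub` function and the dict-based series layout are hypothetical, and it assumes at most one series per label value on each side.

```python
from collections import defaultdict


def apply_sub(minuends, subtrahends, key_label):
    """Subtract series pairwise within each class sharing the same label value.

    Each series is a dict with 'labels' and 'points' ({timestamp: value}).
    Only timestamps present in both series of a class are kept.
    """
    classes = defaultdict(dict)
    for role, group in (("a", minuends), ("b", subtrahends)):
        for series in group:
            classes[series["labels"][key_label]][role] = series
    results = []
    for value, pair in sorted(classes.items()):
        if "a" in pair and "b" in pair:
            a, b = pair["a"]["points"], pair["b"]["points"]
            points = {t: a[t] - b[t] for t in sorted(a.keys() & b.keys())}
            results.append({"labels": {key_label: value}, "points": points})
    return results


# Synthetic "actual" and "trend" series for two records.
actual = [
    {"labels": {"record": "1"}, "points": {0: 10.0, 1: 12.0}},
    {"labels": {"record": "2"}, "points": {0: 20.0, 1: 19.0}},
]
trend = [
    {"labels": {"record": "1"}, "points": {0: 9.0, 1: 11.5}},
    {"labels": {"record": "2"}, "points": {0: 20.5, 1: 20.0}},
]

detrended = apply_sub(actual, trend, "record")
print(detrended)
```

Grouping by the `record` label ensures each record is only ever compared with its own trend.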
Slide 72
Final result