@rawkode
Dia dhuit David McKay @rawkode
DublinPHP
Head of Developer Relations @InfluxDB | #InfluxDB
Slide 2
Introduction to Time Series
Slide 3
Before we begin …
Slide 4
Pop Quiz “Invented” When?
Slide 5
Encoding First Used … 410 ? BC
Slide 6
Encoding “Documented” in The Lives of the Noble Grecians and Romans, by the Greek historian Plutarch.
Slide 7
Alcibiades suddenly raised the Athenian ensign in the admiral ship, and fell upon those galleys of the Peloponnesians …
Slide 8
Encoding In the 14th century, things hadn’t advanced much. The Black Book of the Admiralty listed 2 signals: 1 flag or 2 flags.
Slide 9
Encoding By the 15th century there were 15 flags, each with a single meaning.
Slide 10
Encoding Finally, in the late 17th century, a French system existed (Mahé de la Bourdonnais) with 10 coloured flags, representing 0-9.
Slide 11
Sharding First Used … 150 ? BC
Slide 12
Sharding First “documented” example was in ~150 BC, invented and described by Polybius.
Slide 13
We take the alphabet and divide it into five parts, each consisting of five letters.
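The division Polybius describes can be sketched in a few lines of Python. This is only an illustration: it assumes a 25-letter Latin alphabet with J folded into I (Polybius worked with the Greek alphabet), and the function names `encode` and `decode` are made up for this example. Each letter becomes a pair of numbers: which of the five parts it lives in, and its position within that part.

```python
import string

# Assumption: 25 Latin letters with J folded into I; Polybius used Greek.
ALPHABET = [c for c in string.ascii_uppercase if c != "J"]


def encode(message):
    """Map each letter to a (part, position) pair, both numbered 1-5."""
    pairs = []
    for ch in message.upper().replace("J", "I"):
        if ch in ALPHABET:  # skip spaces, digits, and punctuation
            index = ALPHABET.index(ch)
            pairs.append((index // 5 + 1, index % 5 + 1))
    return pairs


def decode(pairs):
    """Turn (part, position) pairs back into letters."""
    return "".join(ALPHABET[(part - 1) * 5 + (pos - 1)] for part, pos in pairs)


signal = encode("HELLO")
```

The first number picks the part (the "shard") and the second picks the letter inside it, so a receiver only ever has to count to five.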
Slide 14
Slide 15
Slide 16
History of Time Series
Slide 17
The Romans Did It The earliest form of a company which issued public shares was the case of the publicani during the Roman Republic.
Slide 18
Like modern joint-stock companies, the publicani were legal bodies independent of their members whose ownership was divided into shares, or partes. There is evidence that these shares were sold to public investors and traded in a type of over-the-counter market in the Forum, near the Temple of Castor and Pollux. The shares fluctuated in value, encouraging the activity of speculators, or quaestors.
Slide 19
In 1602 … First IPO: Dutch East India Company
Slide 20
In 1783 … First US IPO: Bank of North America
Slide 21
In 1884 … What was the price of wheat?
Slide 22
First Documented Time Series A Comparison of the Fluctuations in the Price of Wheat and in the Cotton and Silk Imports into Great Britain
J. H. Poynting Journal of the Statistical Society of London Vol. 47, No. 1 (Mar., 1884), pp. 34-74
Slide 23
What is all this? This is the first paper (or one of the first) to add the dimension of time to statistical mathematics
Slide 24
Most data is best understood in the dimension of time @pauldix, CTO
Slide 25
Introduction to Time Series
Slide 26
What Will We Cover?
➔ Time Series Data
➔ Time Series Databases
➔ Getting to Know InfluxDB
➔ Value of Time Series Data
➔ Advancing Monitoring to Time Series
Slide 27
Time Series Data What is it?
Slide 28
Time Series Data Data with a timestamp
Slide 29
Slide 30
Slide 31
Slide 32
Slide 33
Slide 34
Slide 35
What is Time Series Data?
Slide 36
What is Time Series Data?
Regular (Metrics)
➔ Predictable
➔ Evenly Distributed
Irregular (Events)
➔ Unpredictable
➔ Inconsistent Intervals
Slide 37
Regular / Metrics
★ CPU Usage
★ Memory Usage
★ Ping Time for Google.com
★ Number of Processes
Slide 38
Irregular / Events
★ User Clicked Login
★ Authentication Failed
★ CI Published v1.3.1
★ Network Cable Unplugged
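One way to see the difference between metrics and events is to look at the gaps between consecutive timestamps. The sketch below is illustrative only: `is_regular`, the tolerance parameter, and both sample lists are invented for this example.

```python
def is_regular(timestamps, tolerance=0.0):
    """Return True when consecutive timestamps are evenly spaced."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if not gaps:
        return True  # zero or one point: nothing to compare
    return max(gaps) - min(gaps) <= tolerance


cpu_samples = [0, 10, 20, 30, 40]   # hypothetical metric, sampled every 10s
login_clicks = [3, 4, 19, 20, 57]   # hypothetical events, whenever users click

metric_is_regular = is_regular(cpu_samples)
events_are_regular = is_regular(login_clicks)
```

A real collector will jitter by a few milliseconds, which is what the `tolerance` argument is for.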
Slide 39
Slide 40
Slide 41
Collecting Metrics & Events With Prometheus Exporters or Telegraf
Push AND Pull
Metrics are pulled at a regular interval ➔ Consistent and reliable intervals
Events NEED to be pushed as they happen ➔ Inconsistent intervals
Slide 45
Time Series Data Use Cases
Slide 46
Use Cases for Time Series
Monitoring
➔ Infrastructure
➔ Applications
➔ Third Party Services
IoT / Sensor
➔ Thermostats
➔ Electric Engines
➔ Smart Things
➔ GPS
➔ Fitbits
Real Time Analytics
➔ Website Tracking
➔ Stock Prices
➔ Currency Exchange Rates
Slide 47
Time Series Databases TSDBs
Slide 48
Time Series Databases Time Series databases are optimized for collecting, storing, retrieving, and processing Time Series data.
Slide 49
Time Series Databases
➔ High Write Frequency
➔ Reads are range scans
➔ TTL / Lifecycle Management
➔ Time Sensitive
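The properties above can be sketched with a toy in-memory series. This is an illustration under simplifying assumptions, not how InfluxDB or any real TSDB stores data: the class `TinySeries` and its methods are hypothetical, and everything lives in two sorted Python lists.

```python
import bisect


class TinySeries:
    """Toy series: append-heavy writes, range-scan reads, and a TTL."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.timestamps = []  # kept sorted so reads can binary-search
        self.values = []

    def write(self, timestamp, value):
        # Writes usually arrive in time order, so this lands at the end.
        index = bisect.bisect_right(self.timestamps, timestamp)
        self.timestamps.insert(index, timestamp)
        self.values.insert(index, value)

    def range_scan(self, start, end):
        """Return (timestamp, value) pairs with start <= timestamp < end."""
        lo = bisect.bisect_left(self.timestamps, start)
        hi = bisect.bisect_left(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))

    def expire(self, now):
        """Drop every point older than the TTL."""
        cutoff = bisect.bisect_left(self.timestamps, now - self.ttl_seconds)
        del self.timestamps[:cutoff]
        del self.values[:cutoff]


series = TinySeries(ttl_seconds=60)
for t in range(0, 100, 10):
    series.write(t, float(t))
window = series.range_scan(20, 50)
series.expire(now=100)
```

Keeping the data ordered by time is what makes both the range scan and the expiry cheap: each is a binary search followed by a slice.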
Slide 50
Slide 51
12% Are you in the 88%?
Slide 52
Slide 53
Slide 54
Slide 55
Slide 56
Slide 57
13% It’s Not Too Late!
Slide 58
Slide 59
Disclaimer Most of this isn’t unique to InfluxDB
Points At any point in time, this value was N
Slide 63
Point
● Series ● Fields ● Timestamp
load,host=vm1 1m=6.32,5m=8.20,15m=9.55 123456789
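To make the three parts of a point concrete, here is a minimal parser for the example line above. It is a sketch, not a full line-protocol implementation: `parse_point` is a made-up helper, and it assumes no escaped spaces or commas and that every field value is a float.

```python
def parse_point(line):
    """Split a simple point into measurement, tags, fields, and timestamp.

    Assumptions: nothing is escaped or quoted, and all fields are floats.
    """
    series, field_part, timestamp = line.split(" ")
    measurement, *tag_pairs = series.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {}
    for pair in field_part.split(","):
        key, value = pair.split("=", 1)
        fields[key] = float(value)
    return {
        "measurement": measurement,
        "tags": tags,
        "fields": fields,
        "timestamp": int(timestamp),
    }


point = parse_point("load,host=vm1 1m=6.32,5m=8.20,15m=9.55 123456789")
```

The series is everything before the first space (name plus tags), the fields are the middle section, and the timestamp comes last.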
Slide 64
Series
● Name ● Tag Keys ● Tag Values
● load,host=vm1 ● stock_price,market=NASDAQ,ticker=GOOG ● users,service=comments
Slide 65
Tags & Fields Tags ➔ Indexed ➔ String Types
Fields ➔ Not Indexed ➔ Multiple Data Types
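What "indexed" buys you can be shown with a toy inverted index over the series keys from the previous slide. This is a sketch of the general idea only, not InfluxDB's actual index; `tag_index`, `register_series`, and `series_with_tag` are hypothetical names.

```python
from collections import defaultdict

# (tag key, tag value) -> set of series keys that carry that tag
tag_index = defaultdict(set)


def register_series(series_key):
    """Add every tag of a series key such as 'load,host=vm1' to the index."""
    _measurement, *tag_pairs = series_key.split(",")
    for pair in tag_pairs:
        key, value = pair.split("=", 1)
        tag_index[(key, value)].add(series_key)


def series_with_tag(key, value):
    """Look up matching series directly, without scanning any points."""
    return sorted(tag_index.get((key, value), set()))


for series_key in [
    "load,host=vm1",
    "stock_price,market=NASDAQ,ticker=GOOG",
    "users,service=comments",
]:
    register_series(series_key)

nasdaq_series = series_with_tag("market", "NASDAQ")
```

Filtering on a field has no such shortcut: every candidate point has to be read and compared. The trade-off is that every distinct tag value creates another index entry, which is why tag choice affects cost.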
Slide 66
Value of Time Series Data Isn’t It Valuable Forever?
Slide 67
Resolution
The predictable interval at which we will collect our time series data
Slide 68
Value of Time Series Data
The value of all time series data is directly correlated with the resolution at which the data is available
Slide 69
Cost of Time Series Data Wait, Isn’t It Free?!
Slide 70
Example cpu,machine=abc1 usage=1.66 timestamp
Slide 71
Resolution ➔ 1 Measurement ➔ 1 Series ➔ 1s Resolution
86400 Points Per Day
Slide 72
Resolution ➔ 1 Measurement ➔ 2 Series ➔ 1s Resolution
172800 Points Per Day
Slide 73
Resolution ➔ 5 Measurements ➔ 10 Series ➔ 1s Resolution
4320000 Points Per Day
Slide 74
Nasdaq ➔ 1 Measurement ➔ 3300 Series ➔ 1ms Resolution
285120000000 Points Per Day
Slide 75
Nasdaq ➔ 1 Measurement ➔ 3300 Series ➔ 1m Resolution
4752000 Points Per Day
Slide 76
Nasdaq ➔ 1 Measurement ➔ 3300 Series ➔ 1h Resolution
79200 Points Per Day
Slide 77
Nasdaq ➔ 1 Measurement ➔ 3300 Series ➔ 6h Resolution
13200 Points Per Day
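The figures on these slides all come from one multiplication: number of series times the number of intervals in a day. A small sketch reproduces them; `points_per_day` is a made-up helper, and it takes the interval in milliseconds so the 1ms case stays in integer arithmetic.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def points_per_day(series_count, interval_ms):
    """One point per series per interval, over a full day."""
    return series_count * MS_PER_DAY // interval_ms


one_series_1s = points_per_day(1, 1_000)
five_by_ten_1s = points_per_day(5 * 10, 1_000)   # 5 measurements x 10 series each
nasdaq_1ms = points_per_day(3300, 1)
nasdaq_1m = points_per_day(3300, 60_000)
nasdaq_1h = points_per_day(3300, 3_600_000)
nasdaq_6h = points_per_day(3300, 6 * 3_600_000)
```

Going from 1ms to 1m resolution cuts the daily point count by a factor of 60,000, which is the whole argument for lowering resolution as data ages.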
Slide 78
Rollups Lowering the Resolution
Slide 79
Rollups with Continuous Queries
CREATE CONTINUOUS QUERY "rollup_1h" ON "nasdaq"
BEGIN
  SELECT mean(price) INTO yearly FROM weekly GROUP BY time(1h)
END
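What the continuous query does on each run can be sketched outside the database: group points into fixed time buckets and keep one mean per bucket. The helper `rollup_mean` and the sample prices are invented for illustration, and timestamps are assumed to be in seconds.

```python
from collections import defaultdict
from statistics import mean


def rollup_mean(points, bucket_seconds=3600):
    """Group (timestamp, value) points into fixed buckets and average each."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical prices: two points in the first hour, three in the second.
prices = [(0, 10.0), (1800, 12.0), (3600, 20.0), (5400, 22.0), (7199, 24.0)]
hourly = rollup_mean(prices)
```

Five points go in and two come out. The detail inside each hour is gone, which is the price paid for the smaller footprint.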
Advancing Monitoring to Time Series Taking Small Steps for Giant Leaps
Slide 82
CPU > 80%
MEM > 80%
Database Response Time > 300ms
Black Friday
[Diagram: Application]
Slide 83
How do we know when to send a page to SRE / Ops?
When the application fails the health-check
[Diagram: Application, Database]
Slide 84
How do we know when to send a page to SRE / Ops?
When we get more than 100 [ 5xx | Exceptions ] within a 5 minute period
[Diagram: Application (three instances), Database]
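The "more than 100 within a 5 minute period" rule is a sliding-window count. A minimal sketch, assuming timestamps in seconds that arrive in order; the class `ErrorRateAlert` and its defaults are hypothetical and not tied to any particular alerting tool.

```python
from collections import deque


class ErrorRateAlert:
    """Fire when more than `threshold` errors fall inside the window."""

    def __init__(self, threshold=100, window_seconds=300):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.events = deque()  # timestamps of recent errors, oldest first

    def record(self, timestamp):
        """Record one 5xx or exception; return True when a page should fire."""
        self.events.append(timestamp)
        # Forget errors that have slid out of the window.
        while self.events[0] <= timestamp - self.window_seconds:
            self.events.popleft()
        return len(self.events) > self.threshold


burst = ErrorRateAlert()
burst_results = [burst.record(t) for t in range(101)]        # 101 errors in 101s

trickle = ErrorRateAlert()
trickle_results = [trickle.record(t * 10) for t in range(500)]  # one every 10s
```

The burst pages on its 101st error, while the steady trickle of one error every ten seconds never holds more than 30 errors in the window and stays quiet.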
Slide 85
[Diagram: Service A, Service B (two instances), Service C Canary, Virtual Network, Service Mesh, Database A, Database B, Database C]
Ummm?
Slide 86
Cloud Native Architectures Convenience Vs. Cost You can treat the symptoms for a while … Upgrade Your Monitoring
Slide 87
Causality Treating the Disease
Slide 88
Causality
➔ Look at the last weeks, months, and years of data
➔ Use tags to build correlation
➔ Get Statistical
◆ INTEGRAL()
◆ LINEAR_PREDICTION()
◆ DERIVATIVE()
◆ MOVING_AVERAGE()
◆ HOLT_WINTERS()
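Two of the listed functions are simple enough to sketch by hand. The code below shows only the idea behind a derivative and a moving average; it is not the database's implementation, and the helper names and sample data are invented.

```python
def derivative(points):
    """Rate of change per second between consecutive (timestamp, value) points."""
    return [
        (t2, (v2 - v1) / (t2 - t1))
        for (t1, v1), (t2, v2) in zip(points, points[1:])
    ]


def moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical counter of bytes sent, sampled every 10 seconds.
bytes_sent = [(0, 0.0), (10, 50.0), (20, 70.0)]
rate = derivative(bytes_sent)
smoothed = moving_average([1, 2, 3, 4, 5], window=3)
```

A derivative turns an ever-growing counter into a rate, and a moving average smooths out spikes so that trends are easier to alert on.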
Slide 89
Causality Have you ever been paged at 4am because the disk usage of a machine went above 85%? Could this have been determined during office hours? (Linear Growth) Can we use correlations to determine the cause during anomalies?
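If disk usage really does grow linearly, a straight-line fit over recent samples estimates when the threshold will be crossed, so the warning can arrive during office hours. This is a sketch using ordinary least squares; `seconds_until_threshold` and the sample data are hypothetical, and the function assumes at least two points with distinct timestamps.

```python
def seconds_until_threshold(points, threshold):
    """Fit a line through (timestamp, usage) points and project forward.

    Returns the seconds from the last sample until the fitted line reaches
    `threshold`, or None when usage is flat or shrinking.
    """
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_u = sum(u for _, u in points) / n
    variance = sum((t - mean_t) ** 2 for t, _ in points)
    slope = sum((t - mean_t) * (u - mean_u) for t, u in points) / variance
    if slope <= 0:
        return None
    intercept = mean_u - slope * mean_t
    crossing_time = (threshold - intercept) / slope
    return crossing_time - points[-1][0]


# Hypothetical disk usage in percent, growing one point per hour.
disk_usage = [(0, 70.0), (3600, 71.0), (7200, 72.0)]
remaining = seconds_until_threshold(disk_usage, threshold=85.0)
```

With usage at 72% and growing one point per hour, the 85% line is about 13 hours away, which is plenty of notice to act before anyone gets paged at 4am.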
Slide 90
Causality In our distributed application, our p99 reports that our users are being served healthy responses in under 2ms. Our pager is going off because we’re getting too many exceptions in the code.
SELECT mode(*) FROM logs;
Slide 91
Proactive Ops We run Big News Corp and we need to reduce our cloud costs. Instead of running at 30% utilisation, can we run at 80% utilisation? HOLT_WINTERS
Slide 92
Build Automation Through Causality, Historical Data, Prediction, and ML
Slide 93
Summary ➔ Use a TSDB
➔ Rollup metrics
➔ Understand Cost / Select Tags Wisely
➔ Perform outlier detection on events
➔ Understand the resolution you need for 1m, 6m, > 12m
➔ Build automation, dashboarding, and reporting around your data (past, present, and future)
Slide 94
Cheers! David McKay @rawkode
That’s All Folks!
Head of Developer Relations @InfluxDB | #InfluxDB