Cloud Native Telegraf
Cloud Native London, September 2019

David McKay (@rawkode)
Developer Advocate, InfluxData
🏴󠁧󠁢󠁳󠁣󠁴󠁿 Scottish
💙 Esoteric Programming Languages
☸ Kubernetes Release Team
🚒 Former SRE
🍝 Former Developer
© 2019 InfluxData. All rights reserved.

Cloud Native Telegraf

Can I have one Telegraf, please?

Telegraf
github.com/influxdata/telegraf
Telegraf is an agent for collecting, processing, aggregating, and writing metrics.

Architecture
[Diagram labels: GCP, Third Party Systems, Telegraf, ?, Your Application]

Telegraf is Agnostic

Architecture
[Diagram labels: GCP, Third Party Systems, StackDriver, Telegraf, InfluxDB, Prometheus, Your Application]

Plugins
Inputs
★ Docker
★ Kafka
★ Kubernetes
★ Nats
★ Postgres
★ System
  ○ CPU
  ○ Disk
  ○ Disk IO
  ○ Mem
  ○ Process
Outputs
➔ CrateDB
➔ CloudWatch
➔ DataDog
➔ Elasticsearch
➔ Graphite
➔ InfluxDB
➔ OpenTSDB
➔ Prometheus
➔ StackDriver
➔ Wavefront

Plugins
Inputs: 160
Outputs: 35

Input: activemq
Slide 9 / 247

Input: kubernetes
Slide 12 / 48

Kubernetes
➔ Should be run as a DaemonSet
➔ Hits the /stats/summary endpoint of each kubelet
➔ Is responsible for gathering metrics for pods and their containers
➔ Will produce high cardinality data

Kubernetes
[[inputs.kubernetes]]
  url = "https://localhost:10255"
  bearer_token = "/run/secrets/token"
  insecure_skip_verify = true

Kubernetes
For Cloud Providers' Managed Kubernetes or minikube
[[inputs.kubernetes]]
  url = "https://kubernetes.default/api/v1/nodes/$NODE_NAME/proxy/"

Kubernetes Improvements
➔ 99.97% of the time, this plugin will run in-cluster
  ◆ No reference, I made this number up
➔ So we don’t need any configuration
  ◆ We should trust you to manage RBAC
  ◆ We’ll use the mounted ServiceAccount
  ◆ We’ll infer the URL

Input: kube_inventory
Slide 10 / 20

Kube Inventory
➔ Should be run as a Deployment, with a single replica
➔ Hits the API server for resource information
➔ Will give you information on Deployments, DaemonSets, Volumes, etc.
➔ Will produce high cardinality data

Kube Inventory
[[inputs.kube_inventory]]
  url = "https://kubernetes.default"
  bearer_token = ""
  resource_exclude = []
  resource_include = []

Kube Inventory Improvements
➔ 99.97% of the time, this plugin will run in-cluster
  ◆ I heard this once before
➔ So we don’t need any configuration
  ◆ We should trust you to manage RBAC
  ◆ We’ll use the mounted ServiceAccount
  ◆ We’ll infer the URL

Input: prometheus
Slide 10 / 20

Prometheus
➔ Run it however you want
  ◆ Globally
  ◆ Per Namespace
  ◆ Depends on your workloads
➔ Will scrape Prometheus endpoints
➔ Will discover services through Prometheus annotations

Prometheus
[[inputs.prometheus]]
  monitor_kubernetes_pods = true
  # monitor_kubernetes_pods_namespace = ""
  bearer_token = ""

Prometheus Improvements
➔ 99.97% of the time, this plugin will run in-cluster
  ◆ Definite fact, I’ve heard this more than once
➔ So we don’t need any configuration
  ◆ We should trust you to manage RBAC
  ◆ We’ll use the mounted ServiceAccount

Prometheus Improvements
➔ Support the ServiceMonitor CRD (Prometheus Operator)

Output: influxdb

InfluxDB
[[outputs.influxdb]]
  urls = ["http://influxdb.monitoring:8086"]
[[outputs.influxdb_v2]]
  urls = ["http://influxdb.monitoring:9999"]
  organization = "InfluxData"
  bucket = "kubernetes"
  token = "secret-token"

Output: prometheus_client

Prometheus Client
[[outputs.prometheus_client]]
  ## Address to listen on.
  listen = ":9273"

Telegraf Super Powers

Proxying

Proxying
influxdb_listener is a service input plugin that listens for requests sent according to the InfluxDB HTTP API. The intent of the plugin is to allow Telegraf to serve as a proxy/router for the /write endpoint of the InfluxDB HTTP API.
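A minimal sketch of the proxy setup described above, assuming Telegraf listens on port 8186 and forwards to the InfluxDB address used earlier in this deck. The port and the pairing with this particular output are illustrative assumptions, not a recommended deployment.

```toml
# Accept InfluxDB line protocol on the /write endpoint.
# The port is an assumption chosen for illustration.
[[inputs.influxdb_listener]]
  service_address = ":8186"

# Forward everything received to the real database.
[[outputs.influxdb]]
  urls = ["http://influxdb.monitoring:8086"]
```

Clients would then point their InfluxDB write URL at Telegraf instead of at the database.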

Proxying
http_listener_v2 is a service input plugin that listens for metrics sent via HTTP. Metrics may be sent in ANY supported data format.
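A hedged sketch of the generic HTTP listener, which is configured under the name inputs.http_listener_v2. The port, path, and JSON format below are assumptions for illustration; the option names follow the plugin's README from the 1.12 era and may differ in later releases.

```toml
[[inputs.http_listener_v2]]
  # Address and path are assumptions; pick whatever suits your application.
  service_address = ":8080"
  path = "/telegraf"
  methods = ["POST", "PUT"]

  # Any supported input data format can be used here, e.g. "influx" or "json".
  data_format = "json"
```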

Proxying
There’s also socket_listener, tcp_listener, and udp_listener.
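For the socket-based listeners, a small sketch using socket_listener over UDP. The port and the line protocol format are assumptions; the scheme in service_address selects the transport.

```toml
[[inputs.socket_listener]]
  # "udp://" could be swapped for "tcp://" or a "unix://" socket path.
  service_address = "udp://:8094"
  data_format = "influx"
```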

Batching

Batching
Telegraf will send metrics to outputs in batches of at most metric_batch_size metrics. This controls the size of writes that Telegraf sends to output plugins.

Buffering

Buffering
If a write to an output fails, Telegraf will hold metric_buffer_limit worth of metrics in-memory before data is lost. This is PER output.

These 2 simple settings get you redundancy, high availability, and performance optimisation of the write path.
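Both settings live in the [agent] table of the Telegraf configuration. The sketch below uses what I believe are the shipped defaults; the interval values are assumptions included only to make the arithmetic concrete.

```toml
[agent]
  interval = "10s"        # how often inputs are gathered (assumed)
  flush_interval = "10s"  # how often outputs are flushed (assumed)

  # Maximum number of metrics sent to an output in a single write.
  metric_batch_size = 1000

  # Maximum number of unwritten metrics held in memory, per output.
  metric_buffer_limit = 10000
```

As a rough worked example with a hypothetical load: if the agent gathers about 500 metrics every 10 seconds, a buffer of 10000 metrics covers roughly 20 intervals, or about 200 seconds of output downtime, before the oldest metrics start to be dropped. Because the limit applies per output, two configured outputs can hold up to 20000 metrics between them.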

Telegraf as a Sidecar

Telegraf as a Sidecar
Hopefully, from everything I’ve discussed, you can see how Telegraf could be a useful addition to any application as a sidecar.
1. It can consume logs
2. You can write events / traces from your code
3. It can act as a local metric buffer during DB downtime
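The three sidecar roles above could be combined in one configuration along these lines. This is a sketch under assumptions: the log path, the log format, the listener address, and the buffer size are all hypothetical, and the output reuses the InfluxDB address from earlier in the deck.

```toml
[agent]
  # 3. Local buffer during DB downtime (size is an assumption).
  metric_buffer_limit = 50000

# 1. Consume logs written by the application container.
[[inputs.tail]]
  files = ["/var/log/app/access.log"]  # hypothetical shared-volume path
  from_beginning = false
  data_format = "grok"
  grok_patterns = ["%{COMBINED_LOG_FORMAT}"]  # assumes combined access-log format

# 2. Receive events written from application code over localhost.
[[inputs.http_listener_v2]]
  service_address = "127.0.0.1:8080"
  path = "/telegraf"
  data_format = "influx"

[[outputs.influxdb]]
  urls = ["http://influxdb.monitoring:8086"]
```

Binding the listener to 127.0.0.1 keeps it reachable only from containers in the same pod, which is usually what a sidecar wants.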

Telegraf as a Sidecar
Unfortunately …
The Telegraf binary is around 80MiB
The Telegraf image is around 250MiB / 80MiB

BYOT: Bring Your Own Telegraf

Bring Your Own Telegraf
FROM rawkode/telegraf:byo AS build
FROM alpine:3.7 AS telegraf
COPY --from=build /etc/telegraf /etc/telegraf
COPY --from=build /go/src/github.com/influxdata/telegraf/telegraf /bin/telegraf

Telegraf Operator

Telegraf Operator
apiVersion: influxdata.com/v1
kind: Telegraf
metadata:
  name: mine
spec:
  version: "1.12"
  scrape_prometheus: false
  sidecar_injection: true
  metric_server: true

Demo Time


🐦 @rawkode 🐦 Thank You