Breaking Free with Open Standards: OpenTelemetry and Perses for Observability

A presentation at Container Days Conference in September 2025 in Hamburg, Germany by Kasper Borg Nissen

Slide 1

Container Days Hamburg - September 2025
Breaking Free with Open Standards: OpenTelemetry and Perses for Observability
Kasper Borg Nissen, Developer Advocate at Dash0
@phennex kaspernissen.xyz

Slide 2

Who?
● Developer Advocate at Dash0
● KubeCon+CloudNativeCon EU/NA 24/25 Co-Chair
● (former) CNCF Ambassador
● Golden Kubestronaut
● CNCG Aarhus, KCD Denmark Organizer
● Co-founder & Community Lead Cloud Native Nordics

Slide 3

https://university.platformengineering.org/observability-for-platform-engineering

Slide 4

tl;dr
● OpenTelemetry is standardizing telemetry collection.
● Perses is standardizing dashboarding.
● Applying Platform Engineering principles transforms observability into a seamless, scalable, and developer-friendly experience.
● Building on Open Standards allows you to freely move between vendors, ensuring they stay on their toes and provide you the best possible experience.

Slide 5

Observability is still fragmented
● Metrics
● Logs
● Traces
Image by pngtree.com

Slide 6

Observability is still fragmented
We don't have a metrics problem, or a tracing problem. We have systems problems.
● Metrics
● Logs
● Traces

Slide 7

This fragmentation leads to…

Slide 8

This fragmentation leads to
● Complex Query Languages

Slide 9

This fragmentation leads to
● Complex Query Languages
● Vendor lock-in

Slide 10

This fragmentation leads to
● Complex Query Languages
● Vendor lock-in
● Metadata Inconsistency

Slide 11

This fragmentation leads to
● Complex Query Languages
● Vendor lock-in
● Metadata Inconsistency
● No instrumentation due to high complexity

Slide 12

This fragmentation leads to
● Complex Query Languages
● Vendor lock-in
● Metadata Inconsistency
● No instrumentation due to high complexity
● Lack of unified insights

Slide 13

A shift is happening.

Slide 14

A shift toward correlation
● Find related information
● Jump between signals
● Reconstruct chain of events
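The correlation idea on this slide can be pictured as a join on a shared trace ID. What follows is only a minimal sketch under stated assumptions: the in-memory `spans` and `logs` arrays, their field names, and the `correlate` helper are all hypothetical and chosen for illustration; real backends perform this in their query layer.

```javascript
// Hypothetical in-memory telemetry; field names are illustrative only.
const spans = [
  { traceId: "t1", spanId: "s1", parentSpanId: null, name: "GET /todos", startMs: 0 },
  { traceId: "t1", spanId: "s2", parentSpanId: "s1", name: "SELECT todos", startMs: 5 },
  { traceId: "t2", spanId: "s3", parentSpanId: null, name: "GET /health", startMs: 9 },
];
const logs = [
  { traceId: "t1", spanId: "s2", body: "query took 120ms", timeMs: 7 },
  { traceId: "t2", spanId: "s3", body: "ok", timeMs: 10 },
  { traceId: null, spanId: null, body: "startup complete", timeMs: 1 },
];

// Find related information: everything that carries the same trace ID.
function correlate(traceId, spans, logs) {
  const events = [
    ...spans
      .filter((s) => s.traceId === traceId)
      .map((s) => ({ at: s.startMs, kind: "span", text: s.name })),
    ...logs
      .filter((l) => l.traceId === traceId)
      .map((l) => ({ at: l.timeMs, kind: "log", text: l.body })),
  ];
  // Reconstruct the chain of events by ordering on time.
  return events.sort((a, b) => a.at - b.at);
}

const timeline = correlate("t1", spans, logs);
console.log(timeline.map((e) => `${e.at}ms ${e.kind}: ${e.text}`).join("\n"));
```

The log line without a trace ID ("startup complete") never shows up in any timeline, which is the practical cost of telemetry that is emitted without context.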

Slide 15

A shift toward correlation

Slide 16

A shift toward…

Slide 17

OpenTelemetry
OpenTelemetry (OTel) is an open source project designed to provide standardized tools and APIs for generating, collecting, and exporting telemetry data such as traces, metrics, and logs.
It is the de facto standard for distributed tracing and also supports metrics and logs (profiling soon).
The main goals of the project are:
● Unified telemetry
● Vendor-neutrality
● Cross-platform

Slide 18

OpenTelemetry in a nutshell
2nd largest CNCF project by contributor count
✅ What it is
A set of various things focused on letting you collect telemetry about systems:
● Data models
● API specifications
● Semantic conventions
● Library implementations in many languages
● Utilities
● and much more

Slide 19

OpenTelemetry in a nutshell
⛔ What it is NOT
● Proprietary
● An all-in-one observability tool
● A data storage or dashboarding solution
● A query language
● A Performance Optimizer
● Feature complete

Slide 20

Project activity, 1/1/2024 to 1/1/2025 (Source: CNCF Velocity)
● Commits: 27,168; PRs+Issues: 58,508
● Commits: 44,486; PRs+Issues: 56,299

Slide 21

OpenTelemetry: A 1,000-mile view
● Generate and Emit: Instrumentation (OTel API & SDK, auto-instrumentation) and Infrastructure (Kubernetes, …)
● Collect, Convert, Process, Route, Export: The OpenTelemetry Collector (Receive, Process, Export)
● Store & Analyze: Telemetry Backends (Time-series database, Log database, Trace database, …) and Analysis Tools
Telemetry is transmitted from each stage to the next.
Inspired by visualizations from LFS148
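The Receive, Process, Export stages of the Collector can be pictured as a small chain of functions. This is a toy model only, not how the Collector is implemented: `createPipeline`, `dropHealthChecks`, `addAttribute`, and the two in-memory "backends" are hypothetical names invented for this sketch.

```javascript
// Toy model of a Collector-style pipeline: receive -> process -> export.
function createPipeline({ processors, exporters }) {
  return {
    receive(batch) {
      // Each processor takes a batch and returns a (possibly changed) batch.
      const processed = processors.reduce((items, p) => p(items), batch);
      // Fan out the same processed batch to every exporter.
      for (const exporter of exporters) exporter(processed);
      return processed;
    },
  };
}

// Processor: drop noisy health-check records.
const dropHealthChecks = (items) => items.filter((i) => i.name !== "GET /health");

// Processor: enrich every record with an attribute while it is in transit.
const addAttribute = (key, value) => (items) =>
  items.map((i) => ({ ...i, attributes: { ...i.attributes, [key]: value } }));

// Two stand-in backends; swapping them requires no change to the producers.
const backendA = [];
const backendB = [];

const pipeline = createPipeline({
  processors: [dropHealthChecks, addAttribute("k8s.cluster.name", "demo")],
  exporters: [(items) => backendA.push(...items), (items) => backendB.push(...items)],
});

pipeline.receive([
  { name: "GET /todos", attributes: {} },
  { name: "GET /health", attributes: {} },
]);
console.log(backendA, backendB);
```

Because the exporters are the only part that knows about a backend, replacing one leaves the instrumentation untouched, which is the separation the diagram describes.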

Slide 22

OpenTelemetry: A 1,000-mile view
Collection of Telemetry is standardized
● Generate and Emit: OTel API & SDK, auto-instrumentation, and Infrastructure (Kubernetes, …)
● Collect, Convert, Process, Route, Export: The OpenTelemetry Collector (Receive, Process, Export), "The last observability agent you will ever install"
● Store & Analyze: Vendor space (… and many more)
Telemetry is transmitted from each stage to the next.

Slide 23

Signals
● METRICS: 42
● LOGS: 20/JUN/2025 "GET / HTTP/1.1" 200 (repeated line)
● TRACES
● PROFILES
● + Real user monitoring (browser, app)

Slide 24

Telemetry without context is just data

Slide 25

What are we looking at?

Slide 26

What are we looking at?
[Chart: "Cuteness Vs Number of legs". Y axis: Cuteness, from "Awww…" at the top through Adorable!, Cute, Pretty Normal, Unfortunate, and Creepy down to "Gaah! Kill it! Kill it!". X axis: Number of Legs, 0 to 8. Source: Reddit /r/funny, circa 2010.]

Slide 27

How we talk about system context
1 Organization (By whom?): Which team owns it? "Who you gonna call?" …
2 Architecture (What / Why?): Which service / system component is this?
3 Compute (How?): Which container? Which process? Pid? Startup args? Which runtime is it? Node.js? JVM? .NET? Which build? Which version? …
4 Platform (How?): Kubernetes? Which cluster / namespace / deployment / cronjob / job / pod? AWS ECS? Which cluster / service / task? …
5 Infrastructure (Where?): Which datacenter / Cloud region / availability zone / account does it run in? …

Slide 28

How to set resource attributes?
● Resource detectors & manual "hard-coding".
● OTEL_RESOURCE_ATTRIBUTES env var
● Added to telemetry "in transit" using the OpenTelemetry Collector.

    import { NodeSDK } from '@opentelemetry/sdk-node';
    import { ConsoleSpanExporter } from '@opentelemetry/sdk-trace-node';
    import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
    import { envDetector, processDetector, Resource } from '@opentelemetry/resources';
    import { awsEcsDetector } from '@opentelemetry/resource-detector-aws';

    const sdk = new NodeSDK({
      traceExporter: new ConsoleSpanExporter(),
      // Metric exporter and other options are skipped here. See
      // https://opentelemetry.io/docs/languages/js/getting-started/nodejs/
      instrumentations: [getNodeAutoInstrumentations()],
      // Specify which resource detectors to use
      resourceDetectors: [envDetector, processDetector, awsEcsDetector],
      // Hard-coded resource; NodeSDK takes a single `resource` option.
      // `new Resource(...)` is the SDK 1.x form; newer releases use
      // resourceFromAttributes() instead.
      resource: new Resource({
        team: 'awesome',
      }),
    });

    sdk.start();

Sample initialization of the OpenTelemetry JS Distro in a Node.js application
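The OTEL_RESOURCE_ATTRIBUTES option takes comma-separated `key=value` pairs. The sketch below shows roughly what reading that format and combining it with hard-coded attributes looks like. It is a simplification for illustration, not the SDK's `envDetector`: the helper names `parseResourceAttributes` and `mergeResources` are hypothetical, and the "later source wins" merge order is an assumption of this sketch rather than documented SDK behaviour.

```javascript
// Simplified parser for the OTEL_RESOURCE_ATTRIBUTES format: "key1=value1,key2=value2".
function parseResourceAttributes(raw) {
  const attributes = {};
  for (const pair of (raw || "").split(",")) {
    const idx = pair.indexOf("=");
    if (idx <= 0) continue; // skip empty or malformed entries
    const key = pair.slice(0, idx).trim();
    // Values may be percent-encoded, so decode them.
    const value = decodeURIComponent(pair.slice(idx + 1).trim());
    attributes[key] = value;
  }
  return attributes;
}

// Assumption for this sketch: later sources win on key conflicts.
function mergeResources(...sources) {
  return Object.assign({}, ...sources);
}

const hardCoded = { team: "awesome", "service.name": "unknown_service" };
const fromEnv = parseResourceAttributes(
  "service.name=todo-api,service.version=1.2.3,deployment.environment.name=staging"
);
const resource = mergeResources(hardCoded, fromEnv);
console.log(resource);
```

In a real process the input would come from `process.env.OTEL_RESOURCE_ATTRIBUTES`; a literal string is used here so the example is deterministic.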

Slide 29

Telemetry without semantic conventions is just data

Slide 30

Semantic Conventions
Semantic Conventions define a common set of (semantic) attributes which provide meaning to data when collecting, producing and consuming it.
https://github.com/open-telemetry/semantic-conventions
Semantic Conventions by signals:
● Events: Semantic Conventions for event data.
● Logs: Semantic Conventions for logs data.
● Metrics: Semantic Conventions for metrics.
● Resource: Semantic Conventions for resources.
● Trace: Semantic Conventions for traces and spans.

Slide 31

OpenTelemetry semantic conventions to context layers (NOT A COMPREHENSIVE LIST!)
1 Organization: 😢
2 Architecture: Service (stable) and (experimental); Deployment Environment
3 Compute: Telemetry SDK (stable) and (experimental); Compute Unit and Instance; Operating System; Process & Process Runtimes; Device, Browser, Webengine, …
4 Platform: Kubernetes; Cloud (cloud.platform specifically); Cloud-provider specific
5 Infrastructure: Cloud (general stuff)
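To make the mapping concrete, here is a small sketch that sorts resource attributes into the context layers by their semantic-convention namespace. The prefix table is an illustrative reading of this slide, not an official classification, and `LAYER_BY_PREFIX`, `layerOf`, `groupByLayer`, and the sample values are hypothetical.

```javascript
// Illustrative mapping from attribute namespaces to context layers.
// Order matters: the first matching prefix wins, so "cloud.platform"
// is listed before the more general "cloud." prefix.
const LAYER_BY_PREFIX = [
  ["service.", "Architecture"],
  ["deployment.", "Architecture"],
  ["telemetry.sdk.", "Compute"],
  ["process.", "Compute"],
  ["os.", "Compute"],
  ["container.", "Compute"],
  ["host.", "Compute"],
  ["k8s.", "Platform"],
  ["cloud.platform", "Platform"],
  ["cloud.", "Infrastructure"],
];

function layerOf(key) {
  const match = LAYER_BY_PREFIX.find(([prefix]) => key.startsWith(prefix));
  return match ? match[1] : "Unmapped";
}

function groupByLayer(attributes) {
  const grouped = {};
  for (const [key, value] of Object.entries(attributes)) {
    const layer = layerOf(key);
    grouped[layer] = grouped[layer] || {};
    grouped[layer][key] = value;
  }
  return grouped;
}

const grouped = groupByLayer({
  "service.name": "todo-java",
  "service.version": "1.2.3",
  "process.runtime.name": "OpenJDK Runtime Environment",
  "k8s.namespace.name": "demo",
  "k8s.pod.name": "todo-java-abc123",
  "cloud.platform": "aws_eks",
  "cloud.region": "eu-central-1",
  team: "awesome", // custom key: no semantic convention covers the organization layer
});
console.log(JSON.stringify(grouped, null, 2));
```

The custom `team` key lands in "Unmapped", mirroring the empty Organization row: ownership has to be expressed with an attribute name you agree on yourself.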

Slide 32

So, why OpenTelemetry?
● Instrument once, use everywhere
● Separate telemetry generation from analysis
● Make software observable by default
● Improve how we use telemetry

Slide 33

That's all great, but how do I make it easily accessible for my developers?

Slide 34

The dual role of Platform Engineers in Observability

Slide 35

● Observe the running infrastructure (metrics, logs)
● Provide Observability as a product for developers (traces, metrics, logs, profiling)

Slide 36

What types of Telemetry do I need?
[Diagram: prevalent telemetry types per layer of the stack, top to bottom]
● End-user devices and IoT (+ RUM)
● Runtimes, applications and services (+ RUM)
● Cloud, FaaS, Container orchestration
● Operating system
● Virtualisation
● Bare metal
The layers are grouped into application context and infrastructure context.
Based on: "What is observability?" by ubuntu.com

Slide 37

Platform Engineering for Observability
● Self-Service Experience
● Explicit and Consistent APIs
● Golden Paths
● Modularity
● Platform as a Product
● Core Requirements

Slide 38

Platform Engineering for Observability
● Self-Service Experience: Auto-Instrumentation
● Explicit and Consistent APIs: Semantic Conventions
● Golden Paths: Observability built-in
● Modularity: Collector Pipelines
● Platform as a Product: Documentation + Support
● Core Requirements: Cross-signal correlation

Slide 39

That's all great, but I ask again, how do I make it easily accessible for my developers?

Slide 40

The answer: Auto-instrumentation + Operators = No-touch Instrumentation

Slide 41

OpenTelemetry Operator
● Instrumentation
● OpenTelemetryCollector
● OpAMPBridge
● TargetAllocator

Slide 42

Auto-Instrumentation with the OpenTelemetry Operator
● Instrumentation: instructs the operator how to inject auto-instrumentation
● OpenTelemetry Operator: injects instrumentation into the pod
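As a rough illustration of the two pieces on this slide, the sketch below builds an Instrumentation resource and the pod annotation that opts a workload into injection, both as plain objects that can be serialized to JSON (which Kubernetes accepts as a manifest format). The resource name, namespace, and collector endpoint are placeholders, and the helper functions `buildInstrumentation` and `injectionAnnotations` are hypothetical. Only a subset of languages is covered, and some of them (Go, for example) need extra configuration that is omitted here.

```javascript
// Builds an Instrumentation custom resource as a plain object.
// The name, namespace, and endpoint are placeholders for this sketch.
function buildInstrumentation({ name, namespace, endpoint }) {
  return {
    apiVersion: "opentelemetry.io/v1alpha1",
    kind: "Instrumentation",
    metadata: { name, namespace },
    spec: {
      // Where the injected agents send their telemetry.
      exporter: { endpoint },
      propagators: ["tracecontext", "baggage"],
    },
  };
}

// Returns the pod-template annotation that asks the operator to inject
// auto-instrumentation for the given language.
function injectionAnnotations(language) {
  const covered = ["java", "nodejs", "python", "dotnet", "go"];
  if (!covered.includes(language)) {
    throw new Error(`language not covered by this sketch: ${language}`);
  }
  return { [`instrumentation.opentelemetry.io/inject-${language}`]: "true" };
}

const manifest = buildInstrumentation({
  name: "demo-instrumentation",
  namespace: "demo",
  endpoint: "http://otel-collector:4317",
});
const annotations = injectionAnnotations("java");

console.log(JSON.stringify(manifest, null, 2));
console.log(JSON.stringify(annotations, null, 2));
```

The application image stays unchanged: the developer (or the platform's golden path) only adds the annotation to the pod template, and the operator does the rest at admission time.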

Slide 43

Observability doesn't stop at instrumentation.

Slide 44

Perses
An open specification for dashboards.
CNCF Sandbox project

Slide 45

Dashboards as Code
● perses-operator
● PersesDatasource
● PersesDashboard
● Perses

Slide 46

Demo
● Applications: todo-go, todo-java
● Databases: Postgres, MySQL
● OpenTelemetry Operator with an Instrumentation resource: injects an eBPF sidecar (todo-go) and the Java agent (todo-java)
● OpenTelemetry Collector
● Backends: Jaeger, Prometheus
● Perses Operator with PersesDatasource and PersesDashboard resources
● Perses

Slide 47

Recap
● Applications: todo-go, todo-java
● Databases: Postgres, MySQL
● OpenTelemetry Operator with an Instrumentation resource: injects an eBPF sidecar (todo-go) and the Java agent (todo-java)
● OpenTelemetry Collector
● Backends: Jaeger, Prometheus
● Perses Operator with PersesDatasource and PersesDashboard resources
● Perses

Slide 48

Observability is evolving - fast.

Slide 49

OpenTelemetry is standardizing telemetry collection.

Slide 50

Perses is standardizing dashboarding.

Slide 51

Applying Platform Engineering principles can transform observability from an afterthought into a seamless, scalable, and developer-friendly experience.

Slide 52

Observability is a systems problem - not a tracing, logging, or metrics problem.

Slide 53

When we connect signals together, we empower developers to solve problems faster.

Slide 54

And last but not least, building on Open Standards allows you to freely move between vendors, ensuring they stay on their toes and provide you the best possible experience.

Slide 55

Shameless plug: OTelBin
● Forever free, OSS
● Editing, visualization and validation of OpenTelemetry Collector configurations
● With ❤ by Dash0!
https://www.otelbin.io/

Slide 56

Thank you! Get in touch!
Kasper Borg Nissen, Developer Advocate at Dash0
@phennex kaspernissen.xyz
Demo can be found here! https://github.com/dash0hq/container-days-hamburg-2025