Rethinking Observability as a Platform Capability

A presentation at Tech Hub Aarhus Day in March 2026 in Aarhus, Denmark by Kasper Borg Nissen

Slide 1

Tech Hub Aarhus Day Rethinking Observability as a Platform Capability Kasper Borg Nissen, Principal Developer Advocate at Dash0 kaspernissen.xyz /in/kaspernissen kaspernissen

Slide 2

Tech Hub Aarhus Day Rethinking Observability as a Platform Capability Product Kasper Borg Nissen, Principal Developer Advocate at Dash0 kaspernissen.xyz /in/kaspernissen kaspernissen

Slide 3

Who?
◗ Principal Developer Advocate at Dash0 (previously Lunar)
◗ KubeCon+CloudNativeCon EU/NA 24/25 Co-Chair (former)
◗ Author of OpenTelemetry for Dummies
◗ CNCF Ambassador
◗ Golden Kubestronaut
◗ CNCG Aarhus, KCD Denmark Organizer
◗ Co-founder & Community Lead Cloud Native Nordics

Slide 4

Let’s talk about Kubernetes

Slide 5

“Kubernetes is a platform for building platforms.” Bryan Liles, KubeCon+CloudNativeCon, San Diego, 2019

Slide 6

The Rails Moment

Slide 7

Cognitive load over time [Chart: cognitive load (y-axis) against the years 2000 to 2026 (x-axis). Visualization inspired by Daniel Bryant’s talk.]

Slide 8

Cognitive load over time [Chart: cognitive load (y-axis) against the years 2000 to 2026 (x-axis), with points labeled Monolith, Monolith/SOA, Monolith/Heroku, Microservices/k8s, and Microservices/Platform Engineering. Emoji markers on the chart: 🤯 🤔 😊 🥳 🤗. Visualization inspired by Daniel Bryant’s talk.]

Slide 9

Why Platform Engineering Emerged Developer Kubernetes

Slide 10

Why Platform Engineering Emerged Developer ? Kubernetes

Slide 11

Why Platform Engineering Emerged 🤗 Developer Platform Kubernetes

Slide 12

“A digital platform is a foundation of self-service APIs, tools, services, knowledge and support arranged as a compelling internal product.” Evan Bottcher, https://martinfowler.com/articles/talk-about-platforms.html

Slide 13

Team Topologies ◗ Platform Team: a grouping of other team types that provide a compelling internal product to accelerate delivery by Stream-aligned teams

Slide 14

Platform as a Product ◗ Platform as a Product (PaaP) is an approach where internal platforms are treated as evolving products with defined users (developers, operations teams) and lifecycles.

Slide 15

The missing abstraction 🥳 [Diagram: Developer, Platform (Product 1, Product 2, Product 3), Kubernetes]

Slide 16

Observability Today… Same mistakes, different domain

Slide 17

Observability promised a lot
Expectation
◗ Faster root cause analysis
◗ Understanding system behavior
◗ Lower MTTR
◗ Reduced guesswork
Reality
◗ Where do I start?
◗ Which tool is right?
◗ Why is this metric spiking?
◗ Human correlation

Slide 18

Finding the needle in the haystack [Diagram: distributed systems with 100s/1000s of microservices emitting LOGS (level=DEBUG, level=debug, LEVEL=DEBUG), METRICS, TRACES, RUM, and PROFILING]

Slide 19

The current reality: fragmentation

Slide 20

The current reality: fragmentation Complex Query Languages

Slide 21

The current reality: fragmentation Complex Query Languages Vendor lock-in

Slide 22

The current reality: fragmentation Complex Query Languages Vendor lock-in Metadata Inconsistency

Slide 23

The current reality: fragmentation Complex Query Languages Vendor lock-in No instrumentation due to high complexity Metadata Inconsistency

Slide 24

The current reality: fragmentation Complex Query Languages Vendor lock-in No instrumentation due to high complexity Metadata Inconsistency Lack of unified insights
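
Metadata inconsistency is easy to picture with the three spellings from the haystack slide (level=DEBUG, level=debug, LEVEL=DEBUG). The following is a minimal sketch, assuming simple key=value log lines without quoting, of a normalization step a platform could apply once instead of every team working around it in queries. The function names are hypothetical and not part of any library.

```python
def parse_key_values(line: str) -> dict[str, str]:
    """Parse a simple 'key=value key=value' line (assumes no quoting or spaces in values)."""
    pairs = (token.split("=", 1) for token in line.split() if "=" in token)
    return {key: value for key, value in pairs}


def normalize(fields: dict[str, str]) -> dict[str, str]:
    """Lower-case keys and upper-case the severity so all spellings compare equal."""
    result = {key.lower(): value for key, value in fields.items()}
    if "level" in result:
        result["level"] = result["level"].upper()
    return result


# The three spellings shown on the haystack slide.
variants = ["level=DEBUG", "level=debug", "LEVEL=DEBUG"]
normalized = [normalize(parse_key_values(line)) for line in variants]
print(normalized)
```

After normalization, one filter on the severity matches all three variants, which is the kind of consistency that shared conventions aim for.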

Slide 25

The cost & complexity paradox More telemetry, more tooling - same time to recovery [Chart: relative impact (y-axis) over time (x-axis) for Cost, Complexity, and MTTR]

Slide 26

Up to 84% of current observability users struggle with the costs and complexity of their daily monitoring responsibilities. Gartner Hype Cycle Report, 2025 Source: https://www.gartner.com/en/documents/6755734

Slide 27

This isn’t necessarily a tooling problem We don’t have a metrics problem, or a tracing problem. We have systems problems. Metrics Logs Traces

Slide 28

Let’s stop talking about the three pillars of observability … Kill The Three Pillars Manifesto Metrics Logs Traces

Slide 29

Same mistake as early Kubernetes Powerful primitives - no default experience All valid tools. No opinionated workflow.

Slide 30

A shift toward correlation Find related information Jump between signals Reconstruct chain of events

Slide 31

A shift toward…

Slide 32

OpenTelemetry
OpenTelemetry (OTel) is an open source project designed to provide standardized tools and APIs for generating, collecting, and exporting telemetry data such as traces, metrics, and logs.
The de-facto standard for distributed tracing, supports metrics, logs, profiling & RUM.
The main goals of the project are:
◗ Unified telemetry
◗ Vendor neutrality
◗ Cross platform

Slide 33

OpenTelemetry in a nutshell
OpenTelemetry is a toolkit and a specification.
What it is
◗ Data models
◗ API specifications
◗ Semantic conventions
◗ Library implementations in many languages
◗ Utilities
◗ and much more
What it is NOT
◗ Proprietary
◗ An all-in-one observability tool
◗ A data storage or dashboarding solution
◗ A query language
◗ A Performance Optimizer
◗ Feature complete

Slide 34

[Chart: 1/1/2025 to 1/1/2026. Commits: 37.959, PRs+Issues: 46.709. Commits: 53.495, PRs+Issues: 40.597. Source: CNCF Velocity Report]

Slide 35

49% of respondents using OpenTelemetry in production. 26% of respondents evaluating OpenTelemetry. Source: https://www.cncf.io/wp-content/uploads/2026/01/CNCF_Annual_Survey_Report_final.pdf

Slide 36

Signals: METRICS (42), LOGS (20/JUN/2025 “GET / HTTP/1.1” 200, repeated), TRACES, PROFILES, RUM

Slide 37

Correlation is the superpower: METRICS (42), LOGS (20/JUN/2025 “GET / HTTP/1.1” 200, repeated), TRACES, PROFILES, RUM
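
Correlation can be illustrated with a small sketch. It assumes that spans and log records both carry a trace ID (the OpenTelemetry log data model has trace and span ID fields for this purpose). The classes, field names, and IDs below are simplified placeholders for illustration, not the OpenTelemetry SDK types.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    trace_id: str
    span_id: str
    name: str
    start_ms: int


@dataclass(frozen=True)
class LogRecord:
    trace_id: str
    span_id: str  # would let a UI attach the log line to one specific span
    body: str
    time_ms: int


def timeline_by_trace(spans, logs):
    """Group spans and logs by trace_id and sort each group by timestamp."""
    timelines = defaultdict(list)
    for span in spans:
        timelines[span.trace_id].append((span.start_ms, "span", span.name))
    for log in logs:
        timelines[log.trace_id].append((log.time_ms, "log", log.body))
    return {trace_id: sorted(events) for trace_id, events in timelines.items()}


# Placeholder IDs; real trace and span IDs are hex-encoded byte strings.
spans = [
    Span("trace-a", "span-1", "GET /", 100),
    Span("trace-a", "span-2", "SELECT users", 120),
    Span("trace-b", "span-3", "GET /health", 105),
]
logs = [
    LogRecord("trace-a", "span-2", "query took 40 ms", 160),
    LogRecord("trace-a", "span-1", "request received", 101),
]

timeline = timeline_by_trace(spans, logs)
for event in timeline["trace-a"]:
    print(event)
```

With a shared identifier, jumping between signals becomes a lookup rather than a manual search across tools.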

Slide 38

OpenTelemetry: A 1000-mile view
[Diagram: three stages connected by “transmit” arrows]
◗ Generate and Emit: Instrumentation (OTel API & SDK, auto-instrumentation), Infrastructure (Kubernetes, …)
◗ Collect, Convert, Process, Route, Export: The OpenTelemetry Collector (Receive, Process, Export)
◗ Store & Analyze: Telemetry Backends (Time-series database, Log database, Trace database, …), Analysis Tools
Inspired by visualizations from LFS148
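
The receive, process, export flow can be sketched in a few lines. This is a toy model under stated assumptions, not the OpenTelemetry Collector's implementation or configuration format: records are plain dicts, all function names are hypothetical, and the two "backends" are in-memory lists.

```python
from typing import Callable

Record = dict[str, str]
Processor = Callable[[list[Record]], list[Record]]
Exporter = Callable[[list[Record]], None]


def receive(raw_items: list[Record]) -> list[Record]:
    """Receive: accept records emitted by instrumented workloads (copied here)."""
    return [dict(item) for item in raw_items]


def drop_debug(records: list[Record]) -> list[Record]:
    """Process: filter out noisy records before they reach a backend."""
    return [r for r in records if r.get("severity") != "DEBUG"]


def add_cluster_name(records: list[Record]) -> list[Record]:
    """Process: enrich every record with platform context."""
    return [{**r, "k8s.cluster.name": "demo-cluster"} for r in records]


def run_pipeline(raw_items: list[Record], processors: list[Processor], exporters: list[Exporter]) -> list[Record]:
    """Receive, run each processor in order, then fan out to every exporter."""
    records = receive(raw_items)
    for process in processors:
        records = process(records)
    for export in exporters:
        export(records)
    return records


# Two in-memory lists stand in for separate telemetry backends.
backend_a: list[Record] = []
backend_b: list[Record] = []

result = run_pipeline(
    [
        {"body": "cache miss", "severity": "DEBUG"},
        {"body": "payment failed", "severity": "ERROR"},
    ],
    processors=[drop_debug, add_cluster_name],
    exporters=[backend_a.extend, backend_b.extend],
)
print(result)
```

The point of the sketch is the separation: producers never learn which backends exist, so a backend can be added or swapped without touching the workloads.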

Slide 39

OpenTelemetry: A 1000-mile view
[Diagram: three stages connected by “transmit” arrows]
◗ Generate and Emit: OTel API & SDK, auto-instrumentation, Infrastructure (Kubernetes, …)
◗ Collect, Convert, Process, Route, Export: The OpenTelemetry Collector (Receive, Process, Export), “The last observability agent you will ever install”
◗ Store & Analyze: Vendor space, … and many more.
Collection of Telemetry is standardized

Slide 40

Telemetry without context is just data

Slide 41

What are we looking at?

Slide 42

What are we looking at? [Chart: “Cuteness Vs Number of legs”, Reddit /r/funny (circa 2010). X-axis: Number of Legs (0 to 8). Y-axis: Cuteness, with the labels Awww…, Adorable!, Cute, Pretty Normal, Unfortunate, Creepy, and Gaah! Kill it! Kill it!]

Slide 43

How we talk about system context
1 Organization (By whom): Which team owns it? “Who you gonna call?” …
2 Architecture (What / Why): Which service / system component is this?
3 Platform (How): Kubernetes? Which cluster / namespace / deployment / cronjob / job / pod? AWS ECS? Which cluster / service / task? …
4 Compute (How/2): Which container? Which process? Pid? Startup args? Which runtime is it? Node.js? JVM? .NET? Which build? Which version? …
5 Infrastructure (Where): Which datacenter / Cloud region / availability zone / account does it run in? …

Slide 44

OpenTelemetry semantic conventions to context layers (NOT A COMPREHENSIVE LIST!)
1 Organization: 😢
2 Architecture: Service (stable) and (experimental), Deployment Environment
3 Platform: Kubernetes, Cloud (cloud.platform specifically), Cloud-provider specific
4 Compute: Telemetry SDK (stable) and (experimental), Compute Unit and Instance, Operating System, Process & Process Runtimes, Device, Browser, Webengine, …
5 Infrastructure: Cloud (general stuff)
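
The mapping from conventions to context layers can be sketched as a prefix lookup over resource attribute keys. The attribute keys below follow OpenTelemetry semantic convention naming. The rule table, the `layer_for` helper, and the `team` key are hypothetical and only illustrate the slide's grouping, including the gap at the Organization layer.

```python
from typing import Optional

# Ordered rules: the first matching prefix wins, so the specific
# "cloud.platform" rule must come before the general "cloud." rule.
LAYER_RULES = [
    ("service.", "Architecture"),
    ("deployment.", "Architecture"),
    ("k8s.", "Platform"),
    ("cloud.platform", "Platform"),
    ("telemetry.sdk.", "Compute"),
    ("os.", "Compute"),
    ("process.", "Compute"),
    ("cloud.", "Infrastructure"),
]


def layer_for(attribute_key: str) -> Optional[str]:
    """Return the context layer for an attribute key, or None if no rule matches."""
    for prefix, layer in LAYER_RULES:
        if attribute_key.startswith(prefix):
            return layer
    return None


resource = {
    "service.name": "checkout",
    "deployment.environment.name": "production",
    "k8s.namespace.name": "shop",
    "cloud.platform": "aws_eks",
    "process.runtime.name": "OpenJDK Runtime Environment",
    "cloud.region": "eu-north-1",
    "team": "payments",  # hypothetical custom key: ownership has no rule here
}

layers = {key: layer_for(key) for key in resource}
for key, layer in layers.items():
    print(f"{key:32} -> {layer}")
```

The unmapped `team` key mirrors the 😢 on the slide: ownership is the layer teams most often have to define for themselves.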

Slide 45

The Platform Team’s Role
◗ Observe the platform: cloud, cluster, CI/CD, shared DBs, etc.
◗ Enable developers: traces, metrics, logs, profiling

Slide 46

What does “Observability as a Product” mean?
The platform absorbs complexity so teams can focus on understanding systems.
Observability as tooling
◗ Terminal
◗ YAML
◗ Dashboards with empty states
◗ Many knobs
Observability as a Product
◗ Clean UI
◗ Pre-populated dashboards
◗ Correlated views
◗ “It just works”

Slide 47

Paved Paths for Observability [Diagram: Paved Observability Path, with Logs, Metrics, Storage, Traces, Collectors, Correlation Engine, Instrumentation]

Slide 48

Auto-instrumentation changes the game
◗ Manual Instrumentation (fully code-based)
◗ Automatic Instrumentation (agent, binaries)
◗ No-touch Instrumentation (or zero-code)
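
The difference between the first two styles can be sketched with a toy span recorder. This assumes nothing about the OpenTelemetry SDK: `span`, `instrument`, and `FINISHED_SPANS` are hypothetical stand-ins. In the manual case the application code opens the span itself. In the automatic case an existing function is wrapped from the outside, roughly as an agent might do at startup.

```python
import functools
import time
from contextlib import contextmanager

FINISHED_SPANS: list[dict] = []  # stand-in for an exporter


@contextmanager
def span(name: str):
    """Record how long the wrapped block took."""
    start = time.perf_counter()
    try:
        yield
    finally:
        FINISHED_SPANS.append({"name": name, "duration_s": time.perf_counter() - start})


def checkout_manual(cart: list[int]) -> int:
    # Manual: the application code opens the span itself.
    with span("checkout_manual"):
        return sum(cart)


def checkout(cart: list[int]) -> int:
    # Untouched business logic with no telemetry calls.
    return sum(cart)


def instrument(func):
    """Automatic: wrap an existing function without editing its body."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with span(func.__name__):
            return func(*args, **kwargs)
    return wrapper


# Applied from the outside, not by the developer who wrote checkout().
checkout = instrument(checkout)

print(checkout_manual([10, 20]), checkout([5, 5]))
print([s["name"] for s in FINISHED_SPANS])
```

No-touch instrumentation moves that wrapping step out of the application repository entirely, so the developer neither writes the span nor applies the wrapper.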

Slide 49

Operators as the delivery mechanism
◗ Instrumentation: instructs how to inject auto-instrumentation
◗ OpenTelemetry Operator: injects instrumentation into the pod
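
As a rough illustration of that division of labor, the sketch below writes an Instrumentation custom resource as a Python dict and mimics the opt-in check on pod annotations. The `apiVersion`, `kind`, and `instrumentation.opentelemetry.io/inject-<language>` annotation follow the OpenTelemetry Operator's documented conventions. The endpoint, resource names, and the `requested_languages` helper are hypothetical, and the helper is a simplified stand-in, not the operator's code.

```python
import json

# Shape of an Instrumentation custom resource, written as a Python dict.
instrumentation = {
    "apiVersion": "opentelemetry.io/v1alpha1",
    "kind": "Instrumentation",
    "metadata": {"name": "default", "namespace": "shop"},  # hypothetical names
    "spec": {
        "exporter": {"endpoint": "http://otel-collector:4317"},  # hypothetical address
        "propagators": ["tracecontext", "baggage"],
    },
}

ANNOTATION_PREFIX = "instrumentation.opentelemetry.io/inject-"


def requested_languages(pod_annotations: dict[str, str]) -> list[str]:
    """Return the languages a pod opts into via inject-<language> annotations."""
    return sorted(
        key[len(ANNOTATION_PREFIX):]
        for key, value in pod_annotations.items()
        if key.startswith(ANNOTATION_PREFIX) and value != "false"
    )


pod_annotations = {
    "instrumentation.opentelemetry.io/inject-java": "true",
    "app.kubernetes.io/name": "checkout",
}

print(json.dumps(instrumentation, indent=2))
print(requested_languages(pod_annotations))
```

The platform team owns the Instrumentation resource, and a workload opts in with a single annotation, which keeps the developer-facing surface small.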

Slide 50

Observability doesn’t stop at instrumentation
[Diagram: three layers]
◗ How humans and agents understand the system: Explorers, Dashboards, Alerting, Synthetic Checks, Service Maps, Agents, Cost Insights
◗ Where telemetry lives and how it’s accessed: Vendors, OSS, … and many more.
◗ How telemetry is produced and collected: Your environment (k8s, cloud, etc)

Slide 51

Why AI needs a platform first Garbage In Garbage Out

Slide 52

Why AI needs a platform first Specifications and Semantic Conventions Your Telemetry

Slide 53

The future interaction model

Slide 54

Demo

Slide 55

So why OpenTelemetry?
◗ Instrument once, use everywhere
◗ Separate telemetry generation from analysis
◗ Make software observable by default
◗ Improve how we use telemetry

Slide 56

Why Observability as a Platform Product?
◗ Reduce cognitive load
◗ Correlation by default
◗ Structured, standardized telemetry
◗ Foundation for AI-driven workflows
Developers can focus on delivering business value, with observability as a built-in safety net.

Slide 57

https://university.platformengineering.org/observability-for-platform-engineering

Slide 58

Book Get a free copy

Slide 59

Tech Hub Aarhus Day Thank you! Kasper Borg Nissen, Principal Developer Advocate at Dash0 kaspernissen.xyz /in/kaspernissen kaspernissen

Slide 60

Tech Hub Aarhus Day Get in touch! kaspernissen.xyz /in/kaspernissen kaspernissen