Slide 1
Golden Paths for Async Workflows: Dapr Meets OpenTelemetry
Mauricio “Salaboy” Salatino
Kasper Borg Nissen
Slide 2
Who?
Mauricio Salatino
Ecosystem Engineer & Passionate about Open Source Platform Engineering on Kubernetes Author
Kasper Borg Nissen
• Principal Developer Advocate at Dash0
• Former KubeCon Co-Chair NA/EU
• CNCF Ambassador
• Golden Kubestronaut
• CNCG Aarhus
• Cloud Native Denmark
• Cloud Native Nordics
Slide 3
TLTW (too long to watch)
If your applications and infrastructure grow in complexity, you must have the right tools to understand what is going on at all times.
Slide 4
Slide 5
The Rise of Async Microservices
Slide 6
Limitations with synchronous communication
Slide 7
Slide 8
Synchronous communication is familiar, asynchronous communication is powerful, and in real systems you need both to work seamlessly together.
Slide 9
Slide 10
Slide 11
The Golden Path - Dapr + OpenTelemetry
Slide 12
The CNCF Opportunity: Shared Abstractions
Slide 13
Slide 14
Dapr Building Blocks
• APIs to help developers build scalable and resilient distributed applications
• PubSub, Workflows, Secrets/Configs, Conversation (LLMs), etc.
• All these APIs, under the covers, implement cross-cutting concerns:
  • Security
  • Resilience
  • Observability
Slide 15
Slide 16
Slide 17
Slide 18
Slide 19
Slide 20
How does it work?
helm install dapr
Slide 21
Sidecars to the rescue
The application can use the PubSub APIs to publish messages
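As a rough illustration of what publishing through the sidecar can look like, the sketch below builds a request against the Dapr HTTP publish endpoint (`/v1.0/publish/<pubsubname>/<topic>`) using only the JDK. The sidecar port `3500`, the component name `pubsub`, the topic `orders`, and the class and method names are assumptions chosen for this example; the request is only built and printed, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) a publish request for the Dapr sidecar.
final class DaprPublishSketch {
    // Assumption: the sidecar's HTTP API listens on localhost:3500.
    static final String SIDECAR = "http://localhost:3500";

    static HttpRequest publishRequest(String pubsubName, String topic, String json) {
        return HttpRequest.newBuilder()
                .uri(URI.create(SIDECAR + "/v1.0/publish/" + pubsubName + "/" + topic))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        // "pubsub" and "orders" are placeholder component and topic names.
        HttpRequest request = publishRequest("pubsub", "orders", "{\"orderId\":\"42\"}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The application only talks to localhost; which broker sits behind the `pubsub` component is a deployment decision, not something the code needs to know.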
Slide 22
Service-to-Service Invocation API
No need to complicate application logic with retries or circuit breakers (CBs).
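A minimal sketch of a service invocation call, assuming the sidecar's HTTP API on port `3500` and the Dapr invoke endpoint (`/v1.0/invoke/<app-id>/method/<method>`). The app id `kitchen-service`, the method name `orders`, and the helper class are hypothetical. Note that the caller contains no retry or circuit breaker logic; in this model those concerns are expected to be handled by the sidecar.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) a service invocation request.
final class DaprInvokeSketch {
    // Assumption: the sidecar's HTTP API listens on localhost:3500.
    static final String SIDECAR = "http://localhost:3500";

    static HttpRequest invokeRequest(String appId, String method, String json) {
        // The caller addresses the target by its Dapr app id, not by host name or IP.
        return HttpRequest.newBuilder()
                .uri(URI.create(SIDECAR + "/v1.0/invoke/" + appId + "/method/" + method))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        // "kitchen-service" and "orders" are placeholder names for this example.
        HttpRequest request = invokeRequest("kitchen-service", "orders", "{\"pizza\":\"margherita\"}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```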
Slide 23
OK, but what happens when things go wrong?
• What happens if the kitchen service is down and the retries are exhausted?
• What happens if Kafka is down?
• We cannot leave our pizza customers without their pizzas!
Slide 24
Dapr Workflows: Resilient orchestrations
• Workflows are defined in code, executed by the Dapr sidecar
• Durable, long-running state management
• Retries, timers, wait-for-events all included
• No single point of failure or SaaS service needed
• Workflows will keep trying no matter what goes down! (even the workflow runtime!!!)
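The durability idea in the bullets above can be sketched conceptually as follows. This is NOT the Dapr Workflow API: `ReplaySketch`, `callActivity`, and `orderPizza` are hypothetical names, and the in-memory list stands in for the durable history a workflow engine would persist. The sketch only illustrates the replay idea: steps that already completed are read back from history instead of being re-executed, so a retry after a failure resumes where it left off.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Conceptual sketch only; not the Dapr Workflow SDK.
final class ReplaySketch {
    private final List<String> history; // stand-in for persisted workflow history
    private int cursor = 0;

    ReplaySketch(List<String> history) {
        this.history = history;
    }

    // Returns the recorded result on replay; executes and records on first run.
    String callActivity(Supplier<String> activity) {
        if (cursor < history.size()) {
            return history.get(cursor++);
        }
        String result = activity.get(); // may throw; nothing is recorded in that case
        history.add(result);
        cursor++;
        return result;
    }

    static String orderPizza(ReplaySketch ctx, List<String> executed, boolean kitchenUp) {
        String stored = ctx.callActivity(() -> {
            executed.add("store");
            return "order-stored";
        });
        String cooked = ctx.callActivity(() -> {
            if (!kitchenUp) {
                throw new IllegalStateException("kitchen is down");
            }
            executed.add("kitchen");
            return "pizza-cooked";
        });
        return stored + "," + cooked;
    }

    public static void main(String[] args) {
        List<String> history = new ArrayList<>();
        List<String> executed = new ArrayList<>();
        try {
            orderPizza(new ReplaySketch(history), executed, false); // kitchen down
        } catch (IllegalStateException e) {
            System.out.println("first attempt failed: " + e.getMessage());
        }
        // Retry with the same history: "store" is replayed, only "kitchen" runs.
        System.out.println(orderPizza(new ReplaySketch(history), executed, true));
        System.out.println("executed activities: " + executed);
    }
}
```

In the retry, the store step is not executed a second time; its result comes from the recorded history.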
Slide 25
Slide 26
Slide 27
Slide 28
Slide 29
Slide 30
OpenTelemetry
OpenTelemetry (OTel) is an open source project designed to provide standardized tools and APIs for generating, collecting, and exporting telemetry data such as traces, metrics, and logs.
It is the de facto standard for distributed tracing and also supports metrics, logs, RUM, and profiling (experimental).
The main goals of the project are:
• Unified telemetry
• Vendor-neutrality
• Cross-platform
Slide 31
1/1/2024 1/1/2025
Commits: 27,168 PRs+Issues: 58,508
Source: CNCF Velocity
Commits: 44,486 PRs+Issues: 56,299
Slide 32
Slide 33
Slide 34
Slide 35
Slide 36
Making sense of all the complexity
Slide 37
Two Perspectives, One Goal
Slide 38
Who Owns Tracing? A Hidden Conundrum
Slide 39
Slide 40
Slide 41
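import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;

// Obtain a tracer from the globally registered OpenTelemetry instance
Tracer tracer = GlobalOpenTelemetry.getTracer("application");
Span span = tracer.spanBuilder("doWork").startSpan();
try {
    // …
} finally {
    span.end(); // always end the span, even if the work fails
}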
Slide 42
Why Observability is Critical with Dapr
Slide 43
Why Async Is Hard to Observe
Slide 44
A Shared Pain: Context Gets Lost
Slide 45
Trace propagation with Dapr
Slide 46
Context Propagation for Async Workflows
Slide 47
Slide 48
Slide 49
Slide 50
Slide 51
What Works Today
• Dapr supports OpenTelemetry out of the box
• Sidecar emits spans for pub/sub, service invocation, and workflows
• OpenTelemetry Operator enables auto-instrumentation
• OpenTelemetry Collector handles ingestion, processing, export
Slide 52
Slide 53
Challenges
• Async boundaries break context
• Sidecars add additional hops
• Workflow engines introduce thread + process separation
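To make the first challenge concrete: across an async boundary there is no call stack to carry the trace, so the context has to travel with the message itself. The sketch below, using only the JDK, writes a W3C Trace Context `traceparent` value (`version-traceid-parentid-flags`) into message metadata on the producer side and reads the trace id back on the consumer side. The class and method names are hypothetical, the ids are the example values from the W3C specification, and real code would normally rely on the OpenTelemetry propagator APIs rather than hand-rolled parsing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating traceparent propagation via message metadata.
final class TraceparentSketch {
    // version 00, 32 hex chars trace id, 16 hex chars parent span id, 2 hex chars flags.
    // Simplification: all-zero ids, which the spec treats as invalid, are not rejected here.
    private static final Pattern TRACEPARENT =
            Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    // Producer side: copy the current trace context into the message metadata.
    static Map<String, String> inject(String traceId, String spanId, boolean sampled) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("traceparent", "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00"));
        return metadata;
    }

    // Consumer side: recover the trace id so the new span can join the same trace.
    // Returns null when the header is missing or malformed (context is lost).
    static String extractTraceId(Map<String, String> metadata) {
        String header = metadata.get("traceparent");
        if (header == null) {
            return null;
        }
        Matcher matcher = TRACEPARENT.matcher(header);
        return matcher.matches() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        Map<String, String> metadata =
                inject("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true);
        System.out.println(metadata.get("traceparent"));
        System.out.println(extractTraceId(metadata));
    }
}
```

If any hop drops the metadata, the consumer sees no `traceparent` and starts a new, disconnected trace, which is the "context gets lost" symptom described in the talk.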
Slide 54
Context Propagation for Async Workflows
Slide 55
Challenges with gRPC streaming
Slide 56
Gaps and Fixes in Dapr & OTel
• PR #57: Trace context, SemConv
• PR #9213: Trace context, SemConv, Pub/Sub Span kind
• PR #46: Propagating context to executors client side
Slide 57
Enabling the Golden Path…
Slide 58
Slide 59
Slide 60
Abstract
Async workflows power modern microservices, but they can be notoriously hard to observe. In this talk, we show how two CNCF projects (Dapr, for developer-friendly building blocks, and OpenTelemetry, for unified observability) create a golden path that bridges developer productivity and platform reliability. We’ll start by using Dapr Workflows and Pub/Sub to connect and orchestrate services without boilerplate. Then we’ll add the OpenTelemetry Operator for no-touch instrumentation, instantly delivering traces, metrics, and logs, even across asynchronous boundaries. You’ll see current OpenTelemetry capabilities for tracking async requests end-to-end, where the gaps are today, and practical ways to correlate events in complex workflows. Through a live demo, we’ll prove that with the right abstractions, shipping features fast and observing systems deeply can go hand in hand.