Designing Payloads for Event-Driven Systems

A presentation at Kafka Summit Europe 2021 in May 2021 by Lorna Jane Mitchell

Slide 1

Designing Payloads for Event-Driven Systems
Lorna Mitchell, Aiven.io

Slide 2

Event-Driven Systems @lornajane

Slide 3

Payloads
The messages the machines send between themselves
• Large string/binary data
• No rules
• (but I do have advice!)

Slide 4

Payload Design Tips

Slide 5

Apache Kafka Records
Use all the features of Apache Kafka’s records

Slide 6

Header
• Metadata about the main payload
• Available without deserializing
    source_type: sensor
    trace_id: 1b15c98e-a52a-443d
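
To make the point about headers concrete, here is a minimal sketch of routing on header metadata without touching the payload. It assumes headers are modeled as (name, bytes) pairs, which is how several Kafka clients expose them; the `header` and `should_process` helpers and the contents of `value` are hypothetical and not part of the talk.

```python
import json

# Headers from the slide, modeled as (str, bytes) pairs (an assumption about
# the client library in use).
headers = [
    ("source_type", b"sensor"),
    ("trace_id", b"1b15c98e-a52a-443d"),
]

# Hypothetical payload; it stays as opaque bytes in this example.
value = json.dumps({"machine": "press-4", "reading": 21.5}).encode("utf-8")


def header(headers, name):
    """Return the first header value with this name, decoded, or None."""
    for key, raw in headers:
        if key == name:
            return raw.decode("utf-8")
    return None


def should_process(headers):
    """Decide on metadata alone; the value bytes are never parsed here."""
    return header(headers, "source_type") == "sensor"
```

A consumer could call `should_process(headers)` first and only deserialize `value` for the records it cares about.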

Slide 7

Key
• Key sets the partition
• Can include multiple fields
    { "type": "sensor_reading", "factory_id": 44891 }
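
Because the default partitioner hashes the serialized key bytes, a multi-field key only lands on a consistent partition if it always serializes to the same bytes. The sketch below shows one way to do that with JSON; `encode_key` is a hypothetical helper, not something from the talk.

```python
import json


def encode_key(fields):
    """Serialize a multi-field key to stable bytes.

    sort_keys and fixed separators make the output independent of dict
    insertion order, so equal keys always produce identical bytes.
    """
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")


# The same logical key, built with fields in a different order.
key_a = encode_key({"type": "sensor_reading", "factory_id": 44891})
key_b = encode_key({"factory_id": 44891, "type": "sensor_reading"})
```

Both `key_a` and `key_b` hold the same bytes, so records from either producer would be assigned to the same partition.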

Slide 8

Flat or Nested Structures?
• Always use a top level object structure (not an array)
• Group related fields together
    {
      "stores_request_id": 10004352789,
      "parent_order": {
        "order_ref": 777289,
        "agent": "Mr Thing (1185)"
      },
      "bom": [
        {"part": "hinge_cup_sg7", "quantity": 18},
        {"part": "worktop_kit_sm", "quantity": 1},
        {"part": "softcls_norm2", "quantity": 9}
      ]
    }

Slide 9

More Data or Less Data?
• For small payloads, add the context fields
• Use lightweight representation rather than the full object
• Be careful of triggering many extra lookups
• Hypermedia can help
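
One way to read the advice above is sketched here: embed a lightweight summary of a related object and include a link for consumers that need the rest. The `summarize_customer` helper, the customer fields, the event shape, and the `api.example.com` URL are all hypothetical illustrations.

```python
def summarize_customer(customer):
    """Keep the fields most consumers need, plus a link to the full record."""
    return {
        "id": customer["id"],
        "name": customer["name"],
        # Hypothetical URL scheme, shown only to illustrate the hypermedia link.
        "url": f"https://api.example.com/customers/{customer['id']}",
    }


# Hypothetical full object as it might exist in the producing service.
full_customer = {
    "id": 1185,
    "name": "Mr Thing",
    "address": "1 Example Street",
    "order_history": [777289, 777290],
    "preferences": {"newsletter": False},
}

event = {
    "type": "order_placed",
    "order_ref": 777289,
    "customer": summarize_customer(full_customer),
}
```

Consumers that only need the name never make a lookup, and those that need more can follow `url` rather than every consumer fetching the customer on every event.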

Slide 10

Example: GitHub Webhooks
(snippet from the push webhook)
    "user": {
      "login": "Codertocat",
      "id": 21031067,
      "avatar_url": "https://avatars1.githubusercontent.com/u/21031067?v=4",
      "url": "https://api.github.com/users/Codertocat",
      "html_url": "https://github.com/Codertocat",
      "followers_url": "https://api.github.com/users/Codertocat/followers",
      "following_url": "https://api.github.com/users/Codertocat/following{/other_us
      "gists_url": "https://api.github.com/users/Codertocat/gists{/gist_id}",
      "starred_url": "https://api.github.com/users/Codertocat/starred{/owner}{/repo
      "organizations_url": "https://api.github.com/users/Codertocat/orgs",
      "repos_url": "https://api.github.com/users/Codertocat/repos",
      "type": "User",
    },

Slide 11

A Note on Timestamps
• Apache Kafka records include a publish timestamp in the record metadata.
• Consider adding payload-level timestamps.
• Timestamps are only as accurate as your clock!
Pick a standard, any standard!
    1615910306 or 2021-05-11T10:58:26Z
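
As a small sketch of "pick a standard", the helper below renders one instant in both styles shown on the slide, epoch seconds and ISO 8601 in UTC. The function names and the returned field names are hypothetical.

```python
from datetime import datetime, timezone


def event_timestamps(now=None):
    """Return the same instant as epoch seconds and as ISO 8601 in UTC."""
    now = now or datetime.now(timezone.utc)
    return {
        "epoch": int(now.timestamp()),
        "iso8601": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def parse_iso8601(text):
    """Parse the UTC 'Z' form produced above back into an aware datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# A fixed instant so the example is repeatable.
fixed = datetime(2021, 5, 11, 10, 58, 26, tzinfo=timezone.utc)
stamps = event_timestamps(fixed)
```

Whichever form goes into the payload, generating it in UTC from a single clock reading avoids the two representations drifting apart.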

Slide 12

Event Tracing
Standards are great! https://opentelemetry.io
• Trace ID used by every event in the story
• Span ID in event, becomes Parent Span ID for child
(beautiful graph from honeycomb.io)
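
The trace and span relationship described above can be sketched without any tracing library. This is not the OpenTelemetry API; the dictionary field names and both helpers are hypothetical, and only the ID sizes (16-byte trace ID, 8-byte span ID) follow the OpenTelemetry convention.

```python
import secrets


def new_trace():
    """Start a story: a fresh trace ID and a root span with no parent."""
    return {
        "trace_id": secrets.token_hex(16),  # 16 bytes, 32 hex characters
        "span_id": secrets.token_hex(8),    # 8 bytes, 16 hex characters
        "parent_span_id": None,
    }


def child_of(parent):
    """Context for an event caused by `parent`.

    The trace ID is copied unchanged; the parent's span ID becomes the
    child's parent span ID, and the child gets a new span ID of its own.
    """
    return {
        "trace_id": parent["trace_id"],
        "span_id": secrets.token_hex(8),
        "parent_span_id": parent["span_id"],
    }


root = new_trace()
child = child_of(root)
grandchild = child_of(child)
```

These values would typically travel in record headers so that every event in the chain can be stitched into one graph.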

Slide 13

Using and Evolving Schemas

Slide 14

Data Formats
Some formats require schemas.
• JSON: text-based, few data types, schema optional
• XML: text-based, stronger typing, schema optional
• Language-Specific Serialization: (it depends!)
• Protobuf: binary format, handled by generated code
• Avro: binary format, schema required

Slide 15

Schemas
Schemas enforce payload structure
• Refer to schema for field names
• Specify which schema/version for the record
• Register schemas with a Schema Registry

Slide 16

Example: Avro Schema
Avro schema example for sensor data
    {
      "namespace": "io.aiven.example",
      "type": "record",
      "name": "MachineSensor",
      "fields": [
        {"name": "machine", "type": "string", "doc": "The machine whose sensor this is"},
        {"name": "sensor", "type": "string", "doc": "Which sensor was read"},
        {"name": "value", "type": "float", "doc": "Sensor reading"},
        {"name": "units", "type": "string", "doc": "Measurement units"}
      ]
    }
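
To illustrate what "schemas enforce payload structure" means for this schema, the sketch below checks a record against the field list using only the standard library. It is a simplified stand-in, not Avro validation or serialization: `check_record` and `PYTHON_TYPES` are hypothetical, only the two primitive types used here are handled, and the sample records are made up. A real system would use an Avro library and a Schema Registry.

```python
import json

MACHINE_SENSOR_SCHEMA = json.loads("""
{
  "namespace": "io.aiven.example",
  "type": "record",
  "name": "MachineSensor",
  "fields": [
    {"name": "machine", "type": "string", "doc": "The machine whose sensor this is"},
    {"name": "sensor", "type": "string", "doc": "Which sensor was read"},
    {"name": "value", "type": "float", "doc": "Sensor reading"},
    {"name": "units", "type": "string", "doc": "Measurement units"}
  ]
}
""")

# Only the primitive types that appear in this schema are mapped.
PYTHON_TYPES = {"string": str, "float": (int, float)}


def check_record(schema, record):
    """Return a list of problems; an empty list means the record fits."""
    problems = []
    for field in schema["fields"]:
        name = field["name"]
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], PYTHON_TYPES[field["type"]]):
            problems.append(f"wrong type for {name}: expected {field['type']}")
    known = {field["name"] for field in schema["fields"]}
    for extra in sorted(set(record) - known):
        problems.append(f"unexpected field: {extra}")
    return problems


# Made-up sample records for illustration.
good = {"machine": "press-4", "sensor": "oil_temp", "value": 21.5, "units": "C"}
bad = {"machine": "press-4", "sensor": "oil_temp", "value": "hot"}
```

Here `check_record(MACHINE_SENSOR_SCHEMA, good)` returns an empty list, while the `bad` record is reported for its string `value` and its missing `units`.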

Slide 17

Evolving Schemas
• To rename fields, add the new field, keep the old one
• Safe to add optional fields
• Each change is a new version
• Avro supports aliases and default values
• Include versions in topic names, just in case
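
The sketch below imitates, in plain Python, how aliases and default values let a newer schema read older records. It is a hand-rolled illustration of the idea, not Avro's schema resolution: the `resolve` helper, the renamed `reading` field, and the added `firmware` field are hypothetical.

```python
# Hypothetical version 2 of the MachineSensor fields.
SCHEMA_V2_FIELDS = [
    {"name": "machine", "type": "string"},
    {"name": "sensor", "type": "string"},
    # Renamed from "value"; the alias lets records written with the old name resolve.
    {"name": "reading", "type": "float", "aliases": ["value"]},
    {"name": "units", "type": "string"},
    # New optional field with a default, so older records remain readable.
    {"name": "firmware", "type": ["null", "string"], "default": None},
]

_MISSING = object()


def resolve(record, fields):
    """Map a record onto `fields`, honoring aliases and defaults."""
    resolved = {}
    for field in fields:
        names = [field["name"], *field.get("aliases", [])]
        value = next((record[n] for n in names if n in record), _MISSING)
        if value is _MISSING:
            if "default" not in field:
                raise KeyError(f"no value or default for {field['name']}")
            value = field["default"]
        resolved[field["name"]] = value
    return resolved


# A made-up record written with the version 1 field names.
v1_record = {"machine": "press-4", "sensor": "oil_temp", "value": 21.5, "units": "C"}
v2_view = resolve(v1_record, SCHEMA_V2_FIELDS)
```

The old `value` surfaces as `reading`, and `firmware` falls back to its default, which is why renames via aliases and additions with defaults are the changes that tend to be safe.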

Slide 18

Describing Payloads

Slide 19

AsyncAPI for Apache Kafka
AsyncAPI describes event-driven architectures https://www.asyncapi.com
We can describe the:
• brokers and auth
• topics
• payloads

Slide 20

Describing Payloads
The channels section of the AsyncAPI document
    factorysensor:
      subscribe:
        operationId: MachineSensor
        summary: Data from the in-machine sensors
        bindings:
          kafka:
            clientId:
              type: string
        message:
          name: sensor-reading
          title: Sensor Reading
          schemaFormat: "application/vnd.apache.avro;version=1.9.0"
          payload:
            $ref: machine_sensor.avsc

Slide 21

Documenting Payloads

Slide 22

Payloads and Event-Driven Systems
Design with intention, embrace standards

Slide 23

Resources
• These slides: https://noti.st/lornajane/Z759DJ
• Examples: https://github.com/aiven/thingum-industries
• Aiven: https://aiven.io
• Coupon code: ks2021aiven
• Karapace: https://karapace.io
• AsyncAPI: https://asyncapi.com