Controlling Wildfires (While Only Getting Singed).

A presentation at NDC Porto by Jessica White

Controlling Wildfires While Only Getting Singed.

Pre-loved systems

Where do I start?

The Battle Plan

Language

Start on the same page

Understanding Domain

Users and Stakeholders

Talking through domain

Key Performance Indicators

Assessing components

Reactive Work

SRE Quote

Steps for Reactive Work

Living Documentation

Documentation as measurable work?

React and Evolve

Proactive Work

The Shit List

Shit List Concerns

Keep it visible, Keep it safe

Prioritising when everything is urgent

Letting things burn

Transparency

Cost/Benefit Analysis

Avoiding Personal Burnout

Deployment

“How do I know it’s working?”

Monitoring

Key Performance Indicators

Systems Monitoring

SLOs, SLAs, SLIs

Service Level Objectives

Service Level Agreements

Service Level Indicators

Error Budgets

SLOs, SLAs, SLIs Example

Testing

Start from the top

But what about the unit tests?

In Summary

This content will happen in parallel

You can’t fix the world

No software system is perfect. Deep down we all like that because it keeps things interesting, but what do you do when your system is further from perfection than you’re comfortable with? You may be dealing with a legacy monolith, or with a system that has strange logic, no documentation and hardly any conventions you can recognise. Everything seems to be on fire and you need to stabilise the situation.

We’ll explore some techniques that are useful when working on systems that need a little love. At a high level, we will cover how to approach testing, monitoring, re-architecting and documenting these systems. We will also cover prioritisation when everything is urgent, and the non-technical metrics that can help win buy-in for stabilising your system and team.

This is an intermediate-level talk. The aim is for you not only to learn how to deal with these situations, but also to want to tackle the challenges they present and to be excited about the change you can lead.
