Controlling Wildfires (While Only Getting Singed).

A presentation at NDC London in London, UK by Jessica White

No software system is perfect. Deep down we all like that because it keeps things interesting, but what do you do when your system is further away from perfection than you’re comfortable with? It may be you are dealing with a legacy monolith or a system that has strange logic, no documentation and hardly any conventions you can recognise. Everything seems to be on fire and you need to stabilise the situation.

We’ll explore some techniques that are useful when working on systems that need a little love. At a high level, we will go through approaching testing, monitoring, re-architecting and documenting these systems. We will also go through prioritisation when everything is urgent and the non-technical metrics that can help with buy-in for stabilising your system and team.

This is an intermediate-level talk. The aim is for you not only to learn how to deal with these situations but also to want to tackle the challenges they present and be excited about the change you can lead.

Resources

The following resources were mentioned during the presentation or are useful additional information.

  • DevOps for Dummies by Emily Freeman

    Develop faster with DevOps. DevOps embraces a culture of unifying the creation and distribution of technology in a way that allows for faster release cycles and more resource-efficient product updating. DevOps For Dummies provides a guidebook for those on the development or operations side in need of a primer on this way of working. Inside, DevOps evangelist Emily Freeman provides a roadmap for adopting the management and technology tools, as well as the culture changes, needed to dive head-first into DevOps.

      • Identify your organization’s needs
      • Create a DevOps framework
      • Change your organizational structure
      • Manage projects in the DevOps world

    DevOps For Dummies is essential reading for developers and operations professionals in the early stages of DevOps adoption.

  • Brendan Gregg - The USE method

  • RED is mentioned and explained in this blog

  • Google SRE - The Four Golden Signals

    In the "Monitoring Distributed Systems" chapter of the Google SRE book.
