Reliability is no Accident

A presentation at SLOconf Monthly Meetup by Julie Gunderson

Over the years, a lot of research has been conducted and many books have been written on how to improve the resilience of our software. This talk will dive deep into the three key practices identified by the authors of Accelerate to improve reliability: Chaos Engineering, GameDays, and Disaster Recovery. We will discuss the key measures of tempo and stability, and how practicing Chaos Engineering will increase both.

We will walk through the Google Cloud open-source Bank of Anthos application to illustrate why teams should focus on the customer experience and how to test for failures.

Attendees will learn practical tips they can put into action, focused on resource consumption, capacity planning, region failover, decoupling services, and deployment pain.