The Adventurer’s Guide To Breaking Production

A presentation at Client event (commissioned keynote) in September 2021 by Holly Cummins

Slide 1

the adventurer’s guide to breaking production Holly Cummins @holly_cummins #IBM

Slide 2

me: “innovation leader” at IBM

Slide 3

me: “innovation leader” at IBM. translation: getting into trouble with technology.

Slide 4

me: “innovation leader” at IBM. translation: getting into trouble with technology. … for 20 years

Slide 5

how often do we get to go someplace truly new?

Slide 6

see things we’ve never seen before?

Slide 7

in software we do it all the time

Slide 8

ancient landscapes

Slide 9

legacy environments

Slide 10

new territories (client name redacted)

Slide 11

Slide 12

what could possibly go wrong?

Slide 13

um … what problem were we trying to solve?

Slide 14

new for the sake of it. George Mallory: “because it’s there”. Died on Everest … 30 years before Edmund Hillary.

Slide 15

surprise inside

Slide 16

the pace of change is fast

Slide 17

the pace of change is fast

Slide 18

the landscape is complicated

Slide 19

the old stuff hasn’t gone away

Slide 20

stuff slows us down

Slide 21

Slide 22

Slide 23

and …

Slide 24

ok but what’s the worst that could go wrong?

Slide 25

Knight Capital: $460 million loss

Slide 26

Knight Capital: $460 million loss in 45 minutes

Slide 27

Slide 28

the million-dollar frozen database

Slide 29

true story, unfortunately: “Hey boss, I created a Kubernetes cluster.” (Holly)

Slide 30

true story, unfortunately: “Hey boss, I created a Kubernetes cluster. I forgot it for 2 months.” (Holly)

Slide 31

true story, unfortunately: “Hey boss, I created a Kubernetes cluster. I forgot it for 2 months. … and it’s £1000 a month.” (Holly)

Slide 32

bugs, but in space: Phobos 1

Slide 33

“we couldn’t get the automated checks to work, so we bypassed them”

Slide 34

“the space probe is bricked.”

Slide 35

what causes bugs?

Slide 36

other people

Slide 37

us

Slide 38

Slide 39

interactions

Slide 40

“every time we change one microservice, another breaks”

Slide 41

distributed != decoupled

Slide 42

managing bugs

Slide 43

breaking production isn’t the worst thing

Slide 44

as long as it’s a small break

Slide 45

as long as it’s a tiny break

Slide 46

limit blast radius

Slide 47

canary deploys
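
One way to picture how a canary deploy limits blast radius is the minimal sketch below. It is an illustration under stated assumptions, not the setup used in the talk: the function names, the 5% error threshold, and the in-process routing are all hypothetical. A small fraction of requests goes to the new version, and the release is rolled back if that fraction starts failing too often.

```python
import random


def pick_version(canary_fraction: float, rng: random.Random) -> str:
    """Send roughly `canary_fraction` of requests to the canary, the rest to stable."""
    return "canary" if rng.random() < canary_fraction else "stable"


def should_roll_back(canary_errors: int, canary_requests: int,
                     max_error_rate: float = 0.05) -> bool:
    """Roll back once the canary's observed error rate exceeds the threshold.

    The 5% default is an arbitrary value chosen for illustration.
    """
    if canary_requests == 0:
        # No canary traffic yet, so there is no evidence either way.
        return False
    return canary_errors / canary_requests > max_error_rate


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    routed = [pick_version(0.1, rng) for _ in range(1000)]
    print("requests sent to canary:", routed.count("canary"))
    print("roll back?", should_roll_back(canary_errors=10, canary_requests=100))
```

With a 10% canary, a bad release can only hurt about one request in ten before the rollback check trips, which is the "tiny break" idea in code form.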

Slide 48

breaking production isn’t the worst thing; the important thing is how fast you can unbreak production

Slide 49

recoverability

Slide 50

unrecoverable

Slide 51

unbreak

Slide 52

unbreak, diagnose

Slide 53

unbreak, deploy, diagnose

Slide 54

unbreak, deploy, diagnose, observability

Slide 55

unbreak, deploy, diagnose, observability, devops

Slide 56

unbreak, deploy, diagnose, observability, devops

Slide 57

my most embarrassing break of production

Slide 58

my most embarrassing break of production

Slide 59

my most embarrassing break of production

Slide 60

most problems are harder to diagnose

Slide 61

observability

Slide 62

observability: what you don’t have to do … if you have observability

Slide 63

unbreak, deploy, diagnose, observability, devops

Slide 64

make releases deeply boring

Slide 65

make releases deeply boring so you can do them all the time

Slide 66

CI/CD first, code second

Slide 67

GitOps

Slide 68

GitOps, infrastructure as code

Slide 69

ok but preventing problems?

Slide 70

pair programming

Slide 71

test-driven development (TDD)

Slide 72

if you care about it, automate it

Slide 73

integrate early and often

Slide 74

integrate early and often: many times a day

Slide 75

contract test your interactions

Slide 76

the problem with mocks: our code, their code

Slide 77

the problem with mocks: our code, our mock

Slide 78

the problem with mocks. our code, our mock: tests ✔

Slide 79

the problem with mocks. our code, our mock: tests ✔. our code, their actual code.

Slide 80

the problem with mocks. our code, our mock: tests ✔. our code, their actual code: reality ✘

Slide 81

the problem with mocks: our code, their code

Slide 82

the problem with mocks: our code, contract test, their code

Slide 83

the problem with mocks: our code, mock, contract test, their code

Slide 84

the problem with mocks: our code, functional test, mock, contract test, their code

Slide 85

the problem with mocks: our code, functional test, mock, contract test, their code. our tests ✔, their tests ✔, reality ✔

Slide 86

the problem with mocks: our code, functional test, mock, contract test, their code

Slide 87

the problem with mocks: our code, functional test, mock, contract test, their code. our tests ✔, their tests ✘, reality ✘

Slide 88

the problem with mocks: our code, functional test, mock, contract test, their code

Slide 89

the problem with mocks: our code, functional test, mock, contract test, their code. our tests ✘, their tests ✔, reality ✘
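
The mock slides can be sketched roughly as follows. This is a hand-rolled illustration, not the tooling used in the talk's demo: `CONTRACT`, `satisfies_contract`, `our_mock`, `their_code_v1`, `their_code_v2`, and `our_code` are all hypothetical names. The idea is that one shared contract is checked against both our mock and their real code, so a mock that has drifted away from reality gets caught by a test instead of by production.

```python
# Hypothetical contract: the fields (and types) our code expects in their response.
CONTRACT = {"id": int, "name": str}


def satisfies_contract(response: dict, contract: dict = CONTRACT) -> bool:
    """True when every contracted field is present with the expected type."""
    return all(
        field in response and isinstance(response[field], expected)
        for field, expected in contract.items()
    )


def our_mock(user_id: int) -> dict:
    """Our mock of their service, written from our beliefs about it."""
    return {"id": user_id, "name": "Ada"}


def their_code_v1(user_id: int) -> dict:
    """Stand-in for their real service while it still matches the contract."""
    return {"id": user_id, "name": "Ada", "email": "ada@example.com"}


def their_code_v2(user_id: int) -> dict:
    """Stand-in for their service after a rename our mock knows nothing about."""
    return {"id": user_id, "full_name": "Ada"}


def our_code(fetch_user) -> str:
    """Our code: consumes whichever implementation it is handed."""
    return f"Hello, {fetch_user(1)['name']}"


if __name__ == "__main__":
    # Functional test against the mock: passes whatever their code does.
    print("our tests (mock):", our_code(our_mock) == "Hello, Ada")
    # Contract test on the mock: keeps our mock honest.
    print("mock honours contract:", satisfies_contract(our_mock(1)))
    # Contract test on their side: v1 passes, v2 fails before reality does.
    print("their v1 honours contract:", satisfies_contract(their_code_v1(1)))
    print("their v2 honours contract:", satisfies_contract(their_code_v2(1)))
```

Without the contract check, the functional test against `our_mock` stays green while `our_code(their_code_v2)` raises a `KeyError`: tests ✔, reality ✘.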

Slide 90

Slide 91

demo

Slide 92

my stack. 2014: Ant, Java 7, OSGi, WebSphere, locally deployed. 2021: Tekton, Kubernetes, OpenShift, Node.js, React.js, on public cloud.

Slide 93

your ability to learn is a key professional asset

Slide 94

teach people the stuff you’re learning

Slide 95

@holly_cummins #IBMGarage

Slide 96

2008: a developer had a lot of fun with Groovy.

Slide 97

2008: a developer had a lot of fun with Groovy. 2009: he left the company; the others who had to maintain his code had less fun.

Slide 98

pair programming

Slide 99

the value of discomfort

Slide 100

Slide 101

TDD (test-driven development)

Slide 102

TDD (test-driven development), BDD (behaviour-driven development)

Slide 103

TDD (test-driven development), BDD (behaviour-driven development), CDD (cake-driven development)

Slide 104

TDD (test-driven development), BDD (behaviour-driven development), CDD (cake-driven development), PDD (pain-driven development)

Slide 105

harness discomfort to drive innovation

Slide 106

learning comes from failure

Slide 107

success comes from learning

Slide 108

thank you! (and have fun at the rest of the event) Holly Cummins @holly_cummins