Tales From The DevOps Transformation Trenches

A presentation by Holly Cummins at DevOps Enterprise Summit London 2019, held in June 2019 in London, UK

Slide 1

Tales from the DevOps Transformation Trenches: yes, you (still) need to start with culture, not containers
Holly Cummins, IBM Garage, @holly_cummins
Austin, Copenhagen, Dubai, London, Madrid, Melbourne, Munich, New York, Nice, Raleigh, San Francisco, São Paulo, Singapore, Tokyo, Toronto

Slide 2

#IBMGarage @holly_cummins

Slide 3

Slide 4

Slide 5

Slide 6

Slide 7

Slide 8

hi. i’m a consultant.

Slide 9

these are my scary stories

Slide 10

how to fail at devops

Slide 11

“this is our devops team”

Slide 12

“this is our devops team” “… last year we called them the build team.”

Slide 13

containers will not fix your broken devops culture

Slide 14

even kubernetes will not fix your broken devops culture

Slide 15

“we’re going too slowly. we need to get rid of COBOL and make microservices!”

Slide 16

“we’re going too slowly. we need to get rid of COBOL and make microservices!” “… but our release board only meets twice a year.”

Slide 17

https://hackernoon.com/8-devops-trends-to-be-aware-of-in-2019-b4232ac8f351

Slide 18

https://hackernoon.com/8-devops-trends-to-be-aware-of-in-2019-b4232ac8f351

Slide 19

“every time we change code, something breaks”

Slide 20

distributed monolith

Slide 21

distributed monolith but without compile-time checking

Slide 22

just because a system runs across 6 containers doesn’t mean it’s decoupled

Slide 23

Slide 24

mars climate orbiter

Slide 25

for clarity: this wasn’t a client of mine. other people’s trenches

Slide 26

Courtesy NASA/JPL-Caltech

Slide 27

Slide 28

Slide 29

Slide 30

metric units

Slide 31

metric units imperial units

Slide 32

metric units imperial units distributing did not help
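
One way to read this slide: splitting a system across teams or services does not remove a shared assumption about units, it only hides it. The following is a minimal Python sketch, not taken from the talk or from any real flight software, of making the unit part of the value so that a mismatch fails loudly. The `Impulse` type and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

# Exact conversion factor: 1 pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """Hypothetical value type that carries its unit with it."""
    value: float
    unit: str  # "N*s" (metric) or "lbf*s" (imperial)

    def to_newton_seconds(self) -> float:
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        # An unknown unit is an error, never a silent pass-through.
        raise ValueError(f"unknown impulse unit: {self.unit!r}")


def total_impulse_newton_seconds(readings: Iterable[Impulse]) -> float:
    """Sum readings from different producers in one agreed unit."""
    return sum(r.to_newton_seconds() for r in readings)
```

With this shape, a producer that reports in imperial units has to say so, and a consumer that expects metric units converts explicitly instead of trusting a bare number.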

Slide 33

microservices need consumer-driven contract tests
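
A consumer-driven contract test can be sketched in a few lines: the consuming team writes down the fields it actually reads, and the provider's build checks its own responses against that list. The example below is a plain-Python illustration under that assumption; it does not use the API of any real contract-testing tool, and every name in it (`NAVIGATION_CONSUMER_CONTRACT`, `contract_violations`, `provider_response`) is hypothetical.

```python
# Contract written by the consumer team: the fields it reads and their types.
# All names here are hypothetical, for illustration only.
NAVIGATION_CONSUMER_CONTRACT = {
    "impulse": float,
    "unit": str,
}


def contract_violations(contract: dict, response: dict) -> list:
    """Return readable violations; an empty list means the provider is compatible."""
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(response[field]).__name__}"
            )
    return problems


def provider_response() -> dict:
    """Stand-in for calling the real provider inside the provider's own build."""
    return {"impulse": 12.5, "unit": "N*s", "timestamp": 1561420800}
```

Extra fields such as `timestamp` do not break the contract; only the fields a consumer depends on are checked, which is what lets the provider keep evolving independently.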

Slide 34

Cluster + Ariane 5 $370 million loss https://en.wikipedia.org/wiki/Cluster_(spacecraft)

Slide 35

Cluster + Ariane 5 $370 million loss https://en.wikipedia.org/wiki/Cluster_(spacecraft)

Slide 36

Slide 37

they tested it …

Slide 38

they tested it … but stubbed out one component.

Slide 39

they tested it … but stubbed out one component. that component was the one that broke.

Slide 40

“Had we done end-to-end testing, we believe this error would have been caught.” Arthur Stephenson, Chief Investigator

Slide 41

microservices need automated integration tests
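
The "stubbed out one component" story can be made concrete with a small sketch. It is loosely modelled on a numeric conversion that overflows a 16-bit integer; the function names and the input values are hypothetical, and nothing here is the real flight code. The point is only that a test run against the stub passes, while the same call through the real component fails.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16(value: float) -> int:
    """Real component: narrows a float to a 16-bit signed integer, or fails."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in 16 bits")
    return result


def stub_to_int16(value: float) -> int:
    """Test double that always 'succeeds', whatever the input."""
    return 0


def guidance_step(horizontal_velocity: float, convert=to_int16) -> int:
    """Hypothetical caller; the converter is injectable so tests can swap it."""
    return convert(horizontal_velocity)
```

A test that calls `guidance_step(40000.0, convert=stub_to_int16)` is green, yet `guidance_step(40000.0)` raises `OverflowError`. An automated integration test that exercises the real converter with realistic inputs is what surfaces the difference.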

Slide 42

“we have a CI/CD”

Slide 43

CI/CD is something you do, it’s not a tool you buy

Slide 44

“i’ll merge my branch into our CI next week”

Slide 45

“CI/CD … CI/CD … CI/CD … we release every six months … CI/CD …”

Slide 46

continuous. I don’t think that word means what you think it means.

Slide 47

how often should you push to master?

Slide 48

how often should you push to master? integrate?

Slide 49

how often should you push to master? integrate? every character

Slide 50

how often should you push to master? integrate? every character actually continuous … but stupid

Slide 51

how often should you push to master? integrate? every character every commit (several times an hour) actually continuous … but stupid

Slide 52

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) actually continuous … but stupid

Slide 53

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) once a day actually continuous … but stupid

Slide 54

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) once a day once a week actually continuous … but stupid

Slide 55

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) once a day once a week once a month actually continuous … but stupid

Slide 56

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) once a day once a week once a month once every six months actually continuous … but stupid

Slide 57

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) once a day once a week once a month once every six months actually continuous … but stupid trunk-based development

Slide 58

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) once a day ok actually continuous … but stupid once a week once a month once every six months trunk-based development

Slide 59

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) once a day ok actually continuous … but stupid bad once a week once a month once every six months trunk-based development

Slide 60

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) once a day ok once a week once a month once every six months bad bad actually continuous … but stupid trunk-based development

Slide 61

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) once a day ok once a week once a month once every six months bad bad seriously? actually continuous … but stupid trunk-based development

Slide 62

how often should you push to master? integrate? every character every commit (several times an hour) every few commits (several times a day) once a day ok once a week once a month once every six months bad bad my favourite actually continuous … but stupid seriously? trunk-based development

Slide 63

how often should you release? every push (many times a day) every user story every epic once a sprint once a quarter once every two years

Slide 64

deploy? how often should you release? every push (many times a day) every user story every epic once a sprint once a quarter once every two years

Slide 65

deploy? how often should you release? every push (many times a day) every user story every epic once a sprint once a quarter once every two years (need a good handle on feature flags)

Slide 66

deploy? how often should you release? every push (many times a day) every user story every epic once a sprint once a quarter ok once every two years (need a good handle on feature flags)

Slide 67

deploy? how often should you release? every push (many times a day) every user story every epic once a sprint once a quarter ok (need a good handle on feature flags) once every two years oldschool

Slide 68

deploy? how often should you release? every push (many times a day) every user story every epic once a sprint once a quarter ok once every two years sigh (need a good handle on feature flags) oldschool

Slide 69

deploy? how often should you release? every push (many times a day) every user story every epic once a sprint once a quarter ok ok once every two years sigh (need a good handle on feature flags) oldschool

Slide 70

deploy? how often should you release? every push (many times a day) every user story every epic once a sprint once a quarter ok ok once every two years sigh hardcore (need a good handle on feature flags) oldschool

Slide 71

deploy? how often should you release? every push (many times a day) every user story every epic once a sprint once a quarter ok ok once every two years sigh hardcore (need a good handle on feature flags) my favourite oldschool

Slide 72

how often should you test in staging?

Slide 73

how often should you test in staging? deliver?

Slide 74

how often should you test in staging? deliver? every push my favourite

Slide 75

“we can’t actually release this.”

Slide 76

why?

Slide 77

what’s stopping more frequent deploys?

Slide 78

“we can’t release this microservice… we deploy all our microservices at the same time.”

Slide 79

“it looks like it’s complete … but nothing works if you click on it.”

Slide 80

front-end integration layer back-end

Slide 81

user story front-end integration layer back-end

Slide 82

user story front-end frontend integration layer int. layer back-end backend

Slide 83

user story front-end frontend integration layer back-end

Slide 84

user story front-end integration layer back-end

Slide 85

user story front-end integration layer back-end backend

Slide 86

user story front-end integration layer int. layer back-end backend

Slide 87

✓ it works by the time anyone sees it user story front-end frontend integration layer int. layer back-end backend stakeholders need to be careful what they incentivise

Slide 88

vertical slices

Slide 89

back-out development

Slide 90

back-out development, back-first development

Slide 91

deferred wiring

Slide 92

feature flags
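
Feature flags are what separate deploying code from releasing a feature: unfinished work can ship on every push but stay dark until the flag flips. The following is a minimal sketch, assuming flags are read from environment variables; the `FEATURE_` prefix, the flag name and both functions are hypothetical and not from the talk.

```python
import os


def flag_enabled(name: str, env=None) -> bool:
    """Read a flag from the environment; anything other than 'on' means off."""
    env = os.environ if env is None else env
    return env.get(f"FEATURE_{name.upper()}", "off").lower() == "on"


def checkout_page(env=None) -> str:
    # The new flow is deployed with every push but stays dark until the flag flips.
    if flag_enabled("new_checkout", env):
        return "new checkout"
    return "old checkout"
```

Defaulting to "off" means a half-wired feature cannot be exposed by a missing or mistyped setting, which is the property that makes frequent deploys safe.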

Slide 93

“we can’t ship until every feature is complete”

Slide 94

but why?

Slide 95

“users won’t find it compelling enough if we release now”

Slide 96

if you’re not embarrassed by your first release it was too late - Reid Hoffman

Slide 97

lean

Slide 98

“we only get one chance to get it right”

Slide 99

the Ariane 5 failed in 36 seconds. you can’t a/b test a $370 million rocket

Slide 100

we think we’re here one chance

Slide 101

we think we’re here one chance brand damage

Slide 102

we think we’re here market failure (indifference) one chance brand damage

Slide 103

we think we’re here market failure (indifference) continuous improvement delights growing user base one chance brand damage

Slide 104

we think we’re here market failure (indifference) a/b testing continuous improvement delights growing user base one chance brand damage
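
For software, unlike a rocket, a/b testing is cheap to set up. A common building block is deterministic assignment, so that a given user always sees the same variant. The sketch below assumes users are identified by a string id; the function name, the experiment name and the variant labels are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically map a user to a variant so they always see the same one."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer bucket number.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]
```

Including the experiment name in the hash keeps assignments independent across experiments, so a user who lands in one variant of one test is not tied to the same variant in every other test.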

Slide 105

could we be here? we think we’re here market failure (indifference) a/b testing continuous improvement delights growing user base one chance brand damage

Slide 106

feedback is good engineering

Slide 107

they often couldn’t see the orbiter

Slide 108

“but our change control process …”

Slide 109

“this provisioning software is broken”

Slide 110

“this provisioning software is broken” what we sold: 10 minute provision-time

Slide 111

“this provisioning software is broken” what we sold: 10 minute provision-time. what the client thought they’d got: 3 month provision-time

Slide 112

“this provisioning software is broken” what we sold: 10 minute provision-time. what the client thought they’d got: 3 month provision-time. the reason: 84-step pre-approval process

Slide 113

“we’ve scheduled the architecture board review for a month after the project ships”

Slide 114

does the process add value?

Slide 115

Slide 116

navigators warned something was wrong

Slide 117

navigators warned something was wrong they didn’t fill in the right form

Slide 118

navigators warned something was wrong they didn’t fill in the right form so nothing was done

Slide 119

“we can’t ship until we have more confidence in the quality”

Slide 120

“we can’t ship until we have more confidence in the quality” you can fix that

Slide 121

“this is the test team … who don’t have the skills to automate their tests.”

Slide 122

“our tests aren’t automated”

Slide 123

“our tests aren’t automated”

Slide 124

“we don’t know if our code currently works”

Slide 125

“we don’t know if our code currently works”

Slide 126

“it costs too much to release”

Slide 127

“it costs too much to release” you can fix that

Slide 128

not a good CI/CD indicator a good CI/CD indicator “we don’t know when the build is broken”

Slide 129

get the pipeline status into the physical spaces

Slide 130

“only Bob can change Jenkins”

Slide 131

Slide 132

“oh yes, that build has been broken for a few weeks…”

Slide 133

judge judge judge

Slide 134

modern devops toolchains and processes reflect cloud native apps and cultural transformation: many, single-tenant toolchains; hybrid and multicloud toolchains and deployments; toolchains support lean delivery processes and business agility

Slide 135

heritage devops toolchains and processes reflect heritage apps and cultural inertia: shared, multi-tenant toolchain “backbone”; on-premise automation tools; release management and dependency coordination are hard

Slide 136

“you’ll be coding on the mainframe”

Slide 137

Slide 138

Slide 139

this can get tiring

Slide 140

transformation endurance

Slide 141

remember the why

Slide 142

IBM Cloud Garage @holly_cummins