Oncall for developers

A presentation at DevOpsDays Atlanta in April 2016 in Atlanta, GA, USA by Leon Fayer

Slide 1

ON CALL FOR DEVELOPERS @papa_fire

Slide 2

HELLO MY NAME IS LEON and I am a developer

Slide 3

user:/$~ sudo -s
bash: Permission denied
user is not in the sudoers file. This incident will be reported.

Slide 4

F&*# YEAH!
user:/$~ sudo -s
root:/#~

Slide 5

WITH GREAT POWER COMES GREAT RESPONSIBILITY (and more work)

Slide 6

Should Developers Be On Call?

Slide 7

Things that can go wrong: security, hardware, network, application, performance, process

Slide 8

ONLY ONE HAS TO SUFFER
alert, escalation, resolution

Slide 9

ACTIONABLE ALERTS
1 do I care?
2 can I fix it?
3 can I fix it tomorrow?

Slide 10

ACTIONABLE ALERTS
1 do I care?
2 can I fix it?
3 can I fix it tomorrow?
4 can someone else fix it?
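
The four questions above can be read as a small decision procedure. The sketch below is one possible reading, not something the slides spell out: the `Action` names, the `triage` function, and the mapping from answers to outcomes are all assumptions made for illustration.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical outcomes; the slides list only the questions.
    RECORD_ONLY = "keep as a metric, do not alert anyone"
    ESCALATE = "route to someone who can fix it"
    TICKET = "file a ticket for working hours"
    PAGE = "page the on-call developer now"


def triage(cares: bool, can_fix: bool, can_wait: bool,
           someone_else_can_fix: bool) -> Action:
    """Map the four slide questions onto an assumed outcome."""
    if not cares:                      # 1 do I care?
        return Action.RECORD_ONLY
    if not can_fix:                    # 2 can I fix it?
        # 4 can someone else fix it?
        return Action.ESCALATE if someone_else_can_fix else Action.RECORD_ONLY
    if can_wait:                       # 3 can I fix it tomorrow?
        return Action.TICKET
    return Action.PAGE
```

Under this reading, only an alert that matters, that the recipient can fix, and that cannot wait until tomorrow ends up waking someone.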

Slide 11

…AND?

Slide 12

documentation, documentation, documentation

Slide 13

SAY NO TO UNDOCUMENTED ALERTS
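
One way to enforce this rule is to reject alert definitions that do not point at documentation. The following is a minimal sketch under assumptions: the `runbook` field, the alert names, and the file paths are hypothetical and do not come from the talk or from any particular monitoring tool.

```python
def undocumented(alerts: list[dict]) -> list[str]:
    """Return the names of alerts with a missing or empty 'runbook' entry."""
    return [
        alert["name"]
        for alert in alerts
        if not str(alert.get("runbook") or "").strip()
    ]


# Hypothetical alert definitions, for illustration only.
ALERTS = [
    {"name": "checkout_error_rate", "runbook": "docs/runbooks/checkout.md"},
    {"name": "disk_full", "runbook": ""},
    {"name": "queue_depth"},
]

missing = undocumented(ALERTS)
```

A check like this could run before an alert is enabled, so that an alert without documentation never reaches whoever is on call.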

Slide 14

DEEP INSTRUMENTATION
top-down approach
1 understand business
2 monitor business
3 correlate data

Slide 15

MONITOR EVERYTHING - ALERT ON WHAT’S IMPORTANT
network latency, conversions, database load, CPU load, cache hit ratio, API responsiveness, revenue, email bounce rate, performance
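
The split between monitoring and alerting can be sketched as a collector that stores every sample but raises an alert only for metrics that have an explicit threshold. This is an illustrative sketch: the `Monitor` class, the metric names as identifiers, and the threshold values are assumptions, not figures from the talk.

```python
from collections import defaultdict


class Monitor:
    """Record every metric; alert only on metrics with a threshold."""

    def __init__(self, alert_thresholds: dict[str, float]):
        self.samples: dict[str, list[float]] = defaultdict(list)
        self.alert_thresholds = dict(alert_thresholds)

    def record(self, metric: str, value: float) -> bool:
        """Store the sample and return True if it should raise an alert."""
        self.samples[metric].append(value)
        limit = self.alert_thresholds.get(metric)
        return limit is not None and value > limit


# Hypothetical choice: only the email bounce rate is alert-worthy here.
monitor = Monitor({"email_bounce_rate": 0.05})
cpu_alert = monitor.record("cpu_load", 0.97)               # stored, no alert
bounce_alert = monitor.record("email_bounce_rate", 0.08)   # stored, alerts
```

Which metrics deserve a threshold is a business decision; the point of the sketch is only that collecting a metric and alerting on it are separate choices.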

Slide 16

CONSTANT INSTRUMENTATION
monitoring is NOT a feature

Slide 17

CONTINUOUS IMPROVEMENT

Slide 18

AVAILABILITY
which one?
availability (determine the need)
(deploys, special events)

Slide 19

BE A GOOD CITIZEN

Slide 20

@papa_fire