Sensory Friendly Monitoring: Keeping the Noise Down

A presentation at CloudOps Summit in July 2020 by Quintessence Anx

Slide 1

SENSORY FRIENDLY MONITORING
Keeping the noise down
QuintessenceAnx

Slide 2

@QuintessenceAnx /@PagerDuty
When we try to know everything…

Slide 3

Too much noise can…
…bury important / high-severity alerts in a sea of low-priority notices
…causing engineering teams to start muting alarms or whole alarm sources
…which in turn means the people who need to be notified won’t be.

Slide 4

Meanwhile, when we turn the dial too far…

Slide 5

Let’s find a happy medium.
All alerts are fictional.

Slide 6

Consider: the cost of noise

Slide 7

Your brain on alerts
Base image credit: Dreamstime

Slide 8

Time cost? ~25 minutes

Slide 9

Quality cost
Source: Mo Selim Art Speed Challenge

Slide 10

Cost of multitasking
Image credit: pngtree

Slide 11

So how to reduce the noise?

Slide 12

Be aware, not overwhelmed
- Determine the sources of noise
- Categorize the types of noise
- Channel the noise into a productive workflow
- Create a routine to clear the clutter

Slide 13

Sources of noise

Slide 14

Wait, I need to be aware of myself?
Absolutely.

Slide 15

How often do you…
…check your email?
…check your social media?
…check your text messages?
…check your Apple / Google messages?
…the list goes on.

Slide 16

Communication & Boundaries
- Plan for set times to focus on your work and mute non-critical alerts
- This includes messages from friends & family
- When setting boundaries, make sure your friends, family, and co-workers know what you consider to be relevant emergencies
- Set reasonable expectations for yourself and others

Slide 17

But what about external sources of noise?

Slide 18

Start categorizing your noise
- False positives
- False negatives
- Fragility
- Frequency (just fix it)
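One way to start categorizing is to tally past events by outcome and by source. The sketch below is only an illustration, not something from the talk: the `Event` fields, the sample data, and the source names are all hypothetical. It covers false positives, false negatives, and frequency; fragility is left out because the slide does not define how to measure it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str         # hypothetical name of the alert or check
    alerted: bool       # did a notification fire?
    real_problem: bool  # was something actually wrong?


def categorize(event):
    """Label one event using the standard confusion-matrix terms."""
    if event.alerted and not event.real_problem:
        return "false positive"
    if event.real_problem and not event.alerted:
        return "false negative"
    if event.alerted and event.real_problem:
        return "true positive"
    return "true negative"


def summarize(events):
    """Count events per category, and notifications per source (frequency)."""
    by_category = Counter(categorize(e) for e in events)
    by_source = Counter(e.source for e in events if e.alerted)
    return by_category, by_source


# Made-up history, for illustration only.
EVENTS = [
    Event("disk-usage", alerted=True, real_problem=False),
    Event("disk-usage", alerted=True, real_problem=False),
    Event("disk-usage", alerted=True, real_problem=True),
    Event("api-latency", alerted=False, real_problem=True),
]

by_category, by_source = summarize(EVENTS)
print(by_category)
print(by_source.most_common(1))  # the noisiest source
```

A source that tops the frequency count and is mostly false positives is a natural first candidate for retuning.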

Slide 19

Save time: create your noise flow
- What needs to be known
- Who needs to know it
- How soon should they know
- How should they be notified
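The four questions of a noise flow can be written down as a small routing table. This is a minimal sketch under assumptions: the severity names, recipients, deadlines, and channels are hypothetical placeholders, not values from the talk or from any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    who: str                               # who needs to know it
    respond_within_minutes: Optional[int]  # how soon (None = no deadline)
    channel: str                           # how they should be notified


# What needs to be known, keyed by a hypothetical severity label.
ROUTES = {
    "critical": Route("on-call engineer", 5, "page"),
    "warning": Route("owning team", 240, "chat"),
    "info": Route("nobody directly", None, "dashboard"),
}


def route(severity):
    """Look up the route; unknown severities go to the team, not the pager."""
    return ROUTES.get(severity, ROUTES["warning"])


print(route("critical"))
print(route("something-new"))
```

Sending unknown severities to the team channel is a design choice made for this sketch: an unclassified alert is not silently dropped, but it also does not wake anyone up.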

Slide 20

Re-Evaluate Redundancy
Know How to Add a Little Complexity to Stop a Vacuum
a.k.a. A bad day in ChatOps

Slide 21

Resilient noise builds trust
- How reliable are your tools and services?
- How much notification duplication is needed?
- Do you have the ability to switch alert endpoints in the event of a service outage?
- Do you regularly evaluate the reliability of your services (external and internal)?
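Switching alert endpoints during an outage can be as simple as trying an ordered list of senders until one succeeds. The sketch below assumes each endpoint is a plain Python callable that raises on failure; the `send_chat` and `send_sms` functions are hypothetical stand-ins, not real client libraries.

```python
def notify_with_fallback(message, endpoints):
    """Try each (name, send) pair in order; return the name that delivered."""
    errors = []
    for name, send in endpoints:
        try:
            send(message)
            return name
        except Exception as exc:  # any failure moves on to the next endpoint
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all endpoints failed: " + "; ".join(errors))


delivered = []  # stands in for a real delivery, for illustration only


def send_chat(message):
    # Hypothetical primary endpoint, simulated as being down.
    raise ConnectionError("chat service unavailable")


def send_sms(message):
    # Hypothetical secondary endpoint that works.
    delivered.append(message)


used = notify_with_fallback(
    "db-primary unreachable",
    [("chat", send_chat), ("sms", send_sms)],
)
print(used, delivered)
```

Because only the first working endpoint is used, this avoids duplicate notifications while still surviving the loss of the primary channel.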

Slide 22

Keep alerts relevant: Sprint Cleaning
For every alert triggered, ask:
- Was the notification needed?
- How was the incident resolved?
- Can the solution be automated?
- Is the solution permanent?
- How urgently was a solution needed?
Photo by James Pond on Unsplash
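The review questions can be captured as a small record per triggered alert, so the answers turn into follow-up work. This is a sketch only: the field names and the mapping from answers to suggested actions are assumptions made for illustration, not rules from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertReview:
    alert: str                 # hypothetical alert name
    notification_needed: bool  # was the notification needed?
    resolution: str            # how was the incident resolved?
    can_automate: bool         # can the solution be automated?
    fix_is_permanent: bool     # is the solution permanent?
    was_urgent: bool           # was a solution needed urgently?


def follow_ups(review):
    """Suggest clean-up actions for one reviewed alert (assumed mapping)."""
    actions = []
    if not review.notification_needed:
        actions.append("remove or retune the alert")
    elif not review.was_urgent:
        actions.append("lower the urgency")
    if review.can_automate:
        actions.append("automate the resolution")
    if not review.fix_is_permanent:
        actions.append("schedule a permanent fix")
    return actions


review = AlertReview(
    alert="disk-usage",
    notification_needed=True,
    resolution="rotated logs by hand",
    can_automate=True,
    fix_is_permanent=False,
    was_urgent=False,
)
print(follow_ups(review))
```

Running this over every alert from the last sprint gives a short, concrete clean-up list for the next one.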

Slide 23

Slides & Additional Resources
Available on Notist
https://noti.st/quintessence

Slide 24

Thank you!
QuintessenceAnx
Developer Advocate 🥑 @
https://noti.st/quintessence

Slide 25

FIN.

Slide 26

SENSORY FRIENDLY MONITORING
Keeping the noise down
QuintessenceAnx

Slide 27

Re-Evaluate Redundancy
Know How to Add a Little Complexity to Stop a Vacuum
a.k.a. A bad day in SlackOps
(Sorry Slack.)

Slide 28

Additional Reading
- “The Cost of Interrupted Work: More Speed and Stress”, Gloria Mark, Dept. of Informatics, UC Irvine
  https://www.ics.uci.edu/~gmark/chi08-mark.pdf
- “Are digital distractions harming labour productivity?”, The Economist
  https://www.economist.com/finance-and-economics/2017/12/07/are-digital-distractions-harming-labour-productivity
- “Brief Interruptions Spawn Errors”, Michigan State University
  https://msutoday.msu.edu/news/2013/brief-interruptions-spawn-errors/
- “Tenets of SRE”, Stephen Thorne, Sr Google SRE
  https://medium.com/@jerub/tenets-of-sre-8af6238ae8a8