How to determine your Elevated Response Period
@QuintessenceAnx
What support is needed
Build or Buy :: Make or Buy
⛔
v0 Architecture
Random Outage Graph
What, When, Where
Let’s Talk a Little About Resiliency Itself
A resilient system is a system that is able to withstand adversity.
Something is resilient if it is able to withstand adversity.
What can this look like?
Organizational Resilience can look like having the appropriate response structure(s) in place for IT systems, services, and users in the event of latency or an outage.
(IT) System Resilience can look like an application not going down, and/or autoscaling, in response to increased traffic.
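As a rough illustration of the autoscaling idea, here is a minimal sketch of a proportional scaling rule. The function name, the load units, and the default limits are all hypothetical and chosen only for this example; a real platform's autoscaler will have its own inputs and safeguards.

```python
import math


def desired_replicas(current_load, target_load_per_replica,
                     min_replicas=1, max_replicas=20):
    """Size the fleet so each replica carries roughly the target load.

    All parameters are illustrative assumptions:
    - current_load: total load right now (for example, requests per second)
    - target_load_per_replica: load one replica should comfortably handle
    - min_replicas / max_replicas: floor and ceiling on the fleet size
    """
    if target_load_per_replica <= 0:
        raise ValueError("target_load_per_replica must be positive")
    wanted = math.ceil(current_load / target_load_per_replica)
    # Clamp to the configured floor and ceiling.
    return max(min_replicas, min(max_replicas, wanted))
```

The ceiling matters as much as the formula: it should reflect capacity that actually exists, so the system never plans around replicas it cannot get.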
Why is this important?
Response and Design
Resilient Response
Resilient Response Checklist
• Define elevated response
• Maximize experienced responders
• Both primary and secondary
• Do not design around resources you do not have
• Minimize responder burnout
• Clear handoff procedures
• Clear ownership
• Dedicated, clear responder roles
• Practiced response process
• Validate responder access to tools and data
• Updated documentation
Define elevated response
Maximize Experienced Responders
Do not design around resources you do not have
Validate access to tools and data
Updated documentation
Resilient Response Checklist
• Define elevated response
• Maximize experienced responders
• Both primary and secondary
• Do not design around resources you do not have
• Minimize responder burnout
• Clear handoff procedures
• Clear ownership
• Dedicated, clear responder roles
• Practiced response process
• Validate responder access to tools and data
• Updated documentation
Resilient Design Checklist
• Build, test, secure with scalability in mind
• Build, test, secure with humans in mind
• Automate as much as is feasible
• Keep documentation updated in pace with releases
• Do not design around resources you do not have
• Clear ownership
• Who owns the service, writes the code, etc.
• Build, test, secure with redundancy and/or failover in mind
• Build, test, secure with operator control in mind
• Build, test, secure with observability in mind
Do not design around resources you do not have
Clear ownership
Resilient Design Checklist
• Build and test with scalability in mind
• Build and test with humans in mind
• Automate as much as is feasible
• Keep documentation updated in pace with releases
• Do not design around resources you do not have
• Clear ownership
• Who owns the service, writes the code, etc.
• Build and test with redundancy and/or failover in mind
• Build and test with security in mind
• Build and test with operator control in mind
• Build and test with observability in mind
Practice with Ice Cream
Understand the Business
Resilient Response: Questions to Ask
• What cannot go wrong?
• What is at risk of going wrong?
• What responses are needed in each situation?
• Who is doing what step(s) in the response process(es)?
• Are we in an Elevated Response Period?
• And are separate considerations for that period defined?
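The last two questions can be made checkable if elevated response periods are written down as data. Below is a minimal sketch under that assumption; the `ElevatedPeriod` structure, its fields, and both helper functions are hypothetical names invented for this example, not part of any particular incident response tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class ElevatedPeriod:
    """A named date range (inclusive) when elevated response applies.

    Hypothetical structure for illustration only.
    """
    name: str
    start: date
    end: date
    extra_considerations: Tuple[str, ...] = ()


def active_periods(today: date, periods: List[ElevatedPeriod]) -> List[ElevatedPeriod]:
    """Answer "Are we in an Elevated Response Period?" for a given day."""
    return [p for p in periods if p.start <= today <= p.end]


def undefined_considerations(periods: List[ElevatedPeriod]) -> List[str]:
    """Names of periods that have no separate considerations defined yet."""
    return [p.name for p in periods if not p.extra_considerations]
```

An empty result from `active_periods` means normal response applies. Any name returned by `undefined_considerations` marks a period that still needs its separate considerations written down before it starts.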
Resilient System: Questions to Ask
• How do we prevent “what cannot go wrong”?
• How do we mitigate risk for “what else can go wrong”?
• How do we support our response process(es)?
• How do we support our responders?
• How does an elevated response period impact our system?
Resiliency is not limited to IT systems and personnel