Don’t Panic! How to launch a large-scale website confidently and successfully
DevOps Tallinn 2018
Photo by SpaceX on Unsplash

Who am I? @ryantownsend Ryan Townsend, CTO

Relaunched May 2017

“Just use auto-scaling and forget about it” Kris Quigley – Lead Developer @ SHIFT (sarcasm)

Timeline: Development → Pre-launch → Launch → Post-launch

• Functional Testing
• Deployment Pipelines
• Configuration & Implementation

Development http://www.spacex.com/media-gallery/detail/149431/9391

Keep Things Simple

Limit Project Scope

New Problem or New Technology

“Almost all the cases where I've heard of a system that was built as a microservice system from scratch, it has ended up in serious trouble.” – Martin Fowler, ThoughtWorks Chief Scientist

Clear Decoupling

Admin Panel API Website

Use Boring Mature Technology

Load Testing

Don’t wait until the end

It’s A LOT harder than people let on

• Use real metrics and logged user behaviour
• Use a wide variety of metrics, not just traffic
• Post-test validate the metrics at source
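As a rough illustration of driving a load test from logged user behaviour, the sketch below derives a request mix from access-log lines. The `METHOD PATH STATUS` log format and the `traffic_mix` helper are assumptions made for this example, not something shown in the talk.

```python
from collections import Counter


def traffic_mix(log_lines):
    """Return the share of requests per path, based on logged traffic.

    Assumes each line looks like "METHOD PATH STATUS" (hypothetical format).
    """
    paths = Counter(line.split()[1] for line in log_lines if line.strip())
    total = sum(paths.values())
    return {path: count / total for path, count in paths.items()}


# Hypothetical sample of logged requests.
sample_log = [
    "GET /products 200",
    "GET /products 200",
    "GET /basket 200",
    "POST /checkout 302",
]
mix = traffic_mix(sample_log)
print(mix)
```

The resulting weights can then be fed into whichever load-testing tool is in use, so the simulated traffic resembles what users actually do rather than a guess.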

Assume user behaviour will change

Stress Test

Web Performance Testing

Remember: it’s not just for you!

Caching

Client CDN Application Database

Write-through caches
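A minimal sketch of a write-through cache, assuming a plain dictionary stands in for the database. The class and key names are hypothetical; the point is only that every write updates the store and the cache together, so reads do not see stale values.

```python
class WriteThroughCache:
    """Every write goes to the backing store and the cache together."""

    def __init__(self, store):
        self.store = store  # hypothetical backing store; a dict stands in for a database
        self.cache = {}

    def write(self, key, value):
        self.store[key] = value  # persist first
        self.cache[key] = value  # then refresh the cached copy

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.store[key]  # cache miss: fall back to the store
        self.cache[key] = value
        return value


database = {"price:42": 100}
products = WriteThroughCache(database)
products.read("price:42")        # warms the cache
products.write("price:42", 120)  # updates store and cache in one step
print(products.read("price:42"))
```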

Start small… low TTLs
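To make the "low TTL" idea concrete, here is a small sketch of a cache whose entries expire after a few seconds. The `TTLCache` class, the five-second TTL, and the injectable clock are all assumptions for illustration; a short TTL limits how long a wrong or stale entry can survive while confidence in the caching is still being built.

```python
import time


class TTLCache:
    """Cache entries for ttl_seconds, then recompute on the next read."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self.entries = {}   # key -> (expires_at, value)

    def get(self, key, compute):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: serve from cache
        value = compute()    # missing or expired: rebuild and store
        self.entries[key] = (now + self.ttl, value)
        return value


# Start small: a hypothetical five-second TTL on a rendered page.
page_cache = TTLCache(ttl_seconds=5)
html = page_cache.get("home", lambda: "<h1>Home</h1>")
print(html)
```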

Front-end – static assets & redirects

Higher hit ratios = less traffic hitting our servers
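A quick worked example of why the hit ratio matters, using illustrative numbers rather than figures from the talk: with one million requests, the traffic reaching the origin is the share that misses the cache.

```python
def origin_requests(total_requests, hit_ratio):
    """Requests that miss the cache and reach the origin servers."""
    return round(total_requests * (1 - hit_ratio))


# Illustrative figures only: one million requests at three hit ratios.
for ratio in (0.90, 0.95, 0.99):
    print(f"{ratio:.0%} hit ratio -> {origin_requests(1_000_000, ratio)} origin requests")
```

Under these assumptions, moving from a 90% to a 99% hit ratio cuts origin traffic by a factor of ten.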

Feature Toggles

[Diagram: a feature toggle. On: requests take the ideal path. Off: requests take the fallback path.]

• Built into your application
• Content Delivery Network
• A/B testing tool
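A sketch of the first option, a toggle built into the application. The `FeatureToggles` class and the `recommendations` toggle name are hypothetical; the idea is that the ideal path runs when the toggle is on and a cheaper fallback runs when it is off.

```python
class FeatureToggles:
    """In-memory toggle store; a real one might read from config or a database."""

    def __init__(self, states):
        self.states = dict(states)

    def is_on(self, name):
        return self.states.get(name, False)  # unknown toggles default to off

    def set(self, name, on):
        self.states[name] = on


def product_page(toggles):
    if toggles.is_on("recommendations"):
        return "product + personalised recommendations"  # ideal path
    return "product only"                                # fallback path


toggles = FeatureToggles({"recommendations": True})
print(product_page(toggles))
toggles.set("recommendations", False)  # shed load without a deploy
print(product_page(toggles))
```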

Circuit Breakers

[Diagram: a circuit breaker. Closed: requests take the ideal path. On error, the breaker opens and requests take the fallback path.]
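The closed/open behaviour can be sketched as below. This is a simplified illustration, not the implementation used in the talk: the thresholds, the `reset_after` timeout, and the injectable clock are assumptions. While closed, calls take the ideal path; after repeated errors the breaker opens and calls go straight to the fallback until the timeout allows a trial call.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, ideal, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()  # open: skip the failing dependency entirely
            # Timeout elapsed: allow one trial call; a single failure re-opens.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = ideal()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip the breaker
            return fallback()
        self.failures = 0  # success closes the circuit fully
        return result


def flaky_reviews():
    raise TimeoutError("reviews service timed out")  # hypothetical failing dependency


breaker = CircuitBreaker(failure_threshold=2, reset_after=30.0)
for _ in range(3):
    print(breaker.call(flaky_reviews, lambda: "reviews unavailable"))
```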

Pre-launch Preparations https://www.flickr.com/photos/spacex/31450835954/

Communication

• Build a trusting relationship with stakeholders
• Understand their metrics
• Get their perspective
• Determine authority

Visibility

• System monitoring – infrastructure & client-side
• Client / stakeholder dashboards & reporting – see what they see
• Customer engagement – social media, customer support
• Instant access to logs – filterable, searchable

The graph above shows how New Relic tracked a third-party script harming site performance while the server side was fine.

Roleplay

• What could go wrong?
• Who would you escalate to?
• How would you solve it?
• What people do you need access to?
• What systems do you need access to?

Traffic Reduction

• Avoid scheduling big campaigns
• Paid advertising is easy to turn off
• Reduce offering

Launch Day https://unsplash.com/photos/yJv97tE7GDM

Scale-up

“Big Bang” vs Canary Release
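One way a canary release can be routed, sketched under the assumption that each user has a stable string ID: hash the ID into a bucket from 0 to 99 and send only the lowest buckets to the new site. The `in_canary` helper and the 10% figure are illustrative, not taken from the talk.

```python
import hashlib


def in_canary(user_id, percent):
    """Deterministically place a user in the canary group.

    The same user always lands in the same bucket, so raising `percent`
    only adds users to the canary and never removes existing ones.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


# Hypothetical rollout: send roughly 10% of users to the new site.
users = [f"user-{n}" for n in range(10_000)]
canary_users = [u for u in users if in_canary(u, 10)]
print(len(canary_users), "of", len(users), "users routed to the new site")
```

A "big bang" launch corresponds to `percent=100` from the start; the canary approach raises the percentage in steps while watching the monitoring.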

Feature Toggles: Off

Keep Calm and Carry On

• Expect issues
• Keep a level head
• Remain professional
• You’re an expert – you’ve got this

Post-launch https://unsplash.com/photos/-p-KCm6xB9I

Continue Building Confidence

• Gather real metrics & usage patterns
• Revisit your load tests and re-assess
• Re-run load tests for future releases
• Ship some safe releases
• Ship small releases, often

Since Launch https://unsplash.com/photos/MEW1f-yu2KI

Optimising Caching

Strong Migrations

Started working towards macro-services (not microservices)

Event Sourcing
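For readers unfamiliar with the term, a minimal sketch of event sourcing: state is not stored directly but rebuilt by replaying an append-only log of events. The basket domain, the `Event` fields, and the `replay` function are hypothetical examples, not details from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str      # "added" or "removed"
    sku: str
    quantity: int


def replay(events):
    """Rebuild the current basket by folding over the event log."""
    basket = {}
    for event in events:
        delta = event.quantity if event.kind == "added" else -event.quantity
        basket[event.sku] = basket.get(event.sku, 0) + delta
        if basket[event.sku] <= 0:
            del basket[event.sku]  # nothing left of this item
    return basket


# Hypothetical append-only log for one basket.
log = [
    Event("added", "tee", 2),
    Event("added", "cap", 1),
    Event("removed", "tee", 1),
]
print(replay(log))      # current state
print(replay(log[:2]))  # state as it was before the removal
```

Because the log is never rewritten, any earlier state can be reconstructed by replaying a prefix of it.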

Static Site Generation

Communication is Paramount

Thank you

@ryantownsend