Back To Front Performance

A presentation at Future of Web Apps London in October 2012 in London, UK by Drew McLellan

Slide 1

Back to Front Performance. FOWA/FOM London 2012 @drewm Good afternoon! There are times when performing too quickly is frowned upon. When sat in front of a computer screen, however, that rule no longer applies. I'm going to talk about website performance. This will be largely technology-agnostic, but it will be technical. There's no designery pencil twirling here. Those in black turtlenecks should feel free to head straight for their Audi TTs now. I'm going to talk for about 40 minutes. I'll try not to be dull. If you finish listening before I've finished talking, just sit tight and I'll soon catch up with you.

Slide 2

Slide 3

I will not be discussing the design of the iPod, or the infinite complexities of choosing jam. No, I'm talking about something far and away less interesting than jam. Granted, "less interesting than jam" isn't the most enticing pitch for your time, but that's good, because you're already thinking about how best to use your time. That's exactly what performance is all about. So this afternoon I'm talking about something that your users don't see. I know I said there'd be no designery pencil twirling, but this is as bad as it gets. Web performance is about how your site feels. This is about something that underpins all your efforts with design and user experience - how quickly your site responds.

Slide 4

I don't know if you drive. It would take too long to find out from you each individually. Driving in rush hour traffic in an unfamiliar town is stressful, as the result of a wrong turn could be tens of minutes queuing to correct your route. The cost of mistakes is high when it takes time to correct them. That puts pressure on the decisions you make. This is stressful, and not enjoyable. When there's no traffic, the cost of mistakes is removed, and we're free to discover the joy of exploring somewhere new. The entire experience is defined by the ease with which we're able to move from point A to B.

Slide 5

That manifests in the effectiveness of the design, in the information architecture, in the search technology, but underpinning all of that is the core response speed of the site. No matter how great your site is, if it's slow to respond, each choice the user makes has the potential to be a mistake. The entire user experience is undermined and the user is left feeling stressed.

Slide 6

Performance is a feature. Performance isn't a problem to detect and fix, it's a feature to build in. It’s important to your users. It’s important to Google. It’s important to the success of your marketing. Performance benefits:

  • put off server investment
  • use less bandwidth
  • faster = better page rank in Google
  • faster = more time on site

That's what I'm talking about this afternoon.

Slide 7

This is Steve Souders. He's a great guy. He writes and talks a lot about website performance, and everything he says is valid. I have a lot of respect for him. Steve works at Google, and before that he was in charge of performance at Yahoo. When researching how to increase site performance, he and his colleagues found that most gains were to be found by optimising the front end. Most of the slowness was in the browser. Most of what Steve writes and talks about is optimising your front end code. That's all absolutely valid stuff, and we'll be looking at some of the techniques this afternoon. But this leads to a problem. If you're an engineer at Google or Yahoo, you're well aware of the constraints of operating at scale. They operate massive server farms, and a lot of the sites are already largely static. They're optimised for good performance out of the gate, because at that scale, it's one of your major constraining factors as an engineer. When looking at sites like that, it's true that most of the work left to do is in the browser. So it's easy to misunderstand, and walk away with the impression that the only place performance matters is in the browser.

Slide 8

The Steve Souders Problem. The same is not true for your shared hosting account running WordPress. It's not true for your VPS running your web app, unless you've planned for it. Front end performance is flat. Take out any backend performance and network issues, and the time it takes to render a page is not affected by the number of people also rendering that same page at the same time. Backend performance changes dramatically under load. Big organisations like Yahoo and Google have large enough infrastructure that their backend performance doesn't change under load. This creates the perception that front end performance is all that matters. That's not the case for you. You want to make sure that your site continues to perform acceptably under load, and the way to do that is to make sure it's super fast when not under load.

Slide 9

Slide 10

256MB 10 MB <25

Slide 11

Performance under load depends on the time taken to process each request. The faster you deal with a request, the sooner the page gets sent to the browser, and the server is freed up for the next request. Unless your server is free to respond, none of your front end optimisation matters at all.
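The arithmetic behind this is simple. As a toy sketch (the numbers are illustrative assumptions, not figures from the talk):

```javascript
// Toy capacity model: how per-request time caps throughput under load.
// All numbers here are illustrative assumptions, not figures from the talk.
function maxRequestsPerSecond(workers, msPerRequest) {
  // Each worker can serve 1000/msPerRequest requests per second.
  return (workers * 1000) / msPerRequest;
}

// 25 worker processes at 200ms per request can sustain 125 requests/second.
// Cut the request time to 100ms and the same hardware sustains 250.
const capacity = maxRequestsPerSecond(25, 200);
```

Halving the time per request doubles capacity without touching the hardware, which is why backend speed matters before front end speed.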

Slide 12

Slide 13

The Back End. Let’s start at the back end.

Slide 14

Hosting. It all starts with hosting.

Slide 15

Cheap hosting is expensive. It frequently surprises me how little some designers and developers appear to care about the quality of their hosting. They'll spend days, weeks, months crafting a site and then launch it onto $3 per month crappy shared hosting. It should go without saying that if you're paying $3 per month for hosting, that hosting is going to be over-sold. Putting networked hardware in data centres, keeping it cooled, powered and staffed costs quite a lot of money. Simple economics dictate that if you're not paying very much money for that service, then the hosting company are going to have to make it up on volume. That means lots of customers per server – probably more customers per server than will be acceptable if you care about the response time of your website.

Slide 16

Shared hosting is the worst. A reasonable rule of thumb is that shared hosting will not be fast. If you care about speed you need to think about a virtualised server (VPS-style, cloud or traditional) which has CPU and RAM resources reserved for it, not in contention with other customers. If you want more grunt, a dedicated server is a good option. Hosting packages should be constrained by the most expensive resource. That's usually CPU, followed by RAM. The constraining resource should not be disk space. If a package limits harshly on disk space, it should be clear that their priority is packing accounts in, not providing good quality of service.

Slide 17

Consider what your project cost to build when buying hosting. The quality of your hosting should be commensurate with the quality of service you wish to provide to your customers. Low grade hosting will result in a low-grade, or at least inconsistent, experience for your customers. Unpredictable slowness, resource scarcity, and lack of access to truly technical support are all reasons to keep away from the cheapest options. Nothing you do to optimise your app or the front end of your site will be able to make up for slow servers.

Slide 18

Use the best hosting you can afford, not the cheapest you can get away with. So use the best hosting that you can afford, not the cheapest you can get away with. Remember that fancy marketing doesn't always mean good hosting, so ask for recommendations from people who have similar requirements to your own. Look for recommendations from those who have been happy with their hosting provider for 5 years or more. It's easy to run great hosting for a short while, but really difficult to keep that going long term. A good VPS or a dedicated server will offer you the most options for reconfiguration, too. This is particularly useful when it comes to optimizing for performance, as we'll come on to.

Slide 19

Hot Girls Love Hosting! It should also go without saying that if the only way a company can get you to buy their hosting is to use photos of scantily-clad ladies, it’s probably not very good hosting. Even if it is good hosting, I wouldn’t encourage you to give them your money.

Slide 20

Let's talk a bit about the application layer. This is the part of the system where your code is run, where logic is performed, where data is manipulated, stored and retrieved. It's the stuff you write in Python, Ruby, Java, PHP. It's your framework code and your CMS. It's this layer that takes the time, and determines how long a request needs to process. If you're performance minded, this is the place to make sure that your code takes as little time and effort to run as possible. You need to consider the cost of all external resources you consume. Going to the database takes time. Accessing external APIs takes a lot of time. Everything you do has a cost - usually a necessary one - but sometimes there are places where those costs can be reduced or even removed. Let's look at how that can be done.

Slide 21

We sometimes hear of the DRY coding principle. Don't Repeat Yourself (or Single Source of Truth) encourages programmers to avoid repetition in their software, making sure that everything within a system has only a single representation in code. It encourages modularity and makes maintenance simpler. There's EVERY reason the same principle should also apply to THE WORK carried out by your app. Don't do the same unit of work twice if you expect the outcome to be the same. Do it once, cache the result, and reuse it. Only do the work again when the result is likely to be different, e.g. a different user, or significant time has passed.
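The idea can be sketched in a few lines. This is a minimal illustration (the key names and the `expensiveCount` stand-in are made up for the example):

```javascript
// Minimal sketch of "do the work once": cache a result keyed by its inputs,
// and only recompute when the inputs (and therefore the outcome) could differ.
const cache = new Map();

function cachedResult(key, compute) {
  if (!cache.has(key)) {
    cache.set(key, compute()); // do the work once
  }
  return cache.get(key);       // reuse it thereafter
}

// expensiveCount stands in for any costly unit of work (a query, an API call).
let calls = 0;
const expensiveCount = () => { calls += 1; return 42; };

cachedResult("posts:userA", expensiveCount); // computes
cachedResult("posts:userA", expensiveCount); // served from the cache
// A different user is a different key, so the work is rightly done again:
cachedResult("posts:userB", expensiveCount);
```

Expiring entries after "significant time has passed" is the other half of the rule, and we come to that shortly.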

Slide 22

Let’s look at an example. Say you had an aircraft carrier... with a cow on it. and all you have to feed the cow is some corn... and a chicken. No, that’s a bad example.

Slide 23

Here’s a better example. This is the blog section of a site we worked on for Greenbelt Festival earlier this year. As well as the main listing of posts, there are quite a few sidebar boxes which appear repeatedly throughout the section.

Slide 24

SELECT c.catTitle, COUNT(c2p.catID) as qty
FROM tblBlogCategories c, tblBlogPost_Categories c2p
WHERE c.catID = c2p.catID
GROUP BY c.catID
ORDER BY qty DESC

One of those is the list of post categories, which features a count of the number of posts in each category. It's not massively expensive to run - here's the SQL - but it does involve a temporary table, and will get slower as more content is added over time. But the key question is this: how often does that list change? How often does a new category get added, or a post to a category? Is it likely to change from one page request to the next? The answer, of course, is that it rarely changes. The post count almost NEVER changes compared with the number of page views. Yet we're doing the work to calculate it every single time.

Slide 25

Be smart. So what's the smart way to do this? You have to ask: when does the data change?

  a) When a category is added or deleted
  b) When a post is assigned to a category

Both those events are completely within the control of our app. We don't need to count the number of posts when the page loads, we can do it when a category or post is edited, and just store the results. In database design terms, we call this denormalisation. We're allowing the database to hold some repeated data - less purity, but better performance. It enables us to run a simpler query that runs twice as fast immediately, and shouldn't slow down as more posts are added.
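In code, the write-time update looks something like this. A toy sketch, with plain objects standing in for the database tables (the category names and structure are invented for the example):

```javascript
// Sketch of the denormalisation idea: keep a stored postCount per category
// and update it at write time, instead of COUNTing at every page view.
// These objects are stand-ins for database tables.
const categories = { news: { postCount: 0 }, reviews: { postCount: 0 } };

function assignPostToCategory(postId, catName) {
  // ...write the post-to-category link here...
  categories[catName].postCount += 1; // do the counting work once, on write
}

assignPostToCategory(1, "news");
assignPostToCategory(2, "news");

// Reading the count is now a cheap lookup, not a GROUP BY query:
const newsCount = categories.news.postCount; // 2
```

Deleting a post or unassigning a category would decrement the stored count in the same way, so the figure stays correct without ever being recounted.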

Slide 26

  1. Run (faster) query
  2. Process through template
  3. Output to page

Fetch from cache. Our execution plan now looks a bit like this. But there's a further optimization that can be made here. We now know that the result of the query, albeit faster, is not going to change very often. So why do we keep running it? These first two steps are being run over and over again, doing the same work when we already know the result. What we should do instead, is run the query once an hour or so, save the templated result to cache, and then just display that for the next 60 minutes. Now it's true that the information could change at some point within that time frame. It could be that a category post count is off by one for up to an hour.
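The hourly cache can be sketched as a timestamped entry (the function and variable names are made up for the illustration):

```javascript
// Sketch of the hourly cache step: store the templated HTML with a timestamp
// and only rebuild once the entry is older than the TTL.
const TTL_MS = 60 * 60 * 1000; // one hour
let entry = null;              // { html, builtAt }

function categoryBoxHtml(buildFn, now = Date.now()) {
  if (!entry || now - entry.builtAt > TTL_MS) {
    entry = { html: buildFn(), builtAt: now }; // run query + template once
  }
  return entry.html;                           // serve cached HTML otherwise
}

let builds = 0;
const build = () => { builds += 1; return "<ul>...</ul>"; };

categoryBoxHtml(build, 0);              // first request: builds
categoryBoxHtml(build, 5 * 60 * 1000);  // 5 minutes later: cached
categoryBoxHtml(build, 61 * 60 * 1000); // 61 minutes later: rebuilt
```

In a real app the cache would live in something shared like memcached rather than a process variable, but the shape of the check is the same.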

Slide 27

How real-time is this data? So ask yourself how critical it is that the data you're displaying is up-to-date. Is anyone using it as the basis of important decisions? Or does it matter if it lags a little? For category counts, I'm not sure it matters at all. The purpose of the count is to give a general feel of the 'popularity' of a category. The specific number doesn't matter, and can certainly be wrong for an hour. So let's cache the hell out of it. Remember, avoid doing work over again unless you have a reasonable expectation that the result could be different.

Slide 28

Do the work once. This principle applies to all sorts of things throughout an app. It's obvious when you think about it, but far too many developers either don't think about it or are scared of being accused of premature optimisation.

Slide 29

Premature optimization is the root of all evil. KNUTH. "Premature optimisation is the root of all evil" (Donald E. Knuth) has caught on because it's dramatic and nerds can put it on a t-shirt. The full quote gives more context:

Slide 30

There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered.
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Slide 31

Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. Remember, that quote is originally from 1974. In 1974 computers were slow and computing power was expensive, which gave some developers a tendency to overoptimize, line-by-line. I think that's what Knuth was pushing against. He wasn't saying "don't worry about performance at all", because in 1974 that would just be crazy talk. Knuth was explaining how to optimize; in short, one should focus only on the bottlenecks, and before you do that you must perform measurements to find the bottlenecks.

Slide 32

The fear of premature optimization is the root of all evil. Premature optimisation is not to be confused with early optimisation. It is not an excuse to never consider the cost of resources being used. There is never an excuse for that. "Premature" means that the optimisation is done before a problem is identified using proper measurements. EXPERIENCE helps us identify problems BEFORE they occur. If you know something is going to be a bottleneck, then the optimisation isn't premature - it's just sensible. e.g. you know browsers limit requests per domain, so if you know you'll be using a lot of assets, architecting a system that can simply layer itself across multiple domains is not premature. Failing to consider resource cost will result in code that then needs to be optimised. Retro-optimising is time consuming and never as effective as designing systems around performance. Micro-optimisations (e.g. which type of loop?) are the evil kind. It's not evil to consider e.g. how many requests to a web service API you're making. To defend yourself against accusations of premature optimisation, you practically have to actively choose the worst way to do everything all the time.

Slide 33

Some modern frameworks focus on a concept of convention over configuration. The idea is that instead of configuring your settings, you can follow some basic conventions that mean that no configuration is required. It aims to reduce the number of decisions a developer needs to make, resulting in applications that are easier to build. Principal examples include database conventions for table and column names - instead of configuring table names, the framework can figure it out based on the name of your models. Similarly, your models, views and controllers can all be automatically associated by putting the files in the right place and following the convention for how they are named.

Slide 34

Let's make programming fun! The concept is that instead of having the programmer go to the effort of specifying a setting, you can have the framework figure it out for itself. This is supposed to make development quicker and more enjoyable by having the software take care of small tasks like figuring out where to find things in the database and file system. The trouble is, if the framework designer is not careful, this can lead to very non-DRY code. Commonly, the work to figure out the result of the convention is done at runtime, for every request. The code is constantly repeating itself, figuring out the same thing over and over again, when the answer is known and won't change. Rather than having the programmer work out what the result of the convention is and store it once in a configuration file, the framework calculates the result over and over and over again, burning CPU cycles, and wasting everyone's time. Is the result going to change from one request to the next? If not, stop figuring it out and just configure it. Does the type of database change from one request to the next? Do your table names change? No. And if they do, it's not going to happen between requests. You'll know and can update the configuration.

Slide 35

Insanity is doing the same thing over and over again and expecting different results. Albert Einstein said that insanity is doing the same thing over again and expecting different results. Convention over configuration fits this description perfectly. From a performance point of view, it's insanity.

Slide 36

The Network. Here’s some tips on how to optimize when considering the network layer of your app.

Slide 37

www.varnish-cache.org I've become a massive fan of Varnish of late. It's an HTTP cache (or reverse proxy) that sits on port 80 in front of your web server. If the web server's response is cachable, it keeps a copy in memory (by default) and serves it up the next time that same page is requested. Done right, it can dramatically reduce the number of requests hitting your backend web server, whilst serving precompiled pages super-fast from memory. Good use of Varnish can make your site much faster, however, it is no silver bullet. The caveat "if the web server's response is cachable" turns out to be a very important one. You really need to design your site from the ground up to use a front end cache in order to make the best use of it. As soon as you've identified the user with a cookie (including something like a PHP session, which of course uses cookies) then the request will hit your backend web server. Unless configured otherwise that would include things like Google Analytics cookies, which of course, would be every request from any JavaScript-enabled browser. If you serve static assets (images, CSS, JavaScript) from the same domain, by default the cache will be blown on those, too, as soon as a cookie is set. So you have to design for that. So while Varnish will help to take the load and shorten response times on common pages like your site's front page, you can't rely on it as an end-all solution for speeding up a slow site. If your backend app is slow, your site will still be slow for a lot of requests. It's a bit like putting WP Super Cache on a WordPress site. It will mask the underlying issue to an extent, but it won't solve the underlying problem.
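To give a flavour, a commonly used VCL snippet strips cookies that don't affect the server's response, so those requests stay cachable. This is a sketch only; the cookie pattern assumes Google Analytics-style `__utm` cookies and would need adjusting to your own setup:

```
sub vcl_recv {
    # Remove Google Analytics cookies (__utma, __utmb, ...) so requests
    # carrying only tracking cookies can still be served from the cache.
    # Adjust the pattern to match the cookies your backend actually ignores.
    set req.http.Cookie = regsuball(req.http.Cookie,
        "(^|; ) *__utm[a-z]+=[^;]+;? *", "\1");
    if (req.http.Cookie == "") {
        unset req.http.Cookie;
    }
}
```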

Slide 38

Divide resources across domain names. You may or may not be aware that a web browser will limit the number of simultaneous connections it makes to a server based on domain name. The limit has been increased in recent browsers, but for older browsers, that limit was often 2 connections.

Slide 39

s1.example.com s1.example.com s2.example.com s2.example.com s3.example.com s3.example.com s4.example.com s4.example.com So if your page requests, say, lots of images, the browser will request two, wait for those to finish loading, then request the next two, and so on. This limit is designed to protect both the user's network connection from saturation, and the web server from over-demand. However, it was set at a time when the user's connection would have been dial-up and web servers were made from cheese. Faster connections and more capacious servers have made this an artificial limitation in the modern age. But it can be easily subverted. By spreading your requests across multiple sub-domains, you can get around the limitation, even for those older browsers. Of course, each extra subdomain requires a DNS lookup, so the sweet spot tends to be somewhere around 4-5 different domains, but no more.
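One wrinkle worth handling in code: the shard chosen for an asset should be deterministic, so the same file always comes from the same host and browser caching isn't defeated. A sketch (the s1–s4.example.com hostnames echo the slide and are illustrative):

```javascript
// Sketch of asset sharding: pick a subdomain per asset *deterministically*,
// so a given file always maps to the same host and stays cached.
const SHARDS = ["s1.example.com", "s2.example.com", "s3.example.com", "s4.example.com"];

function shardFor(path) {
  // Tiny stable hash of the path; any deterministic hash will do.
  let hash = 0;
  for (const ch of path) hash = (hash * 31 + ch.charCodeAt(0)) % 997;
  return SHARDS[hash % SHARDS.length];
}

function assetUrl(path) {
  return "http://" + shardFor(path) + path;
}
// Every page computes the same host for "/img/logo.png",
// so the browser fetches (and caches) it from one shard only.
```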

Slide 40

It's easy to forget that every cookie you set adds weight to all the following HTTP requests the user makes. We think of cookies as small text files stored in the user's browser - but that's a distraction. More practically, it's a string of text sent back and forth with every request, making every request heavier. If you're not careful with how many cookies you set, they can begin to become a performance issue. The other consideration is, as we've seen, cookies bust server side caches like Varnish. If the cookies you set don't affect the response from the server - like those cookies only used by client-side JavaScript - remember to configure your server-side cache to ignore them.

Slide 41

One of the best things you can do to improve the performance of your site is to get rid of some traffic. A content delivery network or CDN is essentially a service that provides a network of servers onto which you can put your static assets, for example. The big advantage of a CDN is that the servers are geographically distributed, often around the globe, and the network configured to serve files from the node closest to the user requesting the files. This means that instead of, for example, a user in the States needing to load images from your server in the UK, they could be getting a copy from servers in New York, or San Francisco. The physical network latency of that request is lower, so the user gets the file faster. The other advantage, of course, is that if the user is getting the file from the CDN, the request is being handled there instead of on your server. This frees your hardware up for serving more pages. Some of the big CDNs like Akamai are enormous and enterprisey, but there are more affordable solutions with things like Amazon CloudFront. The tactic of shifting your traffic out to other places is a good one, provided the service you use is reliable. Just 10,000 subscribers polling an RSS feed once an hour results in a quarter of a million requests a day - traffic which could perhaps be shifted over to FeedBurner, for example.

Slide 42

There’s not much you can do about a user needing to download all the assets your page links to, but you can lessen the impact on subsequent page views by making sure that those assets get cached. This is where having good control over your server is a big benefit - you can configure the HTTP response headers to suit your needs.

Slide 43

ExpiresActive on
ExpiresByType image/jpg "access plus 7 days"
ExpiresByType image/gif "access plus 7 days"
ExpiresByType image/jpeg "access plus 7 days"
ExpiresByType image/png "access plus 7 days"
ExpiresByType text/css "access plus 7 days"
ExpiresByType application/javascript "access plus 7 days"
ExpiresDefault "access plus 10 years"

By setting Expires headers on key resources, you can instruct the browser to cache the asset for a set time. This example is saying files of these types should be cached for 7 days. But that could be 14 days, a month, two months. If your application makes provision for the renaming of assets when they're updated, you can set a far-future Expires header for several years ahead, hopefully meaning that the asset will only be downloaded once per visitor. When the asset changes, give it a new name, and it will be redownloaded.

Slide 44

In a similar way, if you have control over your server configuration, you can make sure that content is GZIPed as it’s sent. Most browsers are capable of receiving zipped content, and as zipping works really well for text like HTML, CSS and JavaScript, you can send far smaller files which transfer faster.

Slide 45

AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css text/javascript application/javascript

www.whatsmyip.org/http-compression-test/

Configuring Apache's mod_deflate, for example, is really easy to do, and can have immediate benefit. There shouldn't be any need to zip images, as these should already be compressed. Attempting to zip them usually only wastes a bit of CPU. Sites like whatsmyip.org will do a quick test against your server, and not only tell you if gzip is working, but what saving you made.

Slide 46

The Front End. Thanks to the good work of people like Steve Souders, a lot has been written about front end performance. I won’t rehash it here, but rather highlight what I think are the most important points, and show you where to look for the rest.

Slide 47

Probably the biggest single thing you can do to improve the performance of your page when it comes to JavaScript is to move your JavaScript to the bottom of the page. Because JavaScript can include statements that manipulate the page, and because scripts need to run in a predictable order the programmer can control, when a JavaScript file is discovered, all other processing is stopped until the script is downloaded and executed. This means JavaScript effectively blocks the loading of the rest of your page. If that happens in the <head> section, the block will occur before any content has been rendered, and the user will be left looking at a blank page, which is something of a buzz-kill. Plus, most JavaScript doesn't actually do anything until the onload or ondomready event has been fired, so loading it up-front is a waste. If you shift your scripts to right before the closing </body> tag, the user at least has a page to begin reading and interacting with while your scripts load in. The /perceived/ speed overall is much faster, even if the total time to load is the same. When it comes to front-end performance, it turns out perception of speed is almost the most important thing.
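In outline, the page structure looks like this (a generic sketch; the file names are placeholders):

```html
<!DOCTYPE html>
<html>
<head>
  <title>Page</title>
  <link rel="stylesheet" href="/css/site.css">
  <!-- No <script> here: nothing blocks the first render. -->
</head>
<body>
  <p>Content renders and is readable immediately...</p>

  <!-- Scripts load last, after the content is usable. -->
  <script src="/js/app.js"></script>
</body>
</html>
```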

Slide 48

github.com/ded/script.js stevesouders.com/controljs/ developers.google.com/loader/ My next suggestion is to use a JavaScript loader. This is a script which does two main things. Firstly, it loads your JavaScript asynchronously, avoiding blocking problems. Secondly, it helps you manage script dependencies, making sure, for example, that jQuery gets loaded before a jQuery plugin, which gets loaded before your code which uses the plugin. And if the plugin fails to load, it won't try to run your script, because it knows that will fail too. Good examples are Script.js by Dustin Diaz, ControlJS by Steve Souders, Google Loader, and countless others.
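The dependency-management half of the job can be sketched as a toy: given which scripts each file depends on, compute a safe order to run them in. This illustrates the idea only; it is not the API of any of the loaders above:

```javascript
// Toy dependency resolver: returns an order in which scripts can run so that
// every script comes after the scripts it depends on.
function loadOrder(deps) {
  const order = [];
  const seen = new Set();
  function visit(name) {
    if (seen.has(name)) return;
    seen.add(name);
    (deps[name] || []).forEach(visit); // dependencies first
    order.push(name);
  }
  Object.keys(deps).forEach(visit);
  return order;
}

// jQuery before the plugin, the plugin before our own code:
const order = loadOrder({
  "app.js": ["plugin.js"],
  "plugin.js": ["jquery.js"],
  "jquery.js": [],
});
// → ["jquery.js", "plugin.js", "app.js"] for this graph
```

Real loaders do this while also fetching the files asynchronously and skipping dependents when a fetch fails.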

Slide 49

You can't talk about front end performance without mentioning Google PageSpeed and Yahoo's YSlow. These are both browser extensions which analyse page performance, grade it, and offer ways to improve.

Slide 50

PageSpeed at the top, running in Chrome; YSlow at the bottom, running in Firefox. These are both invaluable tools, because they analyze your site and make specific recommendations based on the findings. You can address each one in turn, re-run it and see your score improve.

Slide 51

But you need to be careful with some of the recommendations. You can't follow them blindly - especially if you're practicing Responsive Web Design. For responsive sites, some of the 'rules' just don't apply in the same way. For example, if you look again at that PageSpeed screenshot you'll see that the High Priority recommendation is to serve images that are already scaled to the right size for the page. It's complaining that my images are being rescaled on the page. That's because the site in question is responsive, and so yes, I'm deliberately scaling images down a bit for some window widths. One other related complaint is that I'm not always specifying image dimensions. That's right - it's part of the same issue. There's also a number of graphics that are scaled for hiDPI screens like the retina display. PageSpeed sees those as needing optimisation. Both these tools also only look at one page, when some optimisations are designed on the assumption that a user will be visiting more than one page - which is common with a web app. The recommendation to 'inline' a small CSS file makes sense for this one page in isolation, but no longer makes sense once that CSS file is reused across multiple pages. So use these tools - they're great. But remember that a perfect score or grade isn't necessarily in your interests.

Slide 52

  • decknetwork.net
  • 24ways.org/201106

My last recommendation isn't something you'll find mentioned by PageSpeed or YSlow, but is nevertheless a practical issue. If your site loads in scripts from external sources, such as ads from a network, social media sharing widgets or embedded videos from a video sharing site, make sure to load those last. Certainly right at the bottom of your page, and ideally after all your own script has run. If they need to be inserted at points higher up the page, insert them at the bottom and then move them up with DOM manipulation. You can't trust those external sources to be responsive and load quickly. If you accept their default code in the middle of your page, it will block loading, and if their network is slow, it will make your site slow. External ads and widgets are the quickest route to slowness, so put them right at the end to make sure your page has loaded and is usable before they kick in.

Slide 53

Steve Souders + books: stevesouders.com
Yahoo! Exceptional Performance: developer.yahoo.com/performance/

Slide 54

The Back End.

Slide 55

The Front End.

Slide 56

The End. Drew McLellan allinthehead.com / @drewm Thanks, FOWA!