Site performance optimizations: a look back at Election 2012
By: Peter Keung | November 20, 2012 | eZ Publish development tips, Site performance, and Case study
In 2008, four years ago, we first met what is now one of our longest-standing clients, Rasmussen Reports. There was some stress and urgency to their problem: their site was crashing leading up to the most important time of the year for them -- the US presidential elections. We managed to stabilize their site and imagined a time in the distant future: a less stressful 2012 election period! The month leading up to November 6, 2012 turned out to be a record-breaking traffic month for a couple of our clients, including Rasmussen Reports. This time around, we had no website hiccups during an election season that saw more than twice as many visits in the peak month and an almost 3-fold spike in pageviews over the previous major election's 1-day peak.
Here are some of the site performance best practices we implemented in advance of the 2012 elections.
Of course, technology has advanced over these past few years. But we've also added much more content and more features, seen increased registered user numbers and activity, and experienced a dramatic increase in traffic. Adding more hardware and upgrading to the latest versions of everything can only go so far, never mind that you're often limited by budget.
Send fewer requests to the content management system
eZ Publish is the content management system that we use to edit, store, and generate pages. It excels in many areas, but solutions dedicated to serving already-generated pages are much better suited for large traffic spikes. On the peak traffic day on the eve of the election, we served over 2.6 million pageviews, at times approaching 150 pageviews per second, on only 2 front-end web servers running Varnish in front of eZ Publish. The server resource usage stayed in a very healthy range all day. Varnish is a reverse proxy whose basic function is to serve an entire HTML page if it's in the cache; if not, it passes the request through to eZ Publish. Using a content delivery network such as Akamai (which we use for another client) is a similar concept, although that also includes some extra features.
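The core pass-through logic can be sketched in VCL (Varnish 2.x-era syntax). This is a minimal illustration rather than our production configuration, and the "is_logged_in" cookie name is a hypothetical stand-in for whatever your application sets:

```vcl
sub vcl_recv {
    # Logged-in users need dynamic pages, so pass them straight to eZ Publish
    # ("is_logged_in" is a placeholder cookie name)
    if (req.http.Cookie ~ "is_logged_in") {
        return (pass);
    }
    # For everyone else, drop cookies so the fully rendered page
    # can be served from (and stored in) the cache
    unset req.http.Cookie;
    return (lookup);
}
```

In practice the real configuration also needs rules for cache invalidation (purging) when editors publish content.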
There are also ways to have repeat visitors request fewer assets from the server. For example, story images often do not change, and in eZ Publish, a given image URL remains unique to that image (because updating the image will give it a new URL). In such cases, we add an "Expires" header for far into the future for content images so that a regular visitor to the site only requests each image once.
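In Apache, this can be done with mod_expires; the URL pattern below is an assumption based on eZ Publish's default storage layout, so adjust it to your setup:

```apache
<IfModule mod_expires.c>
    ExpiresActive On
    # eZ Publish content images typically live under var/<site>/storage/images/
    <LocationMatch "^/var/[^/]+/storage/images/">
        ExpiresDefault "access plus 1 year"
    </LocationMatch>
</IfModule>
```

Because updating an image in eZ Publish produces a new URL, a far-future expiry is safe: visitors never see a stale image, they simply stop re-requesting unchanged ones.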
Make the content management system more efficient at generating pages
You can't solve everything by shielding the content management system: you still need to generate dynamic pages, not just to update content but also to serve the dynamic needs of logged-in users.
In eZ Publish, efficiency efforts often center around caching, and there are many layers of it. Mastering the concepts of view cache and template block cache will help you to build cache files and cache relations so that only the exact pages and page areas are cleared, and only when needed.
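As an illustration of a template block cache, an expensive sidebar can be cached per node and tied to a subtree so that it is cleared only when relevant content changes. The keys and subtree path here are placeholders, not a drop-in snippet:

```
{* Cache this block per node; clear it only when content under *}
{* the 'news' subtree is published (path is a placeholder) *}
{cache-block keys=array( 'sidebar', $module_result.node_id ) subtree_expiry='news'}
    ... expensive sidebar fetches and rendering ...
{/cache-block}
```

Choosing good cache keys and expiry relations is what keeps cache clears surgical instead of site-wide.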
Beyond caching, you want to make sure that your server setup is optimal, and that includes using the eZ Distributed File System (eZDFS) file handler. This is the best-practice approach for having multiple servers share the same data and binary assets. You can go a bit deeper in optimizing an eZDFS setup, such as by serving the images directly from each web server.
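A typical eZDFS configuration lives in settings/override/file.ini.append.php and looks roughly like the sketch below; the hostnames, paths, and credentials are placeholders:

```php
<?php /* #?ini charset="utf-8"?

[ClusteringSettings]
FileHandler=eZDFSFileHandler

[eZDFSClusteringSettings]
# Shared mount point (e.g. NFS) for binary files -- placeholder path
MountPointPath=/mnt/nfs
# Metadata database used to coordinate the cluster -- placeholder credentials
DBBackend=eZDFSFileHandlerMySQLBackend
DBHost=cluster-db.example.com
DBName=ezdfs
DBUser=ezdfs
DBPassword=secret

*/ ?>
```

The metadata database tells every web server which cached and binary files are current, while the shared mount holds the files themselves.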
There are some more obscure tips, and one of our favourites is a setting that needs to be enabled in eZ Publish's config.php file:
define( 'EZP_INI_FILEMTIME_CHECK', false );
By default, on every page load, eZ Publish checks the modified file time of INI files to see whether the INI file cache needs to be updated. You can turn off this check to force it to always use the cached files.
Use existing server resources more efficiently
eZ Publish always runs within the context of the server stack, making use of Apache, MySQL, and PHP.
In Apache, turn on the KeepAlive setting so that the same connections can be re-used to serve many assets in the same page load. Also, tune the MaxClients setting so that the number of connections to the server cannot exceed what the server is capable of.
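In httpd.conf, that combination looks roughly like this; the numbers are illustrative only, since MaxClients in particular must be sized to your RAM and per-process memory footprint:

```apache
# Re-use connections for multiple requests within a page load
KeepAlive On
# Keep the timeout short so idle connections don't tie up workers
KeepAliveTimeout 3
MaxKeepAliveRequests 100
# Cap concurrent workers below what the server's memory can sustain
MaxClients 150
```

A rough sizing rule is available RAM divided by the average Apache process size, leaving headroom for MySQL and the OS.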
In PHP, make sure you are using a PHP accelerator such as APC.
In MySQL, the most important thing is often to optimize memory usage. The mysqltuner tool is invaluable at providing suggestions for exact settings to tune.
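The bulk of MySQL memory tuning happens in my.cnf. The values below are illustrative, not recommendations -- let mysqltuner's output on your own workload guide the actual numbers:

```ini
[mysqld]
# Often the single most important setting for InnoDB: size to fit the working set
innodb_buffer_pool_size = 2G
# MyISAM index cache, if you still have MyISAM tables
key_buffer_size = 256M
# Caches identical SELECT results (a MySQL 5.x-era feature)
query_cache_size = 64M
```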
Send less data without sending less content to visitors
Compressing the files that you send to visitors decreases the outgoing bandwidth and helps visitors move on to the next page request sooner!
You can also configure Apache to gzip your data via the mod_deflate module, compressing your HTML by as much as 75%.
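A minimal mod_deflate setup compresses only text-based responses, since images and other binary formats are already compressed:

```apache
<IfModule mod_deflate.c>
    # Compress HTML, CSS, and JavaScript on the fly
    AddOutputFilterByType DEFLATE text/html text/plain text/css application/javascript application/x-javascript
</IfModule>
```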
Similarly, using image sprites groups many of your "design" images into one image of a smaller total size, which has the added advantage of decreasing page rendering time. Also, an often-overlooked fact is that images taken by cameras have metadata embedded in them that almost never serves a purpose on web pages. This metadata can be stripped from all images via an image.ini.append.php setting.
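One possible approach, assuming ImageMagick is your image handler, is to define a "strip" filter and attach it to your image aliases in settings/override/image.ini.append.php. The filter name and the aliases you apply it to are up to your configuration; this is a sketch, not a verified drop-in:

```php
<?php /* #?ini charset="utf-8"?

[ImageMagick]
# Map a filter name to ImageMagick's -strip option,
# which removes EXIF and other embedded metadata
Filters[]=strip=-strip

[original]
# Apply the filter to the "original" alias; repeat for other aliases as needed
Filters[]=strip

*/ ?>
```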
Much more optimization!
Even on the server side, we could go into much more detail, but every site has its own specific needs, and the major categories discussed should provide a generally applicable outline for many different types of sites.
Here's to four more years of optimization tips and even bigger traffic spikes!