Resource Optimizer for PHP
There are many tools available to analyze the performance of web pages, such as the popular YSlow tool from Yahoo!. To make it easy to apply some of the optimizations that these tools recommend, White Whale has created an open source ResourceOptimizer script for PHP, and we will soon be using it in our own CMS, LiveWhale.
What does ResourceOptimizer do?
The features that ResourceOptimizer implements include the following:
- Aggregate JavaScript and CSS resources that are commonly packaged together into fewer individual resource requests. It is not uncommon for sites to include a handful of JavaScript plugins (such as for jQuery), some scripts for a CMS or third-party software package, and some scripts containing site- or project-specific code, as well as CSS rules organized into more than one file. Each of these resources is a separate request that browsers must make to the server. Aggregating these files means fewer server requests and therefore better performance across the entire volume of visits to your site. Requests are cleanly formatted and delivered over a basic REST implementation. ResourceOptimizer is also smart enough to distinguish between different CSS media types in order to maintain separation and preserve conditional styling.
- Support CDNs (content delivery networks) for JavaScript, CSS, and images. When your browser loads a web page, there is a limit to the number of resources that it will request at once per host. By distributing the resources attached to a web page across multiple hosts, you can allow your browser to fetch more resources at once and make full use of available bandwidth, so the components of your web pages load faster. Additionally, since JavaScript and CSS resources hosted externally can be optimized and aggregated locally, slow responses from servers you do not control will not present a performance obstacle. (Relative URLs in externally hosted CSS are updated automatically.) Hosts that are considered valid for CDN usage are hosts that A) point to the same docroot and B) support SSL/HTTPS if the main host does. To prevent one resource from being assigned to a different CDN host in different contexts, and therefore potentially being added to a browser's cache multiple times, ResourceOptimizer ensures that the same resource is assigned to the same CDN every time.
- Extreme cacheability. Aggregated resources sent from your server contain instructions telling browsers how long to cache items locally, and when to check whether newer versions of those resources exist. ResourceOptimizer uses ETags to ensure that your server does not resend data that users already have in their cache. More importantly, it makes aggressive use of caching headers to instruct browsers not to check for resources again for a very long time: months, in fact. This equates to a vast reduction in individual requests to your server.
- Automatically expire resources when content changes. This is where the unique approach that ResourceOptimizer takes really shines. The problem with telling browsers not to check for updates to cached resources for months at a time comes when you make an update to a JavaScript file or a CSS rule and want your users to get the change immediately, not months from now. Sometimes this prevents developers from setting expiration dates on resources for more than a week, to ensure that users will receive changes in the not-too-distant future. ResourceOptimizer solves this problem by making the URL to a collection of aggregated resources unique, and changing it whenever any of the content it contains is modified. For this reason, it can safely let resources go unexpired for long periods of time while still ensuring that a new version of the content is fetched automatically when changes are made.
- Minimization and gzipping of resources. By integrating Yahoo!'s YUI Compressor, ResourceOptimizer can automatically "minimize" JavaScript and CSS files to make them much smaller. In addition to minimization, resources aggregated by ResourceOptimizer are transferred using gzip compression, making them a fraction of the original size and much faster to transfer. All this means substantial bandwidth savings for your server.
- Deferral of resource loads. Loading a resource often blocks the browser from continuing to render the web page, which makes the page feel slow to load. ResourceOptimizer ensures that optimized resources are requested toward the end of page load to prevent this kind of stall. This is done using the "defer" attribute when possible, or by physically moving the resource to a later load position. (According to best practices, scripts within the body of your page should not rely on earlier scripts having finished loading before they execute. If they need those resources, they should execute after a DOM ready event to ensure deferred resources have loaded. Deferred loading can be disabled if absolutely necessary.)
- Intelligently ignore resources that should not be optimized. Since ResourceOptimizer does not require any instruction from a developer in order to apply its optimizations, it needs to know which resources it should not touch. It uses a series of rules to skip resources that should not be aggregated, and ensures that the original load order is maintained in the process. These exemption rules range from inline code, which must be assumed to be unique to the page, to resources designed to return randomized content with each request.
- High performance. ResourceOptimizer has been painstakingly performance-tuned in order to achieve the above goals without incurring a large overhead. While ResourceOptimizer is most effective when running behind a page caching mechanism (like our Smart Cache), it should also perform well for on-the-fly optimization. In order to do this, ResourceOptimizer employs three levels of caching to greatly reduce the work it needs to do to perform these optimizations.
- Cache cleaning & purging. ResourceOptimizer comes with simple commands to periodically clear unused cache entries when no longer referenced by web pages, as well as to force expiration of cached resources.
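Two of the ideas above, the content-dependent URL and the stable CDN assignment, can be illustrated with a short sketch. This is not ResourceOptimizer's actual code: the function names and host names are hypothetical, and it assumes that hashing file contents is an acceptable way to detect changes and that a CRC32 of the resource path is an acceptable way to pick a host.

```php
<?php
// Illustrative sketch only; none of these names come from ResourceOptimizer itself.

/**
 * Build a fingerprint that changes whenever any file in the group changes.
 * Assumption: the fingerprint would be embedded in the aggregate's URL, so a
 * content change produces a new URL and bypasses long-lived browser caches.
 */
function aggregateFingerprint(array $files): string
{
    $ctx = hash_init('md5');
    foreach ($files as $file) {
        // Include the path so that group membership and order affect the result.
        hash_update($ctx, $file . "\0");
        hash_update($ctx, (string) file_get_contents($file));
    }
    return hash_final($ctx);
}

/**
 * Pick the same host for the same resource on every page, so a browser
 * never caches one resource under several host names.
 */
function pickHost(string $resource, array $hosts): string
{
    // Mask off the sign bit so the index is non-negative on 32-bit builds too.
    $index = (crc32($resource) & 0x7fffffff) % count($hosts);
    return $hosts[$index];
}
```

With hypothetical hosts such as cdn1.example.com and cdn2.example.com, calling pickHost('/js/site.js', $hosts) returns the same host on every page, and the fingerprint for a group of files stays fixed until one of those files is edited.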
Is it fast?
A great deal of caching is involved in ResourceOptimizer to make it as fast as possible. There are three ways this is done, and all three are configurable.
- Caching of URL requests. Every resource must be approved by a set of rules in order for ResourceOptimizer to consider it safe to optimize. Among these rules is an analysis of the content returned by the URL specified by the resource. Internal requests for these resources are cached (for one hour, by default) so that subsequent tests of this nature only need to occur periodically. Also, once a URL has been cached for the first time, subsequent re-caches of the same URL never force the end user to wait on revalidation. Expired resources are revalidated asynchronously once a stale cache entry is detected.
- Caching of aggregated resources. Once resources have been aggregated and minimized, they do not need to be recreated until their content changes. These aggregates are kept fresh as long as there are pages that reference them. Once they are no longer needed (by default, after not being referenced by a page for 30 days), they can safely be purged.
- Caching of document transformations (head cache). This is by far the most significant optimization. Once it is known what should be done to a document containing a specific set of optimizable resources, subsequent requests for documents with the same set of resources use a cached instruction on how to transform the document immediately, without having to perform the resource optimization routine all over again. This mechanism caches for a short duration (by default, five minutes), which is long enough to provide significant performance benefits but short enough that it can quickly pick up changes when URL cache entries expire.
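The head cache idea can be sketched as a short-lived lookup keyed by the set of resources a document references. The class below is hypothetical and is not how ResourceOptimizer is implemented; in particular, the in-memory array stands in for whatever storage the real tool uses, and the current time is passed in explicitly so the expiry logic is easy to follow.

```php
<?php
// Illustrative sketch only: a hypothetical TTL cache keyed by a document's resource set.
final class HeadCacheSketch
{
    /** @var array<string, array{0:int, 1:mixed}> key => [stored-at timestamp, value] */
    private $entries = [];
    private $ttl;

    public function __construct(int $ttl = 300) // 300 s mirrors the five-minute default described above
    {
        $this->ttl = $ttl;
    }

    /** The same resources in the same order always yield the same key. */
    public static function keyFor(array $resourceUrls): string
    {
        return md5(implode("\n", $resourceUrls));
    }

    /** Returns the cached transformation, or null if it is missing or expired. */
    public function get(string $key, int $now)
    {
        if (!isset($this->entries[$key])) {
            return null;
        }
        [$storedAt, $value] = $this->entries[$key];
        return ($now - $storedAt < $this->ttl) ? $value : null;
    }

    /** Stores a transformation together with the time it was computed. */
    public function set(string $key, $value, int $now): void
    {
        $this->entries[$key] = [$now, $value];
    }
}
```

A caller would compute the key from the resources found in a document, try get() first, and only run the full optimization routine (followed by set()) on a miss.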
Get The Code
To install:
- Decompress the file and copy the "resource_optimizer" dir to your web site's document root. (You can rename this directory as you like, but if you do so, you must update the path in resource_optimizer/.htaccess.)
- Make resource_optimizer/cache writable by the web server.
- That's it! Just follow the instructions below to use ResourceOptimizer in your PHP script.
How To Use It
Using the ResourceOptimizer is simple:
$ro=new ResourceOptimizer('/web/relative/path/to/resource_optimizer'); // initialize ResourceOptimizer
$page_source=$ro->optimizeResources($page_source); // optimize resources
You may wish to run this code before caching a page, or run it on-the-fly within an output buffer. Additional parameters include:
$ro->url_cache_ttl=3600; // configure the url cache TTL
$ro->resource_cache_ttl=2592000; // configure the resource cache TTL
$ro->head_cache_ttl=300; // configure the head cache TTL
$ro->enable_deferred_load=false; // disable deferred loading (not recommended)
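Running the optimizer on-the-fly within an output buffer, as mentioned above, might look like the following sketch. The renderOptimized() helper is hypothetical, and the stub class exists only so the example runs on its own; in a real script you would include the ResourceOptimizer class from the download instead.

```php
<?php
// Stand-in so the sketch runs without the download; the real class replaces this stub.
if (!class_exists('ResourceOptimizer')) {
    class ResourceOptimizer
    {
        public function __construct($path) {}
        public function optimizeResources($source) { return $source; } // stub: returns input unchanged
    }
}

/**
 * Run a page-rendering callback inside an output buffer and optimize what it printed.
 * renderOptimized() is a hypothetical helper, not part of ResourceOptimizer.
 */
function renderOptimized(callable $render, ResourceOptimizer $ro): string
{
    ob_start();
    try {
        $render();                    // the page prints its markup as usual
        $html = ob_get_contents();    // capture everything that was printed
    } finally {
        ob_end_clean();               // always close the buffer, even on error
    }
    return $ro->optimizeResources($html);
}

$ro = new ResourceOptimizer('/resource_optimizer'); // web-relative path, as in the usage above
echo renderOptimized(function () {
    echo '<html><head><title>Demo</title></head><body>Hello</body></html>';
}, $ro);
```

Capturing the output and returning a string keeps the optimization step separate from page rendering, which also makes it easy to store the result in a page cache instead of echoing it.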
Make sure to also consider the following commands:
$ro->cleanCache(); // should be run periodically to purge old cache entries (for example, via a cron script)
$ro->purgeCache(); // clears all url and head cache entries, forcing them all to immediately recache (for debugging)
You may wish to use the following attributes on resources to control how ResourceOptimizer handles them:
// exclude a specific resource from aggregation:
<link href="/path/to/file.css" rel="stylesheet" type="text/css" data-no-aggregation="true"/>
// aggregate a specific resource but separate it from the previous aggregated resource:
<link href="/path/to/file.css" rel="stylesheet" type="text/css" data-aggregate-break="true"/>
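To make the effect of these attributes concrete, here is a minimal sketch of how a parser could read them with PHP's DOM extension. It is an assumption about the general approach, not ResourceOptimizer's own logic; the classifyStylesheets() function and the 'skip', 'new-group', and 'aggregate' labels are hypothetical.

```php
<?php
// Illustrative sketch only: reading the two data attributes with the DOM extension.

/**
 * Map each stylesheet href to how a hypothetical aggregator would treat it.
 * The input must be well-formed XML with a single root element.
 */
function classifyStylesheets(string $xhtml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xhtml);

    $result = [];
    foreach ($doc->getElementsByTagName('link') as $link) {
        if ($link->getAttribute('data-no-aggregation') === 'true') {
            $mode = 'skip';          // leave this resource untouched
        } elseif ($link->getAttribute('data-aggregate-break') === 'true') {
            $mode = 'new-group';     // aggregate, but start a new aggregate here
        } else {
            $mode = 'aggregate';     // join the current aggregate
        }
        $result[$link->getAttribute('href')] = $mode;
    }
    return $result;
}
```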
Requirements: ResourceOptimizer parses the <head> element of your document, so the <head> must be parseable as XHTML. The DOM extension for PHP is required for this. The "shell_exec" function must also not be disabled in PHP (it is enabled by default). Optional minimization support requires command-line Java on your server; when Java is present, minimization becomes active automatically.
Your Feedback Requested
You're free to use this code in any way you please, but I would like to hear back from the PHP community about this tool. Please give it a try, and then use the address below to provide feedback.
Contact Us
"Resource Optimizer" for PHP is written by Alexander Romanovich (alex@whitewhale.net).