Serve Static Drupal Content Faster With Boost And Nginx

By Stephen Jayna, 23rd December 2009

If your Drupal site receives a significant amount of anonymous traffic, Boost is for you. Following on from yesterday's article on XCache, where we went from 49 to 132 requests per second, we'll show you how Boost has taken us to an eye-popping 2516 requests per second for static Drupal content.

While it won't benefit everyone, there are a staggering number of Drupal-based sites out there that serve predominantly anonymous content. If you fall into this category you could do worse than consider adding Boost to your architecture.


Static vs. Dynamic Content

First things first. We need to define exactly what we mean by static Drupal content or anonymous traffic. Essentially it's content that remains the same no matter who is looking at it. Facebook, for example, is a site where each and every page is tailored for you: it's very dynamic. Conversely, The Times is (as far as I can tell) almost entirely static, adverts notwithstanding, and would be a prime candidate for Boost.

How Does Boost Work?

Boost is a module that replaces Drupal's in-built anonymous page caching. When a page is generated by Drupal it is written by Boost to the file system. This allows your web server to serve a static file (if it's available) instead of invoking PHP. Take a look below at Everita's cache:

root@everita:/var/www/drupal/6/drupal/cache# pwd
/var/www/drupal/6/drupal/cache

root@everita:/var/www/drupal/6/drupal/cache# find
.
./perm
./perm/www.everita.com
./perm/www.everita.com/.boost
./perm/www.everita.com/sites
./perm/www.everita.com/sites/everita.com
./perm/www.everita.com/sites/everita.com/files
./perm/www.everita.com/sites/everita.com/files/css
./perm/www.everita.com/sites/everita.com/files/js
./perm/www.everita.com/javascript
./perm/boost-gzip-cookie-test.html.gz
./.boost
./normal
./normal/www.everita.com
./normal/www.everita.com/.boost
./normal/www.everita.com/test_.html
./normal/www.everita.com/_.html
./normal/www.everita.com/contact-everita_.html
./normal/www.everita.com/access-denied_.html
./normal/www.everita.com/bookshelf_.html
./normal/www.everita.com/about-everita_.html
./normal/www.everita.com/page-not-found_.html
./normal/www.everita.com/search-results_.html
./normal/www.everita.com/pixel-portraits-facial-recognition-opencv_.html
./normal/www.everita.com/how-the-newton-virus-was-made_.html
./normal/www.everita.com/subversive-sightseeing-interactive-video-telescopes-bu0836_.html
./normal/www.everita.com/thank-you-for-contacting-us_.html
./normal/www.everita.com/unodb-documentation_.html
./normal/www.everita.com/thank-you-for-your-request_.html
./normal/www.everita.com/comment
./normal/www.everita.com/comment/reply
./normal/www.everita.com/iphone-app-and-mobile-phone-development_.html
./normal/www.everita.com/mysql-lamp-and-drupal-services-from-everita_.html
./normal/www.everita.com/lightwave-collada-and-opengles-on-the-iphone_.html
./normal/www.everita.com/software-design-and-development-in-oxford-and-reading_.html

What you can see above is a static version — ready to serve — of almost every page in the Everita website. Be warned that you must use Clean URLs for Boost to work.
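To make the layout above concrete, here is a minimal sketch of how a request maps onto a cache file. The function name boost_cache_path is my own invention for illustration. The cache/normal directory and the trailing underscore are taken from the listing above, and the underscore assumes the "_" replacement character used in this setup.

```shell
# Hypothetical helper: print the cache file Boost would write for a page.
# $1 = document root, $2 = host name, $3 = request path (with leading slash)
boost_cache_path() {
  printf '%s/cache/normal/%s%s_.html\n' "$1" "$2" "$3"
}

# The front page and the bookshelf page from the listing above.
boost_cache_path /var/www/drupal/6/drupal www.everita.com /
boost_cache_path /var/www/drupal/6/drupal www.everita.com /bookshelf
```

The two calls print paths ending in www.everita.com/_.html and www.everita.com/bookshelf_.html, matching the files shown in the listing.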

What sets Boost apart is how well it is integrated into Drupal compared to something like Varnish. One rather excellent feature is that it knows what pages exist in the site and will crawl them, thus warming the cache for you. This gets around the problem of one user having to endure a tedious delay while the page is made for the first time.

Time Is Money

This is very important for sites with a substantial amount of content. It's usually the case that the vast majority of pages are only visited once or twice a day (the so-called long tail). Chances are, then, that they won't already be in the cache. You could argue this doesn't matter. After all, if they are rarely in demand, why worry about caching them?

The point is this: according to research by Amazon and Google even a 500ms delay could result in 20% less traffic. While 500ms may seem insignificant, 20% certainly isn't. Warming your cache is important: don't waste your users' time by having them do it.

Installing Boost

Installing Boost is no different from installing any other Drupal module. Download and extract it to your modules folder:

cd /var/www/drupal/6/drupal/sites/all/modules
wget http://ftp.drupal.org/files/projects/boost-6.x-1.17.tar.gz
tar -xzvf boost-6.x-1.17.tar.gz
rm boost-6.x-1.17.tar.gz

Enable the module in Drupal by checking Boost, under the Caching heading at:

http://www.yourwebsite.com/admin/build/modules

Now configure the Boost module at:

http://www.yourwebsite.com/admin/settings/performance/boost

I had to create a directory called 'cache' under my document root, with permission for my web server to write to it. The Drupal status report will tell you if anything is awry:

 http://www.yourwebsite.com/admin/reports/status

Once that's done you can start configuring Boost, which has a myriad of options. I'll explain what I changed in order to get the best for my specific setup.

Configuring Boost For Nginx

Firstly I turned off Gzip page compression as Nginx does this for me. Obviously there's another performance gain to be had by serving up pre-zipped content rather than have Nginx do it on-the-fly. However, for the sake of simplicity, we'll leave this off for now.

Next I disabled caching of XML, CSS and JavaScript. Drupal continues to do this more than adequately leaving static files under /sites/everita.com/files/ (assuming you've enabled bandwidth optimizations). Boost has only taken over page caching, nothing else.

Finally I enabled the cron crawler as discussed above. The rest I've left for the time being; clearly you can tailor the other options as you see fit.

So, Is It Working? Where Are My Cache Files?

Assuming your files are being cached under 'cache' (the default) you should begin to see .html files appearing. Note that if you're logged in — presumably as an administrator — you won't cause files to be cached as you meander through the site: you need to log out, browse the site, and check again.

cd /var/www/drupal/6/drupal/cache
find .
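If you'd rather have a number than a long listing, a rough sketch along these lines counts the cached pages. The function name count_cached_pages and the CACHE_DIR variable are my own for illustration, and the '*_.html' pattern assumes the "_" suffix seen in the cache listing earlier.

```shell
# Hypothetical helper: count the static pages beneath a Boost cache directory.
count_cached_pages() {
  find "$1" -type f -name '*_.html' | wc -l | tr -d ' '
}

# Assumed location: the 'cache' directory used throughout this article.
CACHE_DIR="${CACHE_DIR:-/var/www/drupal/6/drupal/cache/normal}"
if [ -d "$CACHE_DIR" ]; then
  echo "Cached pages: $(count_cached_pages "$CACHE_DIR")"
fi
```

Run it once, browse a few pages while logged out, then run it again: the count should go up.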

Configuring Nginx

As it stands you're now producing beautifully static .html files, but as yet no one is reaping the benefits. We need to tell Nginx to serve cache files if they exist, falling back to PHP and Drupal if they don't. Without further ado, here is the all-important snippet from my configuration file:

/etc/nginx/sites-available/mysqlperformancetuning.conf
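server {
  .
  .
  .

  # Build up a flag string; a cached page is served only if all four tests pass.
  set $boost "";
  # Must match 'Character used to replace "?"' in the Boost settings.
  set $boost_query "_";

  # G: the request is a GET.
  if ($request_method = GET) {
    set $boost G;
  }

  # D: no DRUPAL_UID cookie, so the visitor is anonymous.
  if ($http_cookie !~ "DRUPAL_UID") {
    set $boost "${boost}D";
  }

  # Q: there is no query string.
  if ($query_string = "") {
    set $boost "${boost}Q";
  }

  # F: a cached file exists for this host and URI.
  if (-f $document_root/cache/normal/$host$request_uri$boost_query.html) {
    set $boost "${boost}F";
  }

  # All four flags set: serve the static file and stop rewriting.
  if ($boost = GDQF) {
    rewrite ^.*$ /cache/normal/$host$request_uri$boost_query.html break;
  }

  # Anything else that isn't a real file or directory goes to Drupal.
  if (!-e $request_filename) {
      rewrite ^/(.*)$ /index.php?q=$1 last;
      rewrite /(.*)/$ /index.php?q=$1 last;
      break;
  }
}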

Credit and thanks go to Mechanix for a healthy amount of direction.

Essentially the above states that a cache file is served only when all of the following hold:

  • The request is a GET
  • You're an anonymous user and not logged in
  • There aren't any URL parameters
  • The file requested exists in the cache

Otherwise the request is passed on to Drupal as before.

For what it's worth, the $boost_query variable corresponds to 'Character used to replace "?"' under 'Generated output storage (HTML, XML, AJAX)' in the Boost settings.
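If the flag juggling in the Nginx snippet is hard to follow, the same decision can be sketched as a small shell function. This is only a model of the logic for reasoning about it, not something Nginx runs. The name boost_decision and its four arguments are my own for illustration.

```shell
# Hypothetical model of the G/D/Q/F flags built up in the Nginx config.
# $1 = request method, $2 = Cookie header, $3 = query string,
# $4 = candidate cache file on disk
boost_decision() {
  flags=""
  if [ "$1" = "GET" ]; then flags="G"; fi
  case "$2" in
    *DRUPAL_UID*) ;;                 # logged in: no D flag
    *) flags="${flags}D" ;;          # anonymous
  esac
  if [ -z "$3" ]; then flags="${flags}Q"; fi
  if [ -f "$4" ]; then flags="${flags}F"; fi

  if [ "$flags" = "GDQF" ]; then
    echo cache
  else
    echo drupal
  fi
}

# A GET for a page that has no cache file falls through to Drupal.
boost_decision GET "" "" /nonexistent/page_.html
```

Any single failed test leaves a letter out of the string, so the comparison with GDQF fails and the request goes to Drupal.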

That's it! I've a fairly basic site with equally simple URLs so your rules might become more complex but the principle is the same. Make sure you restart Nginx once you've made these modifications:

  /etc/init.d/nginx restart

Clearing The Cache

The strategy you use for clearing your cache is very dependent on the type of site you have. By default Boost will ignore calls from Drupal to clear the entire cache, preferring to refresh it according to its own settings.

I've turned this off by setting 'Ignore cache flushing' to disabled. This lets me continue to use 'Clear cache data' to clear the entire cache when I tinker with the site's CSS, for example. Mine is a small site, so it's less of an issue: my cache can be regenerated quickly. You might need to consider this more carefully. Rest assured Boost affords you plenty of control over when and how this happens.

Conclusion

You can see the difference this has made compared to yesterday's efforts with XCache below. Don't be fooled: you still need XCache or similar, especially if you deliver dynamic content, because Boost can't help you there. If your content is predominantly static, however:

root@everita:~# ab -n 10000 -c 2 http://www.everita.com/

Server Software:        nginx/0.6.32
Server Hostname:        www.everita.com
Server Port:            80

Document Path:          /
Document Length:        25793 bytes

Concurrency Level:      2
Time taken for tests:   3.974 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      260060000 bytes
HTML transferred:       257930000 bytes
Requests per second:    2516.26 [#/sec] (mean)
Time per request:       0.795 [ms] (mean)
Time per request:       0.397 [ms] (mean, across all concurrent requests)
Transfer rate:          63904.10 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       2
Processing:     0    1   0.2      1       6
Waiting:        0    0   0.1      0       5
Total:          1    1   0.1      1       6

I could get a further superficial increase if I used the keep-alive option in ab (-k) but it's hardly worth it. As with any benchmark these should be taken with a pinch of salt. The point is, comparing like for like with yesterday's test, Boost is certainly worth considering.
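To put the figures side by side, here is a quick bit of arithmetic using the numbers quoted in this article: 49 requests per second before XCache, 132 with XCache, and 2516.26 with Boost. The speedup function is just a throwaway helper of my own.

```shell
# Hypothetical helper: ratio of two requests-per-second figures, to one decimal place.
speedup() {
  awk -v after="$1" -v before="$2" 'BEGIN { printf "%.1f\n", after / before }'
}

echo "Boost vs XCache:          $(speedup 2516.26 132)x"
echo "Boost vs original figure: $(speedup 2516.26 49)x"
```

That works out at roughly 19 times the XCache figure and roughly 51 times the original one, with the usual pinch of salt.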



Your Comments

typo

Hi there,

I used your config to tinker around with nginx + apache + boost + drupal. The speed increase is massive to say the least (350req./sec before, 12k req./sec after. yes, that's 12 _k_ ;) ). Thanks for putting the info together.

However, there are typos in your snippet which made nginx complain on init:

if ( $request_method = GET )
lacks a "{"

rewrite ^.*$ /cache/normal/$host/$request_uri$boost_query.html break;
"break;" -> newline?

thanks and best regards,
oliver


Can I use this config with try_files?

Can I use this config with the try_files directive instead of if (!-e $request_filename)? For example:
location / {
root /var/www/virtual/magazon.lg.ua/htdocs;
index index.php;
try_files $uri $uri/ @drupal;
}
location @drupal {
rewrite ^/(.*)$ /index.php?q=$1 last;
}


re: nginx boost config location?

Hello,

Thanks for your comments. I've updated the article accordingly.

You should be able to use the proxy_pass functionality to proxy requests that reach a certain location through to Apache.

http://wiki.nginx.org/NginxHttpProxyModule

Regards,
Steve


nginx boost config location?

Hi, any chance you could clarify where you put the boost nginx config in your example?

It's not clear if it's in nginx.conf or a sites-enabled/example.conf

It's also not clear what the context is, http, server, location etc...

I've got nginx setup proxying through to apache so need to know if nginx can intercept requests and serve static cache else pass on to apache.

Clarification there would be appreciated, thanks for the nice tutorial!




Images courtesy of Rowan Mersh