
Caching with PHP

PHP is a great scripting environment for websites. It has become more and more widespread and is now the standard at web hotels (shared web hosts). At most web hotels you will also get access to a MySQL server at a rather low price. Unfortunately, these cheap MySQL servers are often slow because so many customers share them. Luckily, PHP's built-in output buffering gives you an easy way to cache finished pages, which minimizes your load on the database and gives you much faster responses from your website.

All Those Database Queries

Often a single page consists of many database queries. For instance, in order to create an index at my website the script makes a couple of queries:

Consider now that every customer of the web hotel has a website that does something similar. That adds up to a lot of load on the database server, and the result is slow responses from your website. Slow responses are pretty bad considering how impatient web users generally are. Therefore, it is in everyone's interest that you cache your website.

Problems with Caching

With the caching I propose, the load is moved from the database to the file system instead. This introduces a couple of problems:

Caching done in this way might be smart, but for big web sites it is much smarter to just run a reverse proxy like Squid. It is fast and can cache in memory, which is a lot faster than using the file system. With Squid or another proxy (or HTTP accelerator) you can also forget about possible issues with PHP's performance.

My Way of Caching

When I redesigned krath.dk, I quickly realised that I had to go with caching. At first I played around with Cache_Lite from PEAR, the PHP Extension and Application Repository. It looked very good and was object-oriented so it must have good performance. Devshed wrote about it, the Danish music magazine Gaffa uses it. All in all it seemed to be the way to go.

A couple of days later, I stumbled upon the web site of Morten Blinksbjerg Nielsen and there I saw how he caches his content (Danish description). Instead of using lots of functions from Cache_Lite, he uses PHP's output buffering functions (the ob_ functions) to cache the content. It is in fact much easier than it might sound. I decided to do something similar with my website.
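To show what output buffering does on its own, here is a minimal sketch. It is not Morten's code or mine; the helper name capture_output is made up for this illustration. It only relies on ob_start() and ob_get_clean() from PHP's output control functions.

```php
<?php
// Hypothetical helper: runs $render and returns everything it printed,
// instead of sending it to the browser.
function capture_output(callable $render): string
{
    ob_start();                      // from here on, output is collected in a buffer
    $render();                       // echo and print now write to the buffer
    return (string) ob_get_clean();  // return the buffer's content and stop buffering
}

$html = capture_output(function () {
    echo "<h1>Hello</h1>";
});
// $html now holds the generated markup; nothing has been printed yet.
```

Once the page is available as a string like this, storing it in a file is all it takes to turn the buffer into a cache.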

The Actual Code

Here is the code as I wrote it. Reading the comments should explain most of it, I hope.

$cache_filename = $_SERVER["DOCUMENT_ROOT"] . "/cache/" . $get_parent . "+" . $get_name . $page_index_number . "." . $cache_language . ".html";
// /cache/linux+php_caching.en.html

if(file_exists($cache_filename)) { // If the requested page has already been cached...
  readfile($cache_filename);       // ... then print the content of the file to the screen
  exit();                          // Stop processing the script
} else {
  ob_start();                      // Start the buffering (PHP Manual about ob_start)
}

[...]

[ The rest of the file goes here ]

[...]

if ($page_not_found == 0) {           // I do not want error pages to be cached!
  $buffer = ob_get_contents();        // copy the content of the buffer to $buffer
  ob_end_flush();                     // send the buffer to the browser and stop buffering
  $fp = fopen($cache_filename, "w");  // open the cache file for writing
  if ($fp !== false) {                // skip the caching if the file cannot be opened
    fwrite($fp, $buffer);             // write the buffer to the file
    fclose($fp);
  }
}
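The same steps can also be wrapped in a single function. The following is only a sketch under my own assumptions, not the code that runs on krath.dk: the name serve_cached and the demo file name are hypothetical, and the temporary-file-plus-rename step is an addition of mine so that a visitor never reads a half-written cache file.

```php
<?php
// Hypothetical helper: prints $cache_file if it exists, otherwise runs
// $render, shows its output and stores a copy for the next visitor.
function serve_cached(string $cache_file, callable $render): void
{
    if (is_file($cache_file)) {
        readfile($cache_file);       // cache hit: print the stored page
        return;
    }

    ob_start();                      // cache miss: buffer the generated page
    $render();
    $buffer = ob_get_contents();
    ob_end_flush();                  // send the page to the visitor

    // Assumption of this sketch: write to a temporary file first and
    // rename it afterwards, so the cache file appears in one step.
    $tmp = $cache_file . "." . uniqid("", true) . ".tmp";
    if (file_put_contents($tmp, $buffer) !== false) {
        rename($tmp, $cache_file);
    }
}

// Usage: the first call generates the page, later calls read the file.
$demo_file = sys_get_temp_dir() . "/cache_demo.html";
serve_cached($demo_file, function () {
    echo "<p>Generated at " . date("H:i:s") . "</p>\n";
});
```

Like the code above, this never expires a cached page on its own, so the cache file has to be deleted whenever the content changes.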

Measured Performance Improvements

To measure whether there were any improvements and how big they would be, I used the Apache Benchmark utility (ab) that is distributed with the Apache Web Server. I used the following arguments: ab -n 10000 -k -c 100 http://192.168.0.1/writing/. This tells the benchmark utility to make 10 000 keep-alive requests to the index of writing and to keep 100 connections open at the same time.

The test was run between two computers on my local area network to try to resemble the "real" Internet. Of course, the traffic congestion on my local network is much lower than on the Internet, and you will get different results if you run this test against an Internet web hotel. The results are still fine for comparing the two setups, though.

The Results

With the caching commented out, Apache Benchmark was able to complete about 60 requests per second with a transfer rate of 302.68 kilobytes per second.

After that, I ran Apache Benchmark with the same arguments, but this time with the cache enabled. Now it was possible to make more than twice as many requests per second to the server resulting in 132 requests per second and a transfer rate of 665.87 kilobytes per second. That is quite a speed improvement.

Graph comparing the benchmark results with caching on and off.
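For those who want the speed-up as a number, the figures can be put side by side in a few lines. This is just arithmetic on the two results quoted above; the array keys are names I chose for the illustration, and the 60 requests per second is the rounded figure from the text.

```php
<?php
// Figures copied from the two benchmark runs described above.
$uncached = ["requests_per_second" => 60.0,  "kilobytes_per_second" => 302.68];
$cached   = ["requests_per_second" => 132.0, "kilobytes_per_second" => 665.87];

foreach ($uncached as $metric => $before) {
    $speedup = $cached[$metric] / $before;   // how many times faster with the cache
    printf("%s: %.2f times faster\n", $metric, $speedup);
}
```

Both ratios come out at roughly 2.2, so the transfer rate grew in step with the number of requests.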

Interestingly, 255 requests were lost when I had the caching turned off. Whether this is due to bad network equipment or simply due to the high load on the server, I do not know. The benchmarked Apache 2.0.40 and MySQL 3.23.56 were, after all, running on my workstation, which also ran the memory-hogging applications OpenOffice.org, Mozilla, and Nautilus. These circumstances did not change between the two benchmarks.

Questions and Comments

Feel free to contact me if you have improvements, comments or questions about the above. I would also love to hear from you if you found the code or the article useful.