
Caching with PHP
PHP is a great scripting environment for websites. It has become more and more widespread and is now standard at web hotels (shared hosting providers). Most web hotels also give you access to a MySQL server at a rather low price. Unfortunately, these cheap MySQL servers are often slow because so many websites share them. Luckily, PHP's built-in output buffering gives you an easy way to cache your pages, which reduces your load on the database and makes your website respond much faster.
All Those Database Queries
Often a single page consists of many database queries. For instance, in order to create an index at my website the script makes a couple of queries:
- First it receives the page information: Title, descriptive text, language, keywords etc.
- Secondly, it queries the database for sub sites with the chosen language setting and shows the title and description of the first five selected rows.
- If the second query responded with no selected rows the database is then queried for any available sub sites without looking at language preferences and shows the first five selected rows.
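The fallback between the second and third query can be sketched as follows. This is only an illustration under assumptions: the array `$sub_sites` stands in for the database table, and the function name `select_sub_sites` and the field names are hypothetical, not taken from my site.

```php
<?php
// Hypothetical stand-in for the sub site table; a real page would query MySQL.
$sub_sites = [
    ['title' => 'PHP Caching', 'description' => 'Caching with output buffers', 'language' => 'en'],
    ['title' => 'Linux noter', 'description' => 'Noter om Linux', 'language' => 'da'],
    ['title' => 'Apache tips', 'description' => 'Benchmarking with ab', 'language' => 'en'],
];

/**
 * Return up to $limit sub sites in the preferred language, falling back
 * to sub sites in any language when the first lookup finds nothing.
 */
function select_sub_sites(array $rows, string $language, int $limit = 5): array
{
    // "Second query": rows matching the chosen language setting.
    $matches = array_values(array_filter(
        $rows,
        function (array $row) use ($language): bool {
            return $row['language'] === $language;
        }
    ));

    // "Third query": only needed when the language-specific lookup is empty.
    if (count($matches) === 0) {
        $matches = array_values($rows);
    }

    return array_slice($matches, 0, $limit);
}

// Show title and description of the selected rows.
foreach (select_sub_sites($sub_sites, 'en') as $site) {
    echo $site['title'] . ': ' . $site['description'] . "\n";
}
```

Every uncached request repeats this work against the database, which is exactly the load that caching the finished page avoids.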
Now consider that every website hosted by the web hotel does something similar. That adds up to a lot of load on the database and results in slow responses from your website. Slow responses are pretty bad considering how impatient web users generally are. Therefore, it is in everyone's interest that you cache your website.
Problems with Caching
With the caching I propose, the load is moved from the database to the file system. This introduces a couple of problems:
- Caching takes up space in the already limited file system. Your web hotel might not provide excessive hard drive space.
- Reading and writing files is not necessarily fast, unless of course you run a RAM drive.
- Updating the database demands an update of the cache as well. This might result in trouble with wrong file permissions.
- Personalization can cause trouble if you want to use a cache. For instance, the Language Settings page is not cached because it reads cookies and HTTP headers.
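The third problem, keeping the cache in step with the database, can be handled by deleting the cached files of a page whenever its database row changes. The following is a minimal sketch, assuming the file naming scheme used later in this article; the function name `invalidate_cached_page` and its parameters are hypothetical.

```php
<?php
/**
 * Delete every cached variant of one page (all languages and index
 * numbers) and return the number of files removed.
 * Assumes $parent and $name contain no glob wildcards such as * ? [ ].
 */
function invalidate_cached_page(string $cache_dir, string $parent, string $name): int
{
    // Matches e.g. /cache/linux+php_caching.en.html
    $pattern = rtrim($cache_dir, '/') . '/' . $parent . '+' . $name . '*.html';

    $files = glob($pattern);
    if ($files === false) {
        return 0;
    }

    $removed = 0;
    foreach ($files as $file) {
        // unlink() returns false when, for example, file permissions are wrong.
        if (is_file($file) && @unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}
```

The pattern may also match other pages whose name starts with the same text. That is harmless here, since a deleted cache file is simply regenerated on the next request.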
Caching done in this way might be smart, but for big websites it is much smarter to run a reverse proxy like Squid. It is fast and can cache in memory, which is a lot faster than using the file system. With Squid or another reverse proxy (or HTTP accelerator) you can also forget about possible issues with PHP's performance.
My Way of Caching
When I redesigned krath.dk, I quickly realised that I had to go with caching. At first I played around with Cache_Lite from PEAR, the PHP Extension and Application Repository. It looked very good and was object-oriented, so I assumed it would perform well. Devshed wrote about it, and the Danish music magazine Gaffa uses it. All in all it seemed to be the way to go.
A couple of days later, I stumbled upon the website of Morten Blinksbjerg Nielsen, and there I saw how he caches his content (Danish description). Instead of using lots of functions from Cache_Lite, he uses PHP's output buffering functions to cache the content. It is in fact much easier than it might sound. I decided to do something similar with my website.
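For readers who have not used the output buffering functions before, here is a minimal sketch of the idea, separate from my actual site code. The HTML strings are placeholders.

```php
<?php
ob_start();                    // from here on, output is collected instead of sent
echo "<h1>Writing</h1>\n";
echo "<p>Generated by PHP.</p>\n";
$html = ob_get_contents();     // copy everything captured so far
ob_end_clean();                // discard the buffer without sending it

// $html now holds the generated page and can be stored, inspected or printed.
echo strlen($html) . " bytes captured\n";
```

Here the buffer is discarded with `ob_end_clean()` to show that nothing reaches the browser until you decide so. A caching script would instead use `ob_end_flush()`, which sends the content and ends buffering.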
The Actual Code
Here is the code as I wrote it. Reading the comments should explain most of it, I hope.
$cache_filename = $_SERVER["DOCUMENT_ROOT"] . "/cache/" . $get_parent . "+" . $get_name
    . $page_index_number . "." . $cache_language . ".html";
// Example: /cache/linux+php_caching.en.html
if (file_exists($cache_filename)) {
    // The requested page has already been cached:
    // print the content of the file and stop processing the script.
    readfile($cache_filename);
    exit();
} else {
    // Start the output buffering (see ob_start in the PHP Manual).
    ob_start();
}
[...]
[ The rest of the file goes here ]
[...]
if ($page_not_found == 0) { // I do not want error pages to be cached!
    $buffer = ob_get_contents(); // assign the content of the buffer to $buffer
    ob_end_flush(); // send the buffered content to the browser and end buffering
    $fp = fopen($cache_filename, "w"); // open the cache file for writing
    if ($fp !== false) { // fopen() fails if, for example, the file permissions are wrong
        fwrite($fp, $buffer); // write the buffer to the file
        fclose($fp);
    }
}
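One weakness of writing directly to the cache file is that another request may read the file while it is only half written. A possible refinement, which is not part of my original code, is to write to a temporary file and rename it into place. The function name `write_cache_atomically` is hypothetical, and the sketch assumes a POSIX file system, where a rename within one directory replaces the target in a single step.

```php
<?php
/**
 * Write $contents to $cache_filename via a temporary file in the same
 * directory, then rename it into place. Returns true on success.
 */
function write_cache_atomically(string $cache_filename, string $contents): bool
{
    $dir = dirname($cache_filename);
    if (!is_dir($dir)) {
        return false; // avoid tempnam() silently falling back to the system temp dir
    }

    $tmp = tempnam($dir, 'tmp');
    if ($tmp === false) {
        return false;
    }

    if (file_put_contents($tmp, $contents) !== strlen($contents)) {
        @unlink($tmp);
        return false;
    }

    // tempnam() creates the file with restrictive permissions; make it world-readable.
    chmod($tmp, 0644);

    if (!rename($tmp, $cache_filename)) {
        @unlink($tmp);
        return false;
    }
    return true;
}
```

In the code above, a call such as `write_cache_atomically($cache_filename, $buffer)` could take the place of the fopen(), fwrite() and fclose() calls.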
Measured Performance Improvements
To measure whether there were any improvements and how big they would be, I used the Apache Benchmark utility (ab) that is distributed with the Apache Web Server. I used the following arguments: ab -n 10000 -k -c 100 http://192.168.0.1/writing/. This tells the benchmark utility to make 10,000 keep-alive requests to the index of writing, with 100 concurrent connections.
The test was run between two computers on my local area network to try to resemble the "real" Internet. Of course, congestion on my local network is much lower than on the Internet, and you will get different results if you run this test against an Internet web hotel. The results are still fine for comparison, though.
The Results
With the caching commented out, Apache Benchmark was able to complete about 60 requests per second with a transfer rate of 302.68 kilobytes per second.
After that, I ran Apache Benchmark with the same arguments, but this time with the cache enabled. Now the server handled more than twice as many requests: 132 requests per second, with a transfer rate of 665.87 kilobytes per second. That is quite a speed improvement.
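To put the two figures side by side, the arithmetic can be sketched as below. It uses only the numbers reported above; note that the uncached rate is the rounded "about 60", so the derived values are approximate.

```php
<?php
$requests     = 10000; // from the ab argument -n 10000
$uncached_rps = 60;    // "about 60" requests per second without the cache
$cached_rps   = 132;   // requests per second with the cache

$speedup          = $cached_rps / $uncached_rps;  // about 2.2
$uncached_seconds = $requests / $uncached_rps;    // about 166.7 seconds
$cached_seconds   = $requests / $cached_rps;      // about 75.8 seconds

printf(
    "Speedup: %.1fx, total time: %.1f s uncached vs %.1f s cached\n",
    $speedup,
    $uncached_seconds,
    $cached_seconds
);
```

The transfer rates tell the same story: 665.87 divided by 302.68 is also roughly 2.2.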
Interestingly, 255 requests were lost when I had the caching turned off. Whether this is due to bad network equipment or simply due to the high load on the server, I do not know. Note that the benchmarked Apache 2.0.40 and MySQL 3.23.56 were running on my workstation, which also ran the memory-hogging applications OpenOffice.org, Mozilla, and Nautilus. These circumstances did not change between the two benchmarks.
Questions and Comments
Feel free to contact me if you have improvements, comments or questions about the above. I would also love to hear from you if you found the code or the article useful.