Here at Logic Supply we recently started implementing a caching solution for the Web site. I’m relatively new to Web development in general and had no previous experience with memcache, but since it’s a pretty popular solution and sounded like a good fit for our needs, we figured we’d give it a shot. Read on to hear more about my adventures in memcache land…

Our Web site runs on PHP. PHP has a very simple and easy-to-use memcache library; however, you do need to re-compile PHP for it to be supported. Luckily this is rather simple: just use the ‘--enable-memcache[=DIR]’ flag when you re-compile and you’re all set. Obviously this requires the ability to re-compile your version of PHP, so if you’re using hosting from a company that doesn’t give you access, you’re probably out of luck (but you may not need it in this case anyway…). Also, keep in mind that we’re running PHP 4, so there may be things I’m not aware of in PHP 5 with relation to memcache.

For those of you not really familiar with memcache, here’s a quick overview. You can actually figure out memcache’s big selling point just from reading the name: memcache caches in memory, making it faster than other caching solutions (like caching to the file system). It also supports connections over TCP, so you don’t have to cache to a local machine; you could have a memcache server running on your network and have your webserver cache your content there. Of course, running it locally on the webserver itself will perform better since you’re not going over the network to store and retrieve information, but you lose the benefit of having a central cache. That should only matter if you have a distributed group of servers that need access to the same data, and in that case going over the network isn’t going to slow you down that much. For our uses, running memcache on the webserver was the way to go, but for larger companies with a more distributed system it would probably be better to use a dedicated machine.

Using the PHP library for memcache is really simple; adding data, deleting data, and retrieving data couldn’t be easier. Every piece of data that you cache is keyed so that you can easily retrieve data based on that key. If you need to delete data individually you can do so simply by passing the key in, or you can invalidate the entire cache all at once using the flush() command. This leads me to a couple of the issues I had with the basic library provided by PHP:

  1. No easy way to group/namespace cached data
  2. No way to “get all keys,” or to see what information is currently stored (besides implementing this programmatically yourself—more on this later)

These are certainly not deal-breakers, but they would make living with memcache and PHP a little bit easier. For example, while memcache does not support groups/namespaces by default, you can simulate them without too much hassle. I ended up writing a wrapper layer around the bare-bones PHP library to add this functionality. I also stumbled upon something called memcachefs (albeit well after I had written my own fs caching scheme for debugging…), which lessens the annoyance of the second problem by allowing you to mount your memcache data locally and view, edit, add, or delete data as if they were right there on the file system. Since there doesn’t seem to be an easy way of querying memcache to see what is currently cached (or at least I wasn’t able to find a way), it can be a little tricky to develop with.
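To make the “get all keys” workaround a bit more concrete, here is a rough sketch of one way to track keys programmatically. It is written in Python against a dict-backed stand-in so it runs without a memcache server, and it is not the wrapper described in this post. The names `FakeCache`, `TrackedCache`, and `__known_keys__` are made up for illustration; the stand-in’s `set`, `get`, `delete`, and `flush` only loosely mirror the keyed operations described above.

```python
class FakeCache:
    """Dict-backed stand-in for a memcache client (hypothetical, illustration only)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Returns None on a cache miss.
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)

    def flush(self):
        # Invalidates everything at once.
        self._data.clear()


class TrackedCache:
    """Wrapper that records every key it writes under one reserved index key."""

    INDEX_KEY = "__known_keys__"  # hypothetical reserved key name

    def __init__(self, cache):
        self.cache = cache

    def set(self, key, value):
        self.cache.set(key, value)
        keys = self.cache.get(self.INDEX_KEY) or []
        if key not in keys:
            self.cache.set(self.INDEX_KEY, keys + [key])

    def get(self, key):
        return self.cache.get(key)

    def delete(self, key):
        self.cache.delete(key)
        keys = self.cache.get(self.INDEX_KEY) or []
        self.cache.set(self.INDEX_KEY, [k for k in keys if k != key])

    def keys(self):
        # Only keys written through this wrapper are visible here.
        return list(self.cache.get(self.INDEX_KEY) or [])


cache = TrackedCache(FakeCache())
cache.set("product:42", "<html>...</html>")
cache.set("category:7", "<html>...</html>")
cache.delete("product:42")
print(cache.keys())  # the keys this wrapper still knows about
```

Against a real server this kind of index would only be approximate: the read-modify-write on the index key is not atomic, and entries can expire or be evicted without the index noticing. That is probably fine for debugging, which is all I would use it for.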

In my wrapper layer I implemented a few methods that also allowed me to invalidate grouped/namespaced data all at once without affecting other caches. This basically just gives me a little more flexibility and granularity with managing the cached data. I went through a lot of trial and error and flushing the cache to make sure that things were implemented correctly, and even went as far as implementing my own file system caching to make sure that things were working the way I thought they were. If you’re using Perl you could check out Memcache-Managed, which could save you the trouble of having to implement some of the group/namespacing stuff I had to (but how much fun would that be?).
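For the curious, one common way to simulate namespaces is to keep a version number per namespace and build it into every key. Bumping the version then makes all of the old keys unreachable in one step. The sketch below illustrates that general technique, not necessarily what our wrapper does. It is in Python with a dict-backed stand-in so it runs on its own, and `FakeCache`, `NamespacedCache`, and the `nsver:` key prefix are all hypothetical names.

```python
class FakeCache:
    """Dict-backed stand-in for a memcache client (hypothetical, illustration only)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Returns None on a cache miss.
        return self._data.get(key)


class NamespacedCache:
    """Simulates namespaces by embedding a per-namespace version in each key."""

    def __init__(self, cache):
        self.cache = cache

    def _version(self, namespace):
        version_key = "nsver:" + namespace  # hypothetical key prefix
        version = self.cache.get(version_key)
        if version is None:
            version = 1
            self.cache.set(version_key, version)
        return version

    def _full_key(self, namespace, key):
        return "%s:v%d:%s" % (namespace, self._version(namespace), key)

    def set(self, namespace, key, value):
        self.cache.set(self._full_key(namespace, key), value)

    def get(self, namespace, key):
        return self.cache.get(self._full_key(namespace, key))

    def invalidate(self, namespace):
        # Bumping the version orphans every key written under the old one.
        self.cache.set("nsver:" + namespace, self._version(namespace) + 1)


cache = NamespacedCache(FakeCache())
cache.set("products", "42", "widget page")
cache.set("categories", "7", "category page")
cache.invalidate("products")
print(cache.get("products", "42"))   # None
print(cache.get("categories", "7"))  # category page
```

Note that nothing is actually deleted here. The old entries just become unreachable and, on a real server, would sit in memory until they expire or get evicted.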

Why does all this matter? First off, many of the pages on our Web site (www.onlogic.com, in case you were wondering…) are pretty static; that is, the content on the pages doesn’t change all that often. Some of the pages go days or even weeks without changing, so caching the data just makes sense. Why hit the database when we don’t have to? The added speed is also welcome, as some of the processing we do can take a relatively large amount of time (we’re talking about seconds here, or even milliseconds, but it’s never fun to wait for a page to load). Reducing CPU load on the server is also never a bad idea. We could use the extra cycles to do good in the world, like run Folding@home or something.
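The “why hit the database when we don’t have to” idea boils down to a check-the-cache-first pattern. Here is a minimal sketch of it, again in Python with a dict-backed stand-in rather than a real memcache client. `FakeCache`, `render_product_page`, `cached_product_page`, and the one-hour lifetime are assumptions made for illustration, not code from our site.

```python
import time


class FakeCache:
    """Dict-backed stand-in for a memcache client with expiry (hypothetical)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl):
        # Store the value together with the time at which it expires.
        self._data[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] < time.time():
            return None  # miss, or the entry has expired
        return entry[0]


db_calls = 0


def render_product_page(product_id):
    """Stand-in for the slow part: database queries plus page rendering."""
    global db_calls
    db_calls += 1
    return "<h1>Product %d</h1>" % product_id


def cached_product_page(cache, product_id, ttl=3600):
    """Return the page from the cache, doing the slow work only on a miss."""
    key = "product_page:%d" % product_id
    page = cache.get(key)
    if page is None:
        page = render_product_page(product_id)
        cache.set(key, page, ttl)
    return page


cache = FakeCache()
first = cached_product_page(cache, 42)
second = cached_product_page(cache, 42)
print(db_calls)  # 1: the second request was served from the cache
```

The lifetime is the main knob: pages that go weeks without changing can live in the cache for a long time, as long as there is some way to invalidate them when the underlying data does change.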

All in all, memcache is a very sweet solution for our needs, and the few minor annoyances with the PHP implementation were just that: minor. The flexibility of memcache coupled with the performance gain should far outweigh any minor inconveniences in implementing the solution.

If you have any specific questions or comments about memcache, please leave them in the comments below and I’ll do my best to share my experience.