Check your cache before you wreck yourself

Cache invalidation is famously one of the few hard things in computer science.

It seems to be a common misconception that Drupal's cache_get checks whether a given cache entry has expired, and won't return a stale result. In fact, this is not always the case.

The docs for both D6 and D7 say that if a specific timestamp is given as the $expire parameter in a cache_set, this "Indicates that the item should be kept at least until the given time, after which it behaves like CACHE_TEMPORARY." [D6/D7]

So this does not say that cache entries will expire (i.e. cache_get will not return them) after this timestamp has passed; rather, it says that they then behave like CACHE_TEMPORARY, which the same docs describe as meaning that "the item should be removed at the next general cache wipe."

What this actually means is that it's the responsibility of the code which does a cache_get to check whether any object that it gets back is still valid in terms of the time it should expire.

So, if you want to use Drupal's cache system in D6 or D7 to store a value for a short amount of time, and you don't want to wait until "the next general cache wipe" for the entry to be cleared, you must check the expire timestamp on any cache object that you receive back from a cache_get.

Here's a little PHP script which illustrates this; we still get a cache object back even though it has expired:

<?php
 
define('TEST_CACHE_LIFETIME', 10); // seconds
if (!defined('REQUEST_TIME')) {
  // REQUEST_TIME is in D7 but not D6
  define('REQUEST_TIME', time());
}
print "\n###\nrunning cache test at " . REQUEST_TIME . "\n";
 
$reset_cache = FALSE;
if ($cached = cache_get('test_cache_expiry', 'cache')) {
  print 'this came from cache: ' . print_r($cached, TRUE);
  if ($cached->expire < REQUEST_TIME) {
    $reset_cache = TRUE;
    print "cached data has expired; resetting\n";
  }
}
else {
  $reset_cache = TRUE;
}
 
if ($reset_cache) {
  print 'setting this to cache: ' . ($data = md5(rand())) . "\n";
  cache_set('test_cache_expiry', $data, 'cache', REQUEST_TIME + TEST_CACHE_LIFETIME);
}

...and here's what happens if we run it a few times in quick succession:

$ for i in {1..8}; do drush scr cache_test.php; sleep 3; done
 
###
running cache test at 1384557409
setting this to cache: 5d9f014b374764e35220ead02102b1e7
 
###
running cache test at 1384557412
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => 5d9f014b374764e35220ead02102b1e7
    [created] => 1384557409
    [expire] => 1384557419
    [serialized] => 0
)
 
###
running cache test at 1384557416
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => 5d9f014b374764e35220ead02102b1e7
    [created] => 1384557409
    [expire] => 1384557419
    [serialized] => 0
)
 
###
running cache test at 1384557419
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => 5d9f014b374764e35220ead02102b1e7
    [created] => 1384557409
    [expire] => 1384557419
    [serialized] => 0
)
 
###
running cache test at 1384557422
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => 5d9f014b374764e35220ead02102b1e7
    [created] => 1384557409
    [expire] => 1384557419
    [serialized] => 0
)
cached data has expired; resetting
setting this to cache: a57b9e9734824207e0aa6d4d6a4b6973
 
###
running cache test at 1384557426
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => a57b9e9734824207e0aa6d4d6a4b6973
    [created] => 1384557422
    [expire] => 1384557432
    [serialized] => 0
)
 
###
running cache test at 1384557429
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => a57b9e9734824207e0aa6d4d6a4b6973
    [created] => 1384557422
    [expire] => 1384557432
    [serialized] => 0
)
 
###
running cache test at 1384557433
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => a57b9e9734824207e0aa6d4d6a4b6973
    [created] => 1384557422
    [expire] => 1384557432
    [serialized] => 0
)
cached data has expired; resetting
setting this to cache: abbe82035a1bcaea187259f316f04309

Note that not all cache backends behave the same way: memcache, for example, doesn't seem to return cache entries after their expire timestamp has passed.

We should assume, however, that cache_get may well return a cache object which has expired, so we should always check the expire property before assuming that the cache entry is valid.
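One way to make that check hard to forget is to wrap it in a small helper. The following is only a sketch: the function names mymodule_cache_is_fresh and mymodule_cache_get_fresh are hypothetical and not part of Drupal. It assumes that CACHE_PERMANENT is 0 and CACHE_TEMPORARY is -1 (as in D6 and D7), so any expire value of zero or less is not a timestamp and is never treated as stale.

```php
<?php

/**
 * Hypothetical helper (not part of Drupal): decides whether an object
 * returned by cache_get() should still be treated as valid at time $now.
 */
function mymodule_cache_is_fresh($cached, $now) {
  // cache_get() returns FALSE on a miss, so anything that is not an
  // object with an expire property counts as "not fresh".
  if (!is_object($cached) || !isset($cached->expire)) {
    return FALSE;
  }
  // Assumption: CACHE_PERMANENT (0) and CACHE_TEMPORARY (-1) are not
  // timestamps, so there is nothing to compare against the clock.
  if ($cached->expire <= 0) {
    return TRUE;
  }
  // Same comparison as the test script: stale only once expire < now.
  return $cached->expire >= $now;
}

/**
 * Hypothetical wrapper: like cache_get(), but returns FALSE for entries
 * whose expire timestamp has already passed. Requires Drupal to call.
 */
function mymodule_cache_get_fresh($cid, $bin = 'cache') {
  // REQUEST_TIME exists in D7 but not D6, so fall back to time().
  $now = defined('REQUEST_TIME') ? REQUEST_TIME : time();
  $cached = cache_get($cid, $bin);
  return mymodule_cache_is_fresh($cached, $now) ? $cached : FALSE;
}
```

With something like this in place, calling code can treat a FALSE return from mymodule_cache_get_fresh as "rebuild and cache_set again", whether the entry was missing or merely stale. Keeping the time comparison in its own function also means it can be exercised without a Drupal bootstrap.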

See https://drupal.org/node/534092 for some discussion as to whether this is a bug or a feature.

Comments

This post clarified a commonly misunderstood nuance in the API, thank you!