Assuming that the cache and the RDBMS expose the same interface as the client itself, the logic for handling puts and gets is straightforward. On a get, check whether the cache has the value; if not, go to the database and cache the retrieved value if it exists. On a put, issue a put on the database and invalidate the entry in the cache. In barely 10 lines of logic, we already have a race condition that can result in stale cache data. For example, suppose the database currently holds the pair ("k", "v1"), the cache does not have "k", and two clients A and B are simultaneously getting the key "k" and putting the pair ("k", "v2"), respectively. Then the following sequence is possible:
- A sees that the cache does not contain the key "k".
- A gets the value "v1" from the RDBMS.
- B puts the pair ("k", "v2") into the RDBMS.
- B removes the key "k" from the cache (which was not there in the first place, so this is a no-op).
- A caches the pair ("k", "v1") and returns.
So client A receives the value "v1", which is fine, but it also leaves that value in the cache after B has already updated it, meaning that future gets for key "k" will return stale data. It turns out that using memcached on top of an RDBMS is not quite as trivial as this handful of lines suggests (see the sketch below); to fix this race condition we can leverage the gumball technique.
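For concreteness, here is a minimal sketch of that naive cache-aside logic in Python. The `KV` class and the `cache`/`db` objects are toy stand-ins for memcached and the RDBMS (which we assumed expose the same interface), not real client libraries:

```python
class KV:
    """Toy key/value store standing in for both memcached and the
    RDBMS, which we assume expose the same interface."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

    def remove(self, key):
        self._data.pop(key, None)

cache, db = KV(), KV()

def client_get(key):
    value = cache.get(key)          # check the cache first
    if value is None:               # miss: fall back to the database
        value = db.get(key)
        if value is not None:
            cache.put(key, value)   # populate the cache for next time
    return value

def client_put(key, value):
    db.put(key, value)              # write through to the database
    cache.remove(key)               # invalidate any cached copy
```

Interleave `client_get` and `client_put` as in the sequence above and the stale entry appears.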
The idea behind the gumball technique is clever: the cache server itself sees all of these operations (the get that misses, the remove, and then the put from the original getter), so it can deduce which ones would result in stale data. The only change to the protocol itself is to return a miss time $T_m$ (server time) on the first get and to require the client to pass $T_m$ back with the corresponding put, which the server stores along with the value in the cache. The rest of the work is done by the cache server itself and consists of two parts: managing "gumball" entries and rejecting certain put operations. Gumball entries are timestamped removal markers: every time a client issues a remove on a key, instead of simply deleting the key, the server inserts one of these gumballs carrying the timestamp of the removal (again, server time). The time-to-live (TTL) of the entry, which we will refer to as $\Delta$, is determined in a special way, but for now we can assume it is just some value that exceeds the typical duration of an RDBMS query.
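As a rough sketch of the server-side bookkeeping (names like `server_get` and `server_remove` are my own for illustration, and expiring gumballs after $\Delta$ is elided):

```python
import time

DELTA = 2.0     # placeholder gumball TTL in seconds; the paper derives
                # this adaptively, here it just exceeds a typical query

gumballs = {}   # key -> removal timestamp (server time)
store = {}      # key -> (value, miss time T_m)

def server_remove(key):
    # Instead of simply deleting the key, leave a timestamped gumball.
    store.pop(key, None)
    gumballs[key] = time.time()

def server_get(key):
    entry = store.get(key)
    if entry is not None:
        return entry[0], None       # hit: no miss time needed
    return None, time.time()        # miss: client must echo T_m on put
```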
The tricky part of the algorithm comes when a put operation is issued (with its corresponding $T_m$ value). Let $T_C$ be the current time on the cache server. If $T_C - T_m > \Delta$, we ignore the put operation: a gumball created during that window could have already expired, so we have no way of knowing whether the put is safe. This is why $\Delta$ should be larger than the typical get/put cycle. If there is a gumball entry whose timestamp exceeds $T_m$, then the race condition likely occurred and we ignore the put operation. Lastly, if there is an actual value whose miss time is greater than $T_m$, we also have to ignore the put operation, as that other put may have overridden a gumball entry that would have invalidated the current put. If none of these conditions hold, we insert the key/value pair into the cache as normal (along with $T_m$).
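Continuing the sketch above, the three rejection checks on a put might be expressed as follows (again, `server_put` is an assumed name and shape, not the paper's actual interface):

```python
def server_put(key, value, t_m):
    t_c = time.time()  # current server time T_C
    # 1. The miss window exceeds the gumball TTL: a gumball created in
    #    that window could have already expired, so play it safe.
    if t_c - t_m > DELTA:
        return False
    # 2. A gumball newer than the miss: a remove raced with this put.
    if key in gumballs and gumballs[key] > t_m:
        return False
    # 3. An existing value with a newer miss time: that put may have
    #    replaced a gumball that would have invalidated this one.
    if key in store and store[key][1] > t_m:
        return False
    store[key] = (value, t_m)       # safe: cache the value with T_m
    return True
```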
These steps guarantee that the cache will never hold a stale entry that persists after an update or delete, restoring the consistency the RDBMS provided in the first place. The paper goes into more detail about how to choose $\Delta$ properly via a sliding-window algorithm with some buffer, but the exact value is not important for correctness. It also discusses the performance impact, which appears to be negligible under normal loads, though it does depend on how long RDBMS queries take relative to the frequency of gets. The gumball technique is a nice example of leveraging minimal centralization and careful observation of access patterns to produce a simple way of ensuring distributed consistency. It would be great if memcached implementations actually adopted it, but since it requires an API change, it will probably not become widespread anytime soon. And since many websites don't actually care about this level of consistency, it is hard to say whether it will ever become standard.
It seems like there are other ways to handle these race conditions, like buffering writes to the cache until the end of the RDBMS transaction. Then one would
1. remove all keys from the cache that are about to be modified
2. commit the transaction
3. add to the cache all new cached keys (a rough sketch follows the case analysis below).
Then gets will either
A. read before #1, which is fine because they see no changes in the cache and RDBMS (provided a correct isolation level)
B. read between #1 and #2, which misses the cache and reads old data from the RDBMS. There's a problem here: this second get could then overwrite the cache with stale data. The way people seem to tackle this with memcached is to have memcached take a remove lock, so that keys recently removed by one connection cannot be accessed by a different connection until either some time has passed or the original connection re-adds the keys, whichever comes first. So the second get would block upon trying to read the key in contention until the first connection repopulates it.
C. read between #2 and #3, which falls under B except that the second get now has up-to-date information.
D. read after #3, which is of course fine.
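Roughly, the commit path for steps 1-3 could look like the sketch below (hypothetical `cache` and `db` handles; the remove lock from case B is elided):

```python
def commit_with_cache(txn, modified, cache, db):
    """`txn` is a prepared database transaction and `modified` maps
    each key the transaction touches to its new cached value."""
    for key in modified:
        cache.remove(key)            # 1. remove keys about to change
    db.commit(txn)                   # 2. commit the transaction
    for key, value in modified.items():
        cache.put(key, value)        # 3. re-add the fresh values
```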
This is how I tackled this consistency problem, but the gumball technique looks more elegant. I'll have to read more about it.
I think the gumball technique is just a variation on what you've described in B. The gumball entry that is put into the cache when you remove a key tells other clients that they have to be wary about putting stale data into the cache (a weaker form of a lock).