The usual answer to a performance dilemma is to benchmark both solutions. In this case, though, I'd say the cache approach makes a lot more sense: `in_array` does a linear sweep, so each lookup is O(n), whereas caches are usually implemented as hash tables, where lookup is O(1).
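To make the difference concrete, here is a quick sketch (the list contents are made up) using an in-process hash table as a stand-in for the cache — `array_flip` turns the values into keys, so `isset` becomes an O(1) hash lookup:

```php
<?php
// Hypothetical list of, say, blocked user IDs.
$list = range(1, 100000);

// in_array: linear scan over the values, O(n) per lookup.
$found = in_array(99999, $list, true);

// Hash-table lookup: flip values into keys once, then isset is O(1).
$set = array_flip($list);
$foundFast = isset($set[99999]);

var_dump($found, $foundFast); // both true
```

The one-time `array_flip` pays for itself as soon as you do more than a handful of lookups.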
Also, if you aggregate the records in Memcached, you'll avoid wasting a lot of RAM duplicating the list in memory once per web worker process.
It would also arguably be a much cleaner solution.
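A minimal sketch of what that could look like — the server address and the `blocked:<id>` key scheme are assumptions, not anything from your code. Storing one cache key per record means a membership check is a single O(1) `get` shared by every worker process:

```php
<?php
// Hypothetical key scheme: one Memcached key per blocked ID.
function blockedKey(int $id): string
{
    return "blocked:$id";
}

function isBlocked(Memcached $mc, int $id): bool
{
    // get() returns false on a miss; any stored value means "blocked".
    return $mc->get(blockedKey($id)) !== false;
}

if (class_exists('Memcached')) {
    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);

    // Populating: done wherever the list is maintained.
    $mc->set(blockedKey(42), 1);

    var_dump(isBlocked($mc, 42));
}
```

Every worker then queries the same shared store instead of each one holding its own copy of the full list.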
On a side note, did you consider doing this at another level? With some light scripting, you could run the check at the load balancer (e.g. nginx) itself.
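For instance, here is a rough sketch of the nginx-level idea — the query parameter, backend name, and IDs are all hypothetical. An nginx `map` is itself a hash lookup, so blocked requests never even reach PHP:

```nginx
# Hypothetical: block requests by a user ID passed as a query argument.
map $arg_user_id $blocked {
    default 0;
    12345   1;
    67890   1;
}

server {
    listen 80;

    location / {
        if ($blocked) {
            return 403;
        }
        proxy_pass http://backend;
    }
}
```

For a dynamic list you'd want something like OpenResty/Lua querying Memcached rather than a static `map`, but the principle is the same: reject early, before the request costs you a PHP worker.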