Question

I would like to store key-value pairs in Cassandra and have entries automatically deleted in LRU fashion when a fixed storage size is reached.

Is it possible to do this with Cassandra, and if so, what would be the best way to do it? If not, is there another distributed storage system that supports this use case without having to keep all data in memory?

Solution

The short answer is, no, Cassandra does not support LRU out of the box.

You could, if you really wanted to, build an LRU layer in your app on top of Cassandra to achieve the same effect. This could be done in several ways, but generally you'd want to maintain a separate index of the cached objects along with stats/timestamps and have your app purge objects as appropriate. Even then, disk space wouldn't be a good upper limit because of how Cassandra stores its data and manages updates, deletes, etc. Cassandra doesn't free storage immediately on a delete; rather, it writes a tombstone and the old data is removed later (see "About Deletes").
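The separate index described above could look roughly like the following. This is only a minimal sketch, not a tested Cassandra integration: `LruIndex`, `record_write`, `record_read` and the `delete_from_store` callback are hypothetical names, a plain dict stands in for the Cassandra table, and the byte budget counts only value sizes as seen by the app (not actual disk usage, which lags behind because of tombstones).

```python
from collections import OrderedDict
from typing import Callable, List


class LruIndex:
    """Tracks recency and approximate size of entries held in a backing store."""

    def __init__(self, max_bytes: int, delete_from_store: Callable[[str], None]) -> None:
        self.max_bytes = max_bytes
        # Hypothetical hook: in a real setup this would issue a DELETE to Cassandra.
        self.delete_from_store = delete_from_store
        self.sizes: "OrderedDict[str, int]" = OrderedDict()  # least recently used first
        self.total_bytes = 0

    def record_write(self, key: str, size: int) -> List[str]:
        """Register a write and return the keys evicted to stay within budget."""
        self.total_bytes += size - self.sizes.pop(key, 0)
        self.sizes[key] = size  # re-inserted at the most recently used end
        evicted = []
        # Never evict the entry that was just written, even if it is oversized.
        while self.total_bytes > self.max_bytes and len(self.sizes) > 1:
            old_key, old_size = self.sizes.popitem(last=False)
            self.total_bytes -= old_size
            self.delete_from_store(old_key)
            evicted.append(old_key)
        return evicted

    def record_read(self, key: str) -> None:
        """Mark a key as recently used."""
        if key in self.sizes:
            self.sizes.move_to_end(key)


# Usage with a dict standing in for the Cassandra table.
store = {}
index = LruIndex(max_bytes=10, delete_from_store=lambda k: store.pop(k, None))


def put(key: str, value: bytes) -> List[str]:
    store[key] = value
    return index.record_write(key, len(value))


put("a", b"1234")
put("b", b"1234")
index.record_read("a")        # "a" is now more recent than "b"
evicted = put("c", b"1234")   # 12 bytes tracked, over the 10-byte budget
```

In this run the least recently used key, "b", is the one purged. In a real deployment the index itself would also need to be persisted or rebuilt after a restart, which this sketch leaves out.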

One advantage of building a custom caching layer (i.e. on Cassandra) is that you could move beyond simple LRU eviction and give more weight to objects that are more expensive and/or used more frequently (but not recently), so that they remain in the cache longer even though plain LRU would purge them. Whether this is useful depends entirely on your specific use case. But again, Cassandra can get bloated by a lot of data churn, so you would need to ensure that the cluster is properly tuned and gets its routine maintenance.

In practice, most people would deploy Memcached (or similar) to support this use case.
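As an illustration of weighting beyond plain LRU, here is one possible scoring rule. It is an assumption made for this sketch, not something Cassandra provides: `EntryStats`, `retention_score` and `pick_victim` are hypothetical names, and the formula (cost times hit count, divided by age) is just one of many reasonable choices.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class EntryStats:
    cost: float         # hypothetical: how expensive the object is to rebuild
    hits: int           # number of reads so far
    last_access: float  # seconds, from whatever clock the app uses


def retention_score(stats: EntryStats, now: float) -> float:
    """Higher score means the entry is more worth keeping."""
    age = max(now - stats.last_access, 1.0)  # avoid dividing by zero
    return stats.cost * (1 + stats.hits) / age


def pick_victim(entries: Dict[str, EntryStats], now: float) -> str:
    """Return the key with the lowest retention score."""
    return min(entries, key=lambda k: retention_score(entries[k], now))


entries = {
    "report": EntryStats(cost=50.0, hits=20, last_access=0.0),  # costly, popular, stale
    "thumb": EntryStats(cost=1.0, hits=1, last_access=90.0),    # cheap, recently read
}
victim = pick_victim(entries, now=100.0)
```

Plain LRU would evict "report" here because it was touched least recently; with this scoring the cheap, rarely used "thumb" goes first.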

Other tips

Cassandra can be used as an LRU-style store; you just need to use TTLs or manage the removal yourself.

New data is always appended. Removed data is only marked as deleted and is physically removed during compaction, so you might need to tune the compaction settings.

The advantages of Cassandra are that data is persisted the moment it comes in, you don't have to fit all data into memory except in extreme use cases, you can use replication to avoid losing data, and you can access it from multiple languages. Beware that new data may not be immediately available.
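The TTL option could be sketched as below. This assumes a hypothetical table `cache (key text PRIMARY KEY, value text)` and a session object with an `execute(query, params)` method in the style of the DataStax Python driver; `FakeSession` is only a stand-in so the sketch runs without a cluster, and it does not simulate expiry. Note that a TTL gives time-based expiry, not a size bound: rewriting the value on every read restarts its TTL, which only approximates LRU and adds write load.

```python
from collections import namedtuple

# Hypothetical table: CREATE TABLE cache (key text PRIMARY KEY, value text)
INSERT_CQL = "INSERT INTO cache (key, value) VALUES (%s, %s) USING TTL %s"
SELECT_CQL = "SELECT value FROM cache WHERE key = %s"


def put(session, key, value, ttl_seconds=3600):
    """Write a value that Cassandra expires after ttl_seconds."""
    session.execute(INSERT_CQL, (key, value, ttl_seconds))


def get(session, key, ttl_seconds=3600):
    """Read a value and rewrite it so that its TTL starts over."""
    row = session.execute(SELECT_CQL, (key,)).one()
    if row is None:
        return None
    put(session, key, row.value, ttl_seconds)
    return row.value


# Stand-in for a real session, for illustration only.
Row = namedtuple("Row", "value")


class FakeResult:
    def __init__(self, row):
        self._row = row

    def one(self):
        return self._row


class FakeSession:
    def __init__(self):
        self.rows = {}
        self.log = []  # every (query, params) pair that was executed

    def execute(self, query, params):
        self.log.append((query, params))
        if query == INSERT_CQL:
            key, value, _ttl = params
            self.rows[key] = value
            return FakeResult(None)
        value = self.rows.get(params[0])
        return FakeResult(None if value is None else Row(value))


session = FakeSession()
put(session, "user:1", "alice", ttl_seconds=60)
value = get(session, "user:1", ttl_seconds=60)
```

Each expired cell still becomes a tombstone, so the compaction caveat above applies to this approach as well.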

A more lightweight approach is Redis.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow