Question

I have a buffer that needs to read all values (hash keys, fields, and values) from Redis after a reboot. Is there a fast way to do that? I have approximately 100,000 hashes with 4 fields each.

Thanks!

EDIT: The slow way: the current implementation gets all the hash keys using

KEYS *

then

HGETALL xxx

to get all the fields' values.


Solution

There are two ways to approach this problem.

The first one is to optimize the KEYS/HGETALL combination you have described. Because you do not have millions of keys (100K is not high by Redis standards), the KEYS command will not block the instance for long, and the output buffer required to return 100K items is probably acceptable. Once the list of keys has been received by your program, the next challenge is to run the HGETALL commands as fast as possible. The key is to pipeline them (for instance in synchronous batches of 1,000 items), which is quite easy to implement with hiredis (just use redisAppendCommand / redisGetReply). The 100K items will then be retrieved in only 100 round trips. Because most Redis instances can sustain 100K ops/s or more, it should not take more than a few seconds. A more efficient solution would be to use the asynchronous interface of hiredis to maximize throughput, but it is more complex to implement and probably not worth it for 100K items.
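
Here is a minimal sketch of that synchronous batching idea with hiredis. It assumes a local instance on 127.0.0.1:6379, that every key returned by KEYS * is a hash, and process_hash is just a hypothetical placeholder for whatever your buffer-loading code does:

    /* Pipelined HGETALL retrieval with hiredis (sketch).
     * Assumptions: local Redis on 127.0.0.1:6379, every key is a hash,
     * and process_hash() stands in for your own loading logic. */
    #include <stdio.h>
    #include <hiredis/hiredis.h>

    #define BATCH_SIZE 1000

    static void process_hash(const char *key, redisReply *h) {
        /* An HGETALL reply is a flat array: field1, value1, field2, value2, ... */
        for (size_t i = 0; i + 1 < h->elements; i += 2)
            printf("%s %s=%s\n", key, h->element[i]->str, h->element[i + 1]->str);
    }

    int main(void) {
        redisContext *c = redisConnect("127.0.0.1", 6379);
        if (c == NULL || c->err) { fprintf(stderr, "connection failed\n"); return 1; }

        /* One KEYS call is acceptable for ~100K keys. */
        redisReply *keys = redisCommand(c, "KEYS *");
        if (keys == NULL || keys->type != REDIS_REPLY_ARRAY) { redisFree(c); return 1; }

        for (size_t start = 0; start < keys->elements; start += BATCH_SIZE) {
            size_t end = start + BATCH_SIZE;
            if (end > keys->elements) end = keys->elements;

            /* Queue a batch of HGETALL commands without waiting for replies.
             * Use %b with the key length instead of %s if your keys are binary. */
            for (size_t i = start; i < end; i++)
                redisAppendCommand(c, "HGETALL %s", keys->element[i]->str);

            /* Read the replies of the batch, in the same order they were queued. */
            for (size_t i = start; i < end; i++) {
                redisReply *r;
                if (redisGetReply(c, (void **)&r) != REDIS_OK) break;
                process_hash(keys->element[i]->str, r);
                freeReplyObject(r);
            }
        }

        freeReplyObject(keys);
        redisFree(c);
        return 0;
    }

Each batch costs roughly one round trip, so 100 batches cover the 100K keys; keeping the batch size bounded also keeps the replies from piling up in the socket buffers.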

The second approach is to use the BGSAVE command to take a snapshot of the Redis content, retrieve the generated dump file, and then parse the file to extract the data. You can have a look at the excellent redis-rdb-tools package for a Python implementation. The main benefit of this approach is that there is no impact on the Redis instance (no KEYS command blocking the event loop) while still retrieving consistent data.
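
As a small sketch of the snapshot step (again C / hiredis, same local-instance assumption), you can trigger BGSAVE and wait for it to finish by polling LASTSAVE, whose timestamp only advances after a successful background save; the resulting dump.rdb can then be copied and parsed offline, for example with redis-rdb-tools:

    /* Trigger a background snapshot and wait for it to complete (sketch).
     * Assumes a local Redis instance on 127.0.0.1:6379. */
    #include <stdio.h>
    #include <unistd.h>
    #include <hiredis/hiredis.h>

    int main(void) {
        redisContext *c = redisConnect("127.0.0.1", 6379);
        if (c == NULL || c->err) { fprintf(stderr, "connection failed\n"); return 1; }

        /* Remember the timestamp of the previous successful save. */
        redisReply *r = redisCommand(c, "LASTSAVE");
        long long before = r->integer;
        freeReplyObject(r);

        /* Ask Redis to fork and write the snapshot in the background. */
        r = redisCommand(c, "BGSAVE");
        printf("BGSAVE: %s\n", r->str);
        freeReplyObject(r);

        /* The snapshot is done once LASTSAVE reports a newer timestamp. */
        long long after = before;
        while (after == before) {
            sleep(1);
            r = redisCommand(c, "LASTSAVE");
            after = r->integer;
            freeReplyObject(r);
        }
        printf("dump file is ready to copy and parse\n");

        redisFree(c);
        return 0;
    }
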
