Hashes would be the natural fit here, primarily because that data structure is aimed at multiple named values that share an overall identity (including expiration etc). There isn't a huge performance difference if you are currently using `MGET` - for hashes you would simply use `HMGET`, `HGETALL`, and `HMSET`.
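To make the command mapping concrete, here is a purely illustrative listing (the key layout and field names are invented for the example):

```
# one key per field (strings)
MSET msg:123:author "fred" msg:123:body "hi" msg:123:votes "0"
MGET msg:123:author msg:123:votes

# one key, many fields (hash)
HMSET msg:123 author "fred" body "hi" votes "0"
HMGET msg:123 author votes
HGETALL msg:123
```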
I do not see that this changes anything in terms of making it more complicated: you simply populate your intended changes into a `Dictionary<string,byte[]>` and call `.Hashes.Set(...)` once, rather than calling `.Strings.Set` multiple times. Likewise, using the variadic form of `.Strings.Get(...)` is not very different to calling `.Hashes.GetAll(...)` or the variadic form of `.Hashes.Get(...)`.
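As a rough sketch of that, assuming the BookSleeve-style API described above (`conn` is an open `RedisConnection`; the database index `0` and the key `msg:123` are invented for the example):

```
using System.Collections.Generic;
using System.Text;
using BookSleeve;

// populate the intended changes once
var fields = new Dictionary<string, byte[]>
{
    { "author", Encoding.UTF8.GetBytes("fred") },
    { "body",   Encoding.UTF8.GetBytes("hello world") },
    { "votes",  Encoding.UTF8.GetBytes("0") }
};

// one .Hashes.Set(...) instead of three .Strings.Set calls
conn.Wait(conn.Hashes.Set(0, "msg:123", fields));

// read everything back (HGETALL)...
var all = conn.Wait(conn.Hashes.GetAll(0, "msg:123"));
// ...or just the fields you need (HMGET)
var some = conn.Wait(conn.Hashes.Get(0, "msg:123", new[] { "author", "votes" }));
```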
Nor do I accept that this code will be any slower - indeed, it is basically identical. If anything, at the implementation level, a single call to `.Hashes.Set` involves less overhead in terms of `Task` etc, since it is a single waitable/awaitable operation.
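Continuing the sketch above, the `Task` point shows in the shape of the calling code; `Task.WaitAll` here stands in for however you currently coordinate the individual string operations:

```
using System.Threading.Tasks;

// strings: one Task per field to allocate, track, and wait on
var pending = new List<Task>();
foreach (var pair in fields)
    pending.Add(conn.Strings.Set(0, "msg:123:" + pair.Key, pair.Value));
Task.WaitAll(pending.ToArray());

// hash: a single waitable/awaitable operation
conn.Wait(conn.Hashes.Set(0, "msg:123", fields));
```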
> Currently the forum at peak hours receives about 12 messages / minute and 20 votes / minute (total votes, not per message)
That throughput should not present an issue. Redis works happily at many 10s (or 100s) of thousands of messages per second.
> Do you think it is worth reimplementing the message store/retrieve procedure using hashes, or will the solution that I'm using right now scale fine if, for example, the rate of message inserts increases to 30 messages/minute?
That message rate should not be problematic. If you are seeing issues, please elaborate. However, the simplest and most appropriate next step would be to simulate some much higher load - see what works.
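For load simulation, `redis-benchmark` (which ships with redis) is a cheap first pass before writing anything bespoke; the counts and sizes below are arbitrary placeholders:

```
# 100,000 SET/GET/MSET operations from 50 concurrent clients
redis-benchmark -t set,get,mset -n 100000 -c 50

# repeat with a payload size close to your real messages, e.g. 512 bytes
redis-benchmark -t set -n 100000 -d 512
```

That only measures the server side, of course; to test your actual code path, loop your own store/retrieve routine at the target rate.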
> In essence, can you provide some guidelines on how Stack Overflow handles these situations?
We generally use a SQL database as our primary data store (although some things are kept solely in redis). We use redis extensively to store processed items as a cache, but since they are not subject to change, we do not store them field-by-field: instead, we use protobuf-net against DTO types and store blobs of data (using the string type, i.e. `GET`/`SET`). Further, if the size is above a threshold (and as long as it isn't going into a set/sorted-set), we do a quick "gzip" test, to see if it gets smaller if we compress it (not everything does): if it does, we store the compressed form - so we have the absolute minimum of bandwidth and storage overhead, and very fast processing at store / fetch. For clarity, the reason we do not compress in sets/sorted-sets is that gzip does not guarantee the exact same output each time, which upsets the hashing.
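For illustration, the store side of that approach might look roughly like this - the threshold value and the cache key are invented for the example, and `Serializer` is protobuf-net's:

```
using System.IO;
using System.IO.Compression;
using ProtoBuf;

static byte[] SerializeMaybeCompressed<T>(T dto, int threshold = 1024)
{
    byte[] raw;
    using (var ms = new MemoryStream())
    {
        Serializer.Serialize(ms, dto);   // protobuf-net: compact binary blob
        raw = ms.ToArray();
    }
    if (raw.Length < threshold) return raw;  // small: not worth testing

    byte[] zipped;
    using (var ms = new MemoryStream())
    {
        using (var gz = new GZipStream(ms, CompressionMode.Compress, true))
            gz.Write(raw, 0, raw.Length);
        zipped = ms.ToArray();
    }
    // keep whichever is smaller; not everything shrinks under gzip
    return zipped.Length < raw.Length ? zipped : raw;
}

// store as a plain string blob (GET/SET), e.g.:
// conn.Strings.Set(0, "cache:question:123", SerializeMaybeCompressed(dto));
```

A real version would also need to record whether the blob was compressed (a prefix byte, say) so that the fetch path knows whether to inflate before deserializing.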