Question

  1. Does anyone know the difference between Redis replication and Redis sharding?
  2. What are they used for? Redis stores data in memory; how does this affect replication/sharding?
  3. Is it possible to use both of them together?

Thank you!

Solution

Sharding is almost replication's antithesis, though they are orthogonal concepts and work well together.

Sharding, also known as partitioning, splits the data up by key, while replication, also known as mirroring, copies all of the data.

Sharding is useful to increase performance, reducing the request and memory load on any one resource. Replication is useful for high availability of reads. If you read from multiple replicas, you will also reduce the request load on each resource, but the memory requirement for each resource remains the same. It should be noted that, while you can write to a slave, replication is master->slave only, so you cannot scale writes this way.

Suppose you have the following tuples: [1:Apple], [2:Banana], [3:Cherry], [4:Durian] and we have two machines A and B. With Sharding, we might store keys 2,4 on machine A; and keys 1,3 on machine B. With Replication, we store keys 1,2,3,4 on machine A and 1,2,3,4 on machine B.

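Sharding is typically implemented by hashing the key, often with a consistent hash so that adding or removing a machine moves only a small fraction of the keys. The above example uses the simple modulo hash function h(x){return x%2==0?A:B}, which is not itself a consistent hash.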
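To make the fruit example concrete, here is a minimal Python sketch of the two placements. It uses plain dictionaries in place of real Redis instances, and the names FRUITS, MACHINES, shard_for, shard and replicate are made up for illustration; only the even/odd rule comes from h(x) above.

```python
# Toy data set from the example above.
FRUITS = {1: "Apple", 2: "Banana", 3: "Cherry", 4: "Durian"}
MACHINES = ("A", "B")


def shard_for(key: int) -> str:
    """h(x) from the text: even keys go to A, odd keys go to B."""
    return "A" if key % 2 == 0 else "B"


def shard(data: dict) -> dict:
    """Sharding: each key lives on exactly one machine."""
    placement = {name: {} for name in MACHINES}
    for key, value in data.items():
        placement[shard_for(key)][key] = value
    return placement


def replicate(data: dict) -> dict:
    """Replication: every machine holds its own full copy."""
    return {name: dict(data) for name in MACHINES}


sharded = shard(FRUITS)
replicated = replicate(FRUITS)

print(sorted(sharded["A"]), sorted(sharded["B"]))        # [2, 4] [1, 3]
print(sorted(replicated["A"]), sorted(replicated["B"]))  # [1, 2, 3, 4] [1, 2, 3, 4]
```

With sharding each machine holds half of the keys; with replication each machine holds all of them, which is why replication does not lower the per-machine memory requirement.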

To combine the concepts, we might replicate each shard. In the above case, all of the data (2,4) of machine A could be replicated on machine C, and all of the data (1,3) of machine B could be replicated on machine D.
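The combined layout can be sketched the same way. This is an in-memory illustration only, not how Redis itself routes requests: TOPOLOGY, STORES, shard_of, write and read are hypothetical names, and the copy to the replica is done synchronously for simplicity, whereas real Redis replication is asynchronous.

```python
import random

# Hypothetical topology: A and B are the masters, C and D their replicas.
TOPOLOGY = {
    "even": {"master": "A", "replicas": ["C"]},
    "odd": {"master": "B", "replicas": ["D"]},
}
# One dictionary per machine stands in for a Redis instance.
STORES = {name: {} for name in "ABCD"}


def shard_of(key: int) -> str:
    """Same even/odd rule as h(x) in the answer."""
    return "even" if key % 2 == 0 else "odd"


def write(key: int, value: str) -> None:
    """Writes go to the shard's master, which then feeds its replicas."""
    shard = TOPOLOGY[shard_of(key)]
    STORES[shard["master"]][key] = value
    for replica in shard["replicas"]:
        STORES[replica][key] = value  # simplification: copied immediately


def read(key: int) -> str:
    """Reads may be served by any copy of the shard."""
    shard = TOPOLOGY[shard_of(key)]
    node = random.choice([shard["master"], *shard["replicas"]])
    return STORES[node][key]


for k, v in {1: "Apple", 2: "Banana", 3: "Cherry", 4: "Durian"}.items():
    write(k, v)

print(STORES["A"] == STORES["C"], STORES["B"] == STORES["D"])  # True True
print(read(3))  # Cherry
```

Writes for a given key still land on a single master, so write capacity grows only with the number of shards, while read capacity grows with both shards and replicas.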

Any key-value store (of which Redis is only one example) supports sharding, though certain cross-key functions will no longer work. Redis supports replication out of the box.

OTHER TIPS

In simple words, the fundamental difference between the two concepts is that Sharding is used to scale Writes while Replication is used to scale Reads. As Alex already mentioned, Replication is also one of the solutions to achieve HA.

Yes, they are both typically used together if you consider how shards can be replicated across nodes in a cluster.

With regard to Redis keeping its data in memory: instead of relying on periodic RAM-to-disk snapshots, it is a better idea to use the Redis Append Only File (AOF). At only a minor cost in write speed, you get much more durable writes. It is quite like the MySQL binary log. The one-fsync-per-second policy is the recommended option.
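As a rough illustration, the AOF setup described above corresponds to the following redis.conf fragment. These two directives are standard Redis settings, but treat the fragment as a starting point rather than a complete configuration.

```
# Log every write operation to the append-only file.
appendonly yes
# fsync the file once per second.
appendfsync everysec
```

With this policy, a crash can lose at most about the last second of writes.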

Licensed under: CC-BY-SA with attribution