Question

Trying to define some policy for keys in a key-value store (we are using Redis). The keyspace should be:

  • Shardable (can introduce more servers and spread out the keyspace between them)

  • Namespaced (there should be some mechanism to "group" keys together logically, for example by domain or associated concepts)

  • Efficient (use as little space as possible in the DB for keys, to leave as much room as possible for data)

  • As collision-less as possible (avoid two different objects ending up with the same key)


Two alternatives that I have considered are these:

  1. Use prefixes for namespaces, separated by some character (like human_resources:person:<some_id>). The upside of this is that it is pretty scalable and easy to understand. The downside would be possible conflicts depending on the separator (what if an id has the character : in it?), and possibly size efficiency (too many nested namespaces might create very long keys).

  2. Use some data structure (like a Sorted Set or Hash) to store the namespaces. The main drawback to this would be loss of "shardability", since the structure storing the namespaces would need to live in a single database.

Question: What would be a good way to manage a keyspace in a sharded setup? Should we use one of these alternatives, or is there some other, better pattern that we have not considered?

Thanks very much!


Solution

The generally accepted convention in the Redis world is option 1 - i.e. namespaces separated by a character such as a colon. That said, the namespaces are almost always one level deep. For example, person:12321 instead of human_resources:person:12321.

How does this work with the 4 guidelines you set?

Shardable - This approach is shardable. Each key can go to a different shard or to the same shard, depending on how you set it up.
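
For illustration, here is a minimal Python sketch of client-side sharding, where the key string itself decides which server it lives on. The shard addresses are made up for the example, and the hash is a plain CRC32 rather than the CRC16 hash slots Redis Cluster uses internally; the point is only that flat string keys shard naturally.

    import zlib

    # Hypothetical shard addresses - replace with your real servers.
    SHARDS = ["redis://shard0:6379", "redis://shard1:6379", "redis://shard2:6379"]

    def shard_for(key: str) -> str:
        # Hash the full key and map it onto one of the shards deterministically.
        return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]

    print(shard_for("person:12321"))  # always routes to the same shard
    print(shard_for("person:43432"))  # may route to a different shard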

Namespaced - Namespaces as a way to avoid collisions work with this approach. However, namespaces as a way to group keys don't work out. In general, using keys as a way to group data is a bad idea. For example, what if a person moves from one department to another? If you change the key, you will have to update all references - and that gets tricky.

It's best to ensure the key never changes for an object. Grouping can then be handled externally by creating a separate index.

For example, let's say you want to group people by department, by salary range, or by location. Here's how you'd do it (a code sketch follows the list):

  1. Individual people go into separate hashes, with keys like persons:12321
  2. Create a set for each grouping - for example, persons_by:department - and store only the numeric identifiers of each person in that set, e.g. [12321, 43432]. This way you get the advantages of Redis' integer set encoding.
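
As a rough sketch of this pattern with the redis-py client (assuming a Redis server on localhost; the concrete department key persons_by:department:engineering and the sample field values are just illustrative):

    import redis  # redis-py; assumes a Redis server on localhost:6379

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    # 1. Each person lives in their own hash; this key never changes.
    r.hset("persons:12321", mapping={"name": "Alice", "department": "engineering"})
    r.hset("persons:43432", mapping={"name": "Bob", "department": "engineering"})

    # 2. Grouping is a separate index: a set per group holding only numeric ids.
    r.sadd("persons_by:department:engineering", 12321, 43432)

    # Moving a person to another department only touches the index sets,
    # never the person's own key.
    r.smove("persons_by:department:engineering", "persons_by:department:sales", 43432)
    r.hset("persons:43432", "department", "sales")

    print(r.smembers("persons_by:department:engineering"))  # {'12321'}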

Efficient - The method explained above is pretty efficient memory-wise. To save some more memory, you can compress the keys further on the application side; for example, you can store p:12321 instead of persons:12321. You should do this only if you have determined via profiling that you need such memory savings. In general, it isn't worth the cost.

Collision free - This depends on your application. Each user or person should have a primary key that never changes. Use it in your Redis key, and you won't have collisions.

You mentioned two problems with this approach, and I will try to address them:

What if the id has a colon?

It is of course possible, but your application's design should prevent it. It's best not to allow special characters in identifiers, because they will be used across multiple systems. For example, the identifier will very likely be part of a URL, and the colon is a reserved character even in URLs.

If you really must allow special characters in your identifier, you would have to write a small wrapper in your code that encodes the special characters. URL encoding is perfectly capable of handling this.
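
In Python, such a wrapper could simply percent-encode the identifier before embedding it in the key (the persons: prefix and sample ids below are just illustrative):

    from urllib.parse import quote

    def person_key(raw_id: str) -> str:
        # Percent-encode everything outside the "safe" set, including ':',
        # so the identifier can never clash with the namespace separator.
        return "persons:" + quote(raw_id, safe="")

    print(person_key("12321"))      # persons:12321
    print(person_key("legacy:42"))  # persons:legacy%3A42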

Size Efficiency

There is a cost to long keys, but it isn't significant. In general, you should worry about the data size of your values rather than of your keys. If you think keys are consuming too much memory, profile the database with a tool like redis-rdb-tools.

If you do determine that key size is a problem and want to save memory, you can write a small wrapper that rewrites the keys using an alias.
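
A sketch of such a wrapper in Python, with made-up alias mappings: readable namespaces stay in application code, and only the short prefixes are ever sent to Redis.

    # Illustrative alias table: long, readable prefix -> short storage prefix.
    ALIASES = {"persons:": "p:", "persons_by:department:": "pbd:"}

    def to_storage_key(key: str) -> str:
        # Shorten a readable key before writing it to Redis.
        for long_prefix, short_prefix in ALIASES.items():
            if key.startswith(long_prefix):
                return short_prefix + key[len(long_prefix):]
        return key

    def from_storage_key(key: str) -> str:
        # Expand a stored key back to its readable form.
        for long_prefix, short_prefix in ALIASES.items():
            if key.startswith(short_prefix):
                return long_prefix + key[len(short_prefix):]
        return key

    print(to_storage_key("persons:12321"))  # p:12321
    print(from_storage_key("p:12321"))      # persons:12321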
