In MongoDB, what are the actual limitations imposed by unique indexes on shard keys?

https://stackoverflow.com/questions/19687049

01-07-2022

Question

Mongo Docs read:

sharded systems cannot enforce cluster-wide unique indexes unless the unique field is in the shard key.

from here: http://docs.mongodb.org/manual/core/sharding-shard-key/

Still, it is unclear to me whether the shard key must be exactly the unique index or whether it can be a prefix of the unique index.

I found a lot of references on this particular topic but, unfortunately, I couldn't find a good "DO and DON'T" example.

To summarize, my question comes down to the following example: given a Mongo collection with a unique index on fields {a,b,c}, which of the following shard keys are valid:

A. {a}
B. {a,b,c}
C. {a,b,c,d}
D. {a,b,d} ?

Thanks a lot.

Solution

The reason for this limitation is that it must be possible for the shards to check for duplicates without having to communicate with the other shards.

That means that, for every possible value of the index, it must be clear which shard it resides on. A shard can only be certain that a value is unique when any colliding document would also be stored on that same shard.

So it is OK when the shard key is only a part of the index (MongoDB's documentation states this as the shard key being a prefix of the unique index), but not when the index is only a part of the shard key.

For your examples, shard keys A and B would work, but C and D would not. If a shard in scenario C or D receives a document whose values for a, b and c don't match anything it holds, another shard could still hold a document with the same values for a, b and c but a different value for d.
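
The reasoning above can be illustrated with a small toy model. This is only a sketch, not MongoDB code: the `ToyCluster` class, its sum-modulo routing rule, and the `accepts_duplicate` helper are all hypothetical names invented for the illustration. Each shard checks the unique index {a,b,c} only against its own documents, which is the constraint the answer describes.

```python
# Toy model (not MongoDB): every shard can only inspect its own documents.

class ToyCluster:
    def __init__(self, shard_key, unique_index, n_shards=2):
        self.shard_key = shard_key
        self.unique_index = unique_index
        # One set of already-stored unique-index values per shard.
        self.shards = [set() for _ in range(n_shards)]

    def route(self, doc):
        # Hypothetical routing rule: sum of the shard-key values
        # modulo the number of shards. Real clusters use ranges or hashes.
        return sum(doc[f] for f in self.shard_key) % len(self.shards)

    def insert(self, doc):
        shard = self.shards[self.route(doc)]
        index_value = tuple(doc[f] for f in self.unique_index)
        if index_value in shard:
            return False  # duplicate detected by the local check
        shard.add(index_value)
        return True


def accepts_duplicate(shard_key):
    """Return True if two documents with equal a, b, c are both accepted."""
    cluster = ToyCluster(shard_key, unique_index=("a", "b", "c"))
    first = {"a": 1, "b": 1, "c": 1, "d": 0}
    second = {"a": 1, "b": 1, "c": 1, "d": 1}  # same a, b, c; different d
    cluster.insert(first)
    return cluster.insert(second)


for name, key in [("A", ("a",)), ("B", ("a", "b", "c")),
                  ("C", ("a", "b", "c", "d")), ("D", ("a", "b", "d"))]:
    print(name, key, "duplicate slips through:", accepts_duplicate(key))
```

With shard keys A and B, both documents are routed to the same shard, so the local check rejects the second one. With C and D, the differing value of d sends them to different shards in this toy routing, and neither shard can see the collision.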

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow