First, look at the bytes of an ObjectId:
ObjectId is a 12-byte BSON type, constructed using:
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id, and
a 3-byte counter, starting with a random value.
So, if you create a series of ObjectIds rapidly on a single machine, you'll end up with nearly identical _ids: the timestamp, machine identifier, and process id will all be the same, and only the 3-byte counter will differ.
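To make the layout concrete, here is a minimal Python sketch that packs the four fields into 12 bytes in the order listed above and then splits them back out. It is not any driver's actual implementation: the helper names make_object_id and split_object_id are made up for illustration, the machine identifier is just random bytes, and big-endian packing of the process id is an assumption.

```python
import os
import struct
import time


def make_object_id(timestamp, machine_id, process_id, counter):
    """Pack the four fields into 12 bytes, in the order described above."""
    return (
        struct.pack(">I", timestamp & 0xFFFFFFFF)    # 4-byte seconds since epoch
        + machine_id[:3]                             # 3-byte machine identifier
        + struct.pack(">H", process_id & 0xFFFF)     # 2-byte process id (byte order assumed)
        + (counter & 0xFFFFFF).to_bytes(3, "big")    # 3-byte counter
    )


def split_object_id(oid):
    """Split a 12-byte id back into its four fields."""
    return {
        "timestamp": struct.unpack(">I", oid[0:4])[0],
        "machine": oid[4:7].hex(),
        "pid": struct.unpack(">H", oid[7:9])[0],
        "counter": int.from_bytes(oid[9:12], "big"),
    }


now = int(time.time())
machine = os.urandom(3)                       # stand-in for a real machine identifier
pid = os.getpid() & 0xFFFF
start = int.from_bytes(os.urandom(3), "big")  # counter starts at a random value

# Three ids created "rapidly": same second, same machine, same process.
ids = [make_object_id(now, machine, pid, start + i) for i in range(3)]
for oid in ids:
    print(oid.hex(), split_object_id(oid))
```

Printing the hex strings side by side shows that the first 9 bytes (18 hex characters) are identical and only the trailing counter changes.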
Most MongoDB drivers/clients generate _ids locally by default, rather than on the database server. As they are all open source, you can look at each implementation to see exactly how the _id is generated, though admittedly it sometimes requires a bit of digging. Here's one for NodeJS, for example.
The _id that was generated from the shell is no more "real" than the _ids generated from the client. They just have different seeding values (the machine and process ids would of course be different).
When sharding on an ObjectId, as of 2.4+ you've got two choices: range and hash. Each has its pros and cons, and ultimately either works very well, depending on the nature of the writes, reads, queries, etc. that are needed. You can read more about that here.
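As a rough illustration of one of those trade-offs, the Python sketch below sends 1000 ObjectId-like keys that differ only in their counter to four shards, first by range and then by hash. This is a toy model, not how MongoDB assigns chunks: the fixed split on the leading byte, the NUM_SHARDS constant, and the use of md5 as the hash are all assumptions made for the example.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size


def range_shard(oid):
    """Toy range partitioning: split the key space evenly on the leading byte."""
    return oid[0] * NUM_SHARDS // 256


def hash_shard(oid):
    """Toy hash partitioning: md5 here is only a stand-in for the server's hash."""
    digest = hashlib.md5(oid).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# 9 fixed bytes (timestamp + machine + pid) followed by an increasing counter,
# mimicking ids generated rapidly by a single client.
prefix = bytes.fromhex("5f1e2d3c" "a1b2c3" "0d0e")
oids = [prefix + i.to_bytes(3, "big") for i in range(1000)]

range_counts = [0] * NUM_SHARDS
hash_counts = [0] * NUM_SHARDS
for oid in oids:
    range_counts[range_shard(oid)] += 1
    hash_counts[hash_shard(oid)] += 1

print("range:", range_counts)  # [0, 1000, 0, 0]: every insert lands on one shard
print("hash: ", hash_counts)   # spread across all four shards
```

In this model, range keeps nearby _ids together (good for range queries, but a burst of inserts hits one shard), while hash spreads the writes at the cost of scattering neighbouring _ids.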