Question

I have two fields in a document I want to index. One of them is Receive Time, and the other one is Serial Number. I want users to be able to query on Serial Number alone or on both Serial Number and Receive Time.

The way I see it, I have two options.

A.

db.collection.ensureIndex({SerialNumber: 1, ReceiveTime: 1})
db.collection.ensureIndex({ReceiveTime: 1})

B.

db.collection.ensureIndex({ReceiveTime: 1, SerialNumber: 1})
db.collection.ensureIndex({SerialNumber: 1})

Apparently, option A is a better choice than option B, because fields with low uniqueness should come later in an index. Why is that the case?

At the same time, however, the MongoDB documentation states that if your index increments, the whole index need not fit in RAM. If this is a very write-heavy application, would B then be the better option? (Compound indexes are larger than single-field indexes, and in B the compound index leads with the incrementing ReceiveTime, whereas in A the compound index does not increment.)

Solution

The decision between {SerialNumber: 1, ReceiveTime: 1} and {ReceiveTime: 1, SerialNumber: 1} should be based on the type of queries that you plan to perform. If you generally query for a specific SerialNumber over a large range of ReceiveTimes, use {SerialNumber: 1, ReceiveTime: 1}: all index entries for that SerialNumber then sit next to each other, ordered by ReceiveTime. Conversely, if your queries are specific about ReceiveTime but more general about SerialNumber, go for {ReceiveTime: 1, SerialNumber: 1}. Either way, each query is likely to touch fewer pages of the index, which minimizes the amount of paging that the OS has to do.
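
To make the "fewer pages" argument concrete, here is a rough sketch in plain JavaScript. It is a toy model, not how MongoDB actually stores or scans a B-tree: it treats a compound index as a sorted list of key tuples and measures how far apart the first and last matching entries are. The data (100 serial numbers with 100 receive times each) and the helper names buildIndex and scannedSpan are made up for illustration.

```javascript
// Toy model: a compound index is a list of key tuples kept in sorted order.
function buildIndex(docs, fields) {
  return docs
    .map((d) => fields.map((f) => d[f]))
    .sort((a, b) => {
      for (let i = 0; i < a.length; i++) {
        if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
      }
      return 0;
    });
}

// Number of index entries between the first and last match, inclusive.
// A smaller span means the matches are packed into fewer index pages.
function scannedSpan(index, matches) {
  let first = -1;
  let last = -1;
  index.forEach((key, i) => {
    if (matches(key)) {
      if (first === -1) first = i;
      last = i;
    }
  });
  return first === -1 ? 0 : last - first + 1;
}

// Hypothetical data: 100 serial numbers, each with 100 receive times.
const docs = [];
for (let s = 0; s < 100; s++) {
  for (let t = 0; t < 100; t++) {
    docs.push({ SerialNumber: s, ReceiveTime: t });
  }
}

const bySerialThenTime = buildIndex(docs, ["SerialNumber", "ReceiveTime"]);
const byTimeThenSerial = buildIndex(docs, ["ReceiveTime", "SerialNumber"]);

// Query: one specific serial number (42) over a wide time range (10..89).
const spanA = scannedSpan(
  bySerialThenTime,
  ([s, t]) => s === 42 && t >= 10 && t <= 89
);
const spanB = scannedSpan(
  byTimeThenSerial,
  ([t, s]) => s === 42 && t >= 10 && t <= 89
);

console.log(spanA, spanB);
```

With this made-up data, the 80 matching entries are adjacent when SerialNumber leads (a span of 80), but spread over a span of 7901 entries when ReceiveTime leads. For a query that pins ReceiveTime and ranges over SerialNumber, the picture reverses.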

Similarly, if you are always querying by, say, the most recent ReceiveTimes, then you can keep the working set small by using {ReceiveTime: 1, SerialNumber: 1}: only the pages corresponding to the most recent ReceiveTimes need to stay in memory. This is what the documentation you mention is suggesting.
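
The same kind of toy model can illustrate the write-heavy case from the question. The sketch below assumes a hypothetical page size of 100 index entries and made-up data (50 serial numbers with 200 existing receive times each, then one new, later reading per serial number). It counts how many distinct "pages" the new entries land in under each key order; real B-tree pages split and rebalance, so treat the numbers as an illustration only.

```javascript
// Toy model: index positions are grouped into fixed-size "pages".
const PAGE_SIZE = 100; // hypothetical number of index entries per page

function compareKeys(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// Counts the distinct pages that contain at least one newly inserted entry
// once old and new keys are merged into one sorted index.
function pagesTouchedByInserts(oldDocs, newDocs, fields) {
  const entries = [
    ...oldDocs.map((d) => ({ key: fields.map((f) => d[f]), isNew: false })),
    ...newDocs.map((d) => ({ key: fields.map((f) => d[f]), isNew: true })),
  ].sort((a, b) => compareKeys(a.key, b.key));

  const pages = new Set();
  entries.forEach((entry, i) => {
    if (entry.isNew) pages.add(Math.floor(i / PAGE_SIZE));
  });
  return pages.size;
}

// Hypothetical data: 50 serial numbers with receive times 0..199 already stored.
const oldDocs = [];
for (let s = 0; s < 50; s++) {
  for (let t = 0; t < 200; t++) {
    oldDocs.push({ SerialNumber: s, ReceiveTime: t });
  }
}

// New writes: every serial number reports once more, at the latest time (200).
const newDocs = [];
for (let s = 0; s < 50; s++) {
  newDocs.push({ SerialNumber: s, ReceiveTime: 200 });
}

const hotPagesTimeFirst = pagesTouchedByInserts(
  oldDocs, newDocs, ["ReceiveTime", "SerialNumber"]
);
const hotPagesSerialFirst = pagesTouchedByInserts(
  oldDocs, newDocs, ["SerialNumber", "ReceiveTime"]
);

console.log(hotPagesTimeFirst, hotPagesSerialFirst);
```

In this model the 50 new entries all land in a single page at the end of the {ReceiveTime: 1, SerialNumber: 1} index, but in 50 different pages of the {SerialNumber: 1, ReceiveTime: 1} index. That is the sense in which an incrementing index only needs its most recent pages in RAM.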

Licensed under: CC-BY-SA with attribution