Question

I'm using MongoDB 2.4.8 on Windows Server 2008 R2 and I'm seeing strange index behaviour that I can't explain. Here is an example of the document structure in my collection:

{
    "_id" : NUUID("67070100-4627-4aa5-8ab9-45624e5b82ad"), 
    "PropertyType" : "Cooperative",
    "Address" : {
        "Street" : "aaaaaaaaa",
        "HouseNo" : "165",
        "PostalCode" : 2860,
        "City" : "bbbbb",
        "Floor" : "1",
        "DoorNumber" : ""
    },
    "Sales" : {
        "Price" : 425000,
        "Payout" : 0,
        "AreaPrice" : 9042,
        "GrossPrice" : 2340,
        "NetPrice" : 800
    },
    "WithdrawnFromSale" : true,
    "UnitData" : {
        "UnitType" : "aaaaa",
        "Area" : 400,
        "LivingArea" : 50,
        "UnitArea" : 50,
        "Rooms" : 2,
        "BuildYear" : 1948,
        "GroundArea" : 203,
        "NoiseLevel" : 5
    }
}

I've also created an index on that collection:

db["UnitModel"].ensureIndex({ "Sales": 1, "PropertyType": 1, "UnitData.Rooms": 1, "UnitData.NoiseLevel": 1 })

The problem is that I get the wrong number of items when a query uses this index.

When I issue this query:

db.UnitModel.find({Sales: {$ne: null}, WithdrawnFromSale: false}).explain({verbose: true})

I get the following results:

{
    "cursor" : "BtreeCursor Sales_1_PropertyType_1_UnitData.Rooms_1_UnitData.NoiseLevel_1 multi",
    "isMultiKey" : false,
    "n" : 19368,
    "nscannedObjects" : 42875,
    "nscanned" : 42876,
    "nscannedObjectsAllPlans" : 43274,
    "nscannedAllPlans" : 43276,
    "scanAndOrder" : false,
    "indexOnly" : false,
    ....
}

Here we can see that the index was used, but the number of items returned is "n" : 19368, which is wrong. There should be 70986 items in the collection matching those criteria.

Why am I sure that there should be more records? Here is the code I used to check:

var totalCount = 0;
db.UnitModel.find({WithdrawnFromSale: false}).forEach(
  function (e) {      
    if(e.hasOwnProperty('Sales') && e.Sales != null)
      totalCount++;
  }
)
totalCount;

totalCount = 70986

To be sure that the query above does not use any index, let's check:

db.UnitModel.find({WithdrawnFromSale: false}).explain({verbose: true})

And the result:

{
    "cursor" : "BasicCursor",
    "isMultiKey" : false,
    "n" : 70986,
    "nscannedObjects" : 3204212,
    "nscanned" : 3204212,
    "nscannedObjectsAllPlans" : 3204212,
    "nscannedAllPlans" : 3204212,
    "scanAndOrder" : false,
    "indexOnly" : false,
    ....
}

So, for the UnitModel collection and the criteria Sales: {$ne: null}, WithdrawnFromSale: false, MongoDB should return 70986 records. But as you can see, it returns the wrong number.

Can someone explain why? What could be the reason?

BTW, when I drop that index and use the following index instead: db["UnitModel"].ensureIndex({ "WithdrawnFromSale": 1}), it works as expected. But I don't need that index; it's not optimal for my case.

Solution

As of MongoDB 2.4, the maximum size of an indexed value is 1024 bytes. The current behaviour for a key that is too large to index is to log a warning on the server side, without throwing an exception. Documents with oversized keys are left out of that index but are still included in other indexes. This can lead to inconsistent results, such as incorrect counts and "missing documents" that cannot be found through one index but can be found through another index or with a $natural scan.
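To make the effect concrete, here is a small toy model in plain JavaScript. It is only a sketch and not MongoDB's actual B-tree code: the sample documents, the Notes field, and the use of JSON string length as a stand-in for BSON size are all assumptions made for illustration. It shows how an index that silently skips oversized keys ends up returning fewer matches than a full scan.

```javascript
// Toy model only: this is NOT how MongoDB implements its indexes.
var MAX_KEY_BYTES = 1024; // limit quoted above for MongoDB 2.4

// Crude stand-in for the BSON size of a value (assumption: ASCII JSON length).
function approxSize(value) {
  return JSON.stringify(value === undefined ? null : value).length;
}

// Build a toy "index" on one field, silently skipping oversized keys.
function buildIndex(docs, field) {
  var entries = [];
  var skipped = 0;
  docs.forEach(function (doc, position) {
    if (approxSize(doc[field]) > MAX_KEY_BYTES) {
      skipped++; // a 2.4 server would only log a warning at this point
      return;
    }
    entries.push({ key: doc[field], position: position });
  });
  return { entries: entries, skipped: skipped };
}

// Hypothetical data: two small Sales sub-documents and one oversized one.
var docs = [
  { Sales: { Price: 425000 }, WithdrawnFromSale: false },
  { Sales: { Price: 300000 }, WithdrawnFromSale: false },
  { Sales: { Price: 1, Notes: new Array(2000).join("x") }, WithdrawnFromSale: false }
];

var index = buildIndex(docs, "Sales");

// "Index scan": only sees documents that made it into the index.
var viaIndex = index.entries.filter(function (entry) {
  return entry.key !== null && docs[entry.position].WithdrawnFromSale === false;
}).length;

// "Collection scan": sees every document.
var viaScan = docs.filter(function (doc) {
  return doc.Sales != null && doc.WithdrawnFromSale === false;
}).length;

console.log("via index:", viaIndex, "via scan:", viaScan, "skipped:", index.skipped);
```

The two counts differ by exactly the number of skipped keys, which is the same kind of mismatch as in the question (19368 through the index versus 70986 through a collection scan).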

In the MongoDB 2.5 development/unstable branch (which will culminate in the MongoDB 2.6 production release later this year) this behaviour has changed. As of MongoDB 2.5.5, an exception is raised if an insert/update includes an index update where the keys would be too large. See SERVER-5290 in the MongoDB issue tracker for more details.

OTHER TIPS

I figured out the reason for the issue. When I looked in the MongoDB log files, I saw tons of messages like the following:

HBReadModel.system.indexes Btree::insert: key too large to index, skipping HBReadModel.UnitModel.$Sales_1_WithdrawnFromSale_1_PropertyType_1_UnitData.Rooms_1_UnitData.NoiseLevel_1
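If the log is large, a few lines of JavaScript can tally these warnings per index. This is a rough sketch: the helper name countSkippedKeys is made up, and the pattern assumes every warning contains the same "key too large to index, skipping" wording as the line above.

```javascript
// Count "key too large to index" warnings per index namespace in log text.
function countSkippedKeys(logText) {
  var pattern = /key too large to index, skipping (\S+)/;
  var counts = {};
  logText.split(/\r?\n/).forEach(function (line) {
    var match = pattern.exec(line);
    if (match) {
      counts[match[1]] = (counts[match[1]] || 0) + 1;
    }
  });
  return counts;
}

// Sample input: the warning quoted above, twice, plus one unrelated (hypothetical) line.
var warning = "HBReadModel.system.indexes Btree::insert: key too large to index, skipping " +
  "HBReadModel.UnitModel.$Sales_1_WithdrawnFromSale_1_PropertyType_1_UnitData.Rooms_1_UnitData.NoiseLevel_1";
var sampleLog = [warning, "some unrelated log line", warning].join("\n");

var skippedCounts = countSkippedKeys(sampleLog);
console.log(skippedCounts);
```

Each key in the result is an index namespace, and its value is how many documents that index skipped according to the log.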

I was trying to create an index on the Sales field, which is actually an embedded document and not a simple field. To avoid this, I re-created the index and specified a field inside the Sales document instead. The log is now clean, and the query returns the expected records.
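For anyone landing here, the re-creation could look roughly like the sketch below. The inner field "Sales.Price" is an assumption (the exact field is not stated above), the helper name recreateSalesIndex is made up, and the dropped index spec is the one from the question. The function takes the collection as a parameter so it can be pasted into the mongo shell and called with db.UnitModel.

```javascript
// Sketch: replace the index on the whole Sales sub-document with one on a scalar field.
// "Sales.Price" is an assumed choice; pick whichever inner field your queries filter on.
function recreateSalesIndex(collection) {
  // Drop the index that used the entire Sales sub-document as a key.
  collection.dropIndex({
    "Sales": 1,
    "PropertyType": 1,
    "UnitData.Rooms": 1,
    "UnitData.NoiseLevel": 1
  });

  // Re-create it with a small scalar value in the leading position.
  var newSpec = {
    "Sales.Price": 1,
    "PropertyType": 1,
    "UnitData.Rooms": 1,
    "UnitData.NoiseLevel": 1
  };
  collection.ensureIndex(newSpec);
  return newSpec;
}

// In the mongo shell:
//   recreateSalesIndex(db.UnitModel);
```

Note that a query has to filter on the same inner field (for example "Sales.Price" rather than Sales) for this index to be a candidate, so the original query would need a matching change.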

Licensed under: CC-BY-SA with attribution