Question

Our (edX) original Mongo persistence representation uses a BSON dictionary (aka object or subdocument) as the _id value (see mongo/base.py). This _id is missing a field.

  1. Can some documents' _id values have more subfields than others without totally screwing up indexing?
  2. What's the best way to handle existing documents without the additional field? Remove and replace them? Try to query with the new _id format and, if that fails, fall back to a query without the new field? Try to query with both the new and old _id formats in one query?

To be more specific, the current format is

{'_id': {
    'tag': 'i4x', // yeah, it's always this fixed value
    'org': your_school_x,
    'course': a_catalog_number,
    'category': the_xblock_type,
    'name': uniquifier_within_course
}}

I need to add 'run': the_session_or_term_for_course_run or 'course_id': org/course/run.

Solution

Documents within a collection need not have values for _id that are of the same structure. Hence, it is perfectly acceptable to have the following documents within a collection:

> db.foo.find()
{ "_id" : { "a" : 1 } }
{ "_id" : { "a" : 1, "b" : 2 } }
{ "_id" : { "c" : 1, "b" : 2 } }

Note that because the index is on only _id, only queries that specify a value for _id will use the index:

db.foo.find({_id:1}) // will use the index on _id
db.foo.find({_id:{state:"Alaska"}}) // will use the index on _id
db.foo.find({"_id.a":1})  // will NOT use the index on _id

Note also that only a complete match of the "value" of _id (the same fields, in the same order, with the same values) will return a document. So this returns no documents for the collection above:

db.foo.find({_id:{c:1}})
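
Because the match must be exact, a lookup that should find both old-style and new-style documents has to list both complete _id values. A minimal sketch of that idea follows, assuming the new 'run' field is appended after 'name'; the helper name buildIdQuery and the sample values are hypothetical and not taken from the edX code:

```javascript
// Hypothetical helper: builds one query that matches either the old-style
// _id (no 'run') or the new-style _id (with 'run').
// Embedded-document equality is exact, so both full forms are listed.
function buildIdQuery(org, course, category, name, run) {
  // Keep the field order fixed; it is part of the value being matched.
  const oldId = { tag: 'i4x', org: org, course: course, category: category, name: name };
  // Assumption: 'run' is added as the last subfield of the new format.
  const newId = { tag: 'i4x', org: org, course: course, category: category, name: name, run: run };
  return { _id: { $in: [newId, oldId] } };
}

// Sample values are made up for illustration.
const query = buildIdQuery('your_school_x', 'CS101', 'problem', 'intro_quiz', '2013_Spring');
// In the mongo shell this could be passed as: db.foo.find(query)
```

Both entries in the $in list are whole _id values, so this stays a query on the value of _id rather than on its subfields.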

Hence, for your case, you are welcome to add fields to the object that is the value for the _id key, and it does not matter that documents end up with different structures. But if you are hoping to query the collection by individual subfields of _id and have it be efficient, you are going to need to add indexes for all relevant subfields that might be used in isolation. That is not super efficient.

_id is no different than any other key in this regard.
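
If subfield queries such as {"_id.a": 1} do need an index, the index has to be declared on the dotted path. A rough sketch of building such a key specification follows; subfieldIndexSpec is a hypothetical helper, and the choice of 'org' and 'course' is only an assumption about which subfields might be queried:

```javascript
// Hypothetical helper: turns a list of _id subfield names into an
// index key specification using dotted paths.
function subfieldIndexSpec(subfields) {
  const spec = {};
  for (const field of subfields) {
    spec['_id.' + field] = 1; // 1 means ascending order
  }
  return spec;
}

// Assumed example: a compound index over two subfields of _id.
const spec = subfieldIndexSpec(['org', 'course']);
// In the mongo shell this could be passed as: db.foo.createIndex(spec)
```

Each such index is stored and maintained in addition to the default index on _id, which is the extra cost mentioned above.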

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow