Question

What seemed a simple task turned out to be a challenge for me.

I have the following MongoDB structure:

{
(...)
"services": {
    "TCP80": {
      "data": [{
          "status": 1,
          "delay": 3.87,
          "ts": 1308056460
        },{
          "status": 1,
          "delay": 2.83,
          "ts": 1308058080
        },{
          "status": 1,
          "delay": 5.77,
          "ts": 1308060720
        }]
    }
}}

Now, the following query returns the whole document:

{ 'services.TCP80.data.ts':{$gt:1308067020} }

Is it possible to receive only those "data" array entries that match the $gt criteria (a kind of trimmed-down document)?

I was considering MapReduce, but could not find even a single example of how to pass external arguments (the timestamp) to the Map() function. (This feature was added in 1.1.4: https://jira.mongodb.org/browse/SERVER-401)

Another option is to write a stored JavaScript function, but since we are dealing with large quantities of data, database locks can't be tolerated here.

Most likely I'll have to redesign the structure to something 1-level deep, like:

{
   status:1,delay:3.87,ts:1308056460,service:TCP80
},{
   status:1,delay:2.83,ts:1308058080,service:TCP80
},{
   status:1,delay:5.77,ts:1308060720,service:TCP80
}

but the database will grow dramatically, since "service" is only one of many fields that would have to be added to each document.

Please advise!

Thanks in advance.

Solution

This is not currently supported. By default you will always receive the whole document/array unless you use field restrictions or the $slice operator. Currently these tools do not allow filtering the array elements based on the search criteria.

You should watch this request for a way to do this: https://jira.mongodb.org/browse/SERVER-828
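Until something like that is available, one workaround is to trim the array in application code after the query returns. Here is a minimal sketch in plain JavaScript, assuming the document shape from the question. `filterServiceData` is a hypothetical helper name, not a MongoDB API, and it only filters a document you have already fetched:

```javascript
// Hypothetical helper: keep only the "data" entries of one service whose
// timestamp is greater than minTs. It works on a document that was already
// fetched, so it does not reduce what the server sends over the wire.
function filterServiceData(doc, service, minTs) {
  var svc = doc.services && doc.services[service];
  if (!svc || !Array.isArray(svc.data)) {
    return [];
  }
  return svc.data.filter(function (entry) {
    return entry.ts > minTs;
  });
}

// Sample document copied from the question.
var sample = {
  services: {
    TCP80: {
      data: [
        { status: 1, delay: 3.87, ts: 1308056460 },
        { status: 1, delay: 2.83, ts: 1308058080 },
        { status: 1, delay: 5.77, ts: 1308060720 }
      ]
    }
  }
};

var recent = filterServiceData(sample, "TCP80", 1308058000);
console.log(recent.length); // 2
```

The obvious downside is that the whole array still travels from the server to the client, so this only helps when the arrays are reasonably small.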

OTHER TIPS

In version 2.1 with the aggregation framework you are now able to do this:

1: db.test.aggregate(
2:   {$match : {}},
3:   {$unwind: "$services.TCP80.data"},
4:   {$match: {"services.TCP80.data.ts": {$gte: 1308060720}}}
5: );

You can put your own criteria in line 2 to filter the parent documents. If you don't want to filter them, just leave line 2 out.
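Note that $unwind gives you one result document per array entry. If you would rather get the matching entries back as a single array per parent, a $group stage with $push can reassemble them. Below is a rough sketch that builds such a pipeline as a plain array. The helper name `buildPipeline` is made up, and using the same condition for the first $match (to skip parents with no matching entry) is my own assumption rather than part of the pipeline above:

```javascript
// Hypothetical helper: build the pipeline for one service and a minimum
// timestamp. The result is meant to be passed to db.test.aggregate(...).
function buildPipeline(service, minTs) {
  var path = "services." + service + ".data";
  var match = {};
  match[path + ".ts"] = { $gte: minTs };
  return [
    { $match: match },       // skip parents with no matching entry at all
    { $unwind: "$" + path }, // one output document per array entry
    { $match: match },       // keep only the entries that match
    // rebuild one array of matching entries per parent document
    { $group: { _id: "$_id", data: { $push: "$" + path } } }
  ];
}

var pipeline = buildPipeline("TCP80", 1308060720);
console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell: db.test.aggregate(pipeline)
```

Each result then has the parent's _id and a "data" array holding only the entries with ts >= the given timestamp.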

I'm attempting to do something similar. I tried your suggestion of using the GROUP function, but I couldn't keep the embedded documents separate or was doing something incorrectly.

I needed to pull/get a subset of embedded documents by ID. Here's how I did it using Map/Reduce:

db.parent.mapReduce(
  function(parent_id, child_ids){
    if(this._id == parent_id) 
      emit(this._id, {children: this.children, ids: child_ids})
  }, 
  function(key, values){
    var toReturn = [];

    values[0].children.forEach(function(child){
      if(values[0].ids.indexOf(child._id.toString()) != -1)
        toReturn.push(child);
    });
    return {children: toReturn};
  }, 
  { 
     mapparams: [
       "4d93b112c68c993eae000001", //example parent id
       ["4d97963ec68c99528d000007", "4debbfd5c68c991bba000014"] //example embedded children ids
     ]
  }
).find()

I've abstracted my collection name to 'parent' and its embedded documents to 'children'. I pass in two parameters: the parent document ID and an array of the embedded document IDs that I want to retrieve from the parent. Those parameters are passed in as the third parameter to the mapReduce function.

In the map function I find the parent document in the collection (which I'm pretty sure uses the _id index) and emit its id and children to the reduce function.

In the reduce function, I take the passed in document and loop through each of the children, collecting the ones with the desired ID. Looping through all the children is not ideal, but I don't know of another way to find by ID on an embedded document.

I also assume in the reduce function that there is only one document emitted, since I'm searching by ID. If you expect more than one parent_id to match, then you will have to loop through the values array in the reduce function.
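For what it's worth, here is a minimal sketch of that looping variant, written so it can be tried outside MongoDB as plain JavaScript. The name `reduceChildren` and the sample values are made up for illustration, and returning `ids` alongside `children` is an added assumption so that the output keeps the same shape as the emitted values:

```javascript
// Sketch of a reduce function that handles several emitted values per key.
// Each value has the shape emitted by the map function:
// { children: [...], ids: [...] }.
function reduceChildren(key, values) {
  var kept = [];
  var allIds = [];
  values.forEach(function (value) {
    value.children.forEach(function (child) {
      // keep a child only if its id is in the wanted list of this value
      if (value.ids.indexOf(child._id.toString()) != -1) {
        kept.push(child);
      }
    });
    allIds = allIds.concat(value.ids);
  });
  // Same shape as the input values, so reducing the result again is harmless.
  return { children: kept, ids: allIds };
}

// Made-up sample values with plain string ids.
var sampleValues = [
  { children: [{ _id: "a", n: 1 }, { _id: "b", n: 2 }], ids: ["b"] },
  { children: [{ _id: "c", n: 3 }], ids: ["c", "x"] }
];

var reduced = reduceChildren("someKey", sampleValues);
console.log(reduced.children.length); // 2
```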

I hope this helps someone out there, as I googled everywhere with no results. Hopefully we'll see a built-in feature from MongoDB soon, but until then I have to use this.

Fadi, as for "keeping embedded documents separate" - group should handle this with no issues

function getServiceData(collection, criteria) {

    var res=db[collection].group({
        cond: criteria,
        initial: {vals:[],globalVar:0},
        reduce: function(doc, out) {
            // keep every other document
            if (out.globalVar%2==0)
                out.vals.push(doc.whatever.kind.and.depth);
            out.globalVar++;
        },
        finalize: function(out) {
            if (out.vals.length==0)
                out.vals='sorry, no data';
            return out.vals;
        }
    });

    return res[0];
};
Licensed under: CC-BY-SA with attribution