How to group query according to boolean field and retrieve all records separately by boolean value in MongoDB?

StackOverflow https://stackoverflow.com/questions/22787650

Question

The following insert statements set up the sample data:

db.users.insert({ courseId: 1, stDt: new Date(2014, 01, 01), endDt: new Date(2014, 01, 20), active: false }); 
db.users.insert({ courseId: 1, stDt: new Date(2014, 01, 25), endDt: new Date(2014, 02, 10), active: false }); 
db.users.insert({ courseId: 1, stDt: new Date(2014, 02, 25), endDt: new Date(2014, 03, 10), active: true }); 
db.users.insert({ courseId: 1, stDt: new Date(2014, 02, 28), endDt: new Date(2014, 06, 10), active: true }); 
db.users.insert({ courseId: 1, stDt: new Date(2014, 02, 25), endDt: new Date(2014, 02, 30), active: false }); 
db.users.insert({ courseId: 1, stDt: new Date(2014, 05, 25), endDt: new Date(2014, 10, 30), active: false }); 
db.users.insert({ courseId: 1, stDt: new Date(2013, 10, 01), endDt: new Date(2014, 08, 10), active: true }); 
db.users.insert({ courseId: 1, stDt: new Date(2014, 09, 01), endDt: new Date(2014, 11, 30), active: false }); 

Case 1: Can we create the following result set from the values inserted above? Here the records with "active" set to true and those with "active" set to false are retrieved separately, each group sorted by the "endDt" field, with a single query.

{ "result" : { 
    true : [ 
      { endDt: new Date(2014, 03, 10), active: true }, 
      { endDt: new Date(2014, 06, 10), active: true },  
      { endDt: new Date(2014, 08, 10), active: true }, 
   ],
   false: [ 
      { endDt: new Date(2014, 01, 20), active: false }, 
      { endDt: new Date(2014, 02, 10), active: false }, 
      { endDt: new Date(2014, 02, 30), active: false }, 
      { endDt: new Date(2014, 10, 30), active: false }, 
      { endDt: new Date(2014, 11, 30), active: false } 
   ]
  }
} 

Is it possible to do this with a single query?

Case 2: Can we produce the following result set?

{ "result" : [ 
    { endDt: new Date(2014, 01, 20), active: false }, 
    { endDt: new Date(2014, 02, 10), active: false }, 
    { endDt: new Date(2014, 02, 30), active: false }, 
    [   
        { endDt: new Date(2014, 03, 10), active: true }, 
        { endDt: new Date(2014, 06, 10), active: true },    
        { endDt: new Date(2014, 08, 10), active: true } 
    ], 
    { endDt: new Date(2014, 10, 30), active: false }, 
    { endDt: new Date(2014, 11, 30), active: false } 
  ]
} 

Here the records are sorted by the endDt field, but in place of the last record with "active" set to true there is an array containing all the records with "active" set to true.

I think the second case is not possible.

Solution 3

Case 1 can return the exact result set, but Case 2 is not a valid document.

The answer was actually given to me by my friend Doug on his blog; I am just copying it to SO.

Here is the query for Case 1; the comments explain each step.

db.users.aggregate([    
    // Project only what we need.
    {
        "$project": {
            "endDt": 1,
            "active": 1,
            "_id": 0
        }
    },

    // Group to true and false buckets. This will give us arrays with null.
    // We'll remove them in a bit.
    {
        "$group": {
            "_id": "$active",
            "true": {
                "$addToSet": {
                    "$cond": [
                        {"$eq": ["$active", true]},
                        "$endDt",
                        null
                    ]
                }
            },
            "false": {
                "$addToSet": {
                    "$cond": [
                        {"$eq": ["$active", false]},
                        "$endDt",
                        null
                    ]
                }
            }
        }
    },

    // We need to unwind our arrays so we can build them back up without the
    // "null" array.
    {"$unwind": "$true"},
    {"$unwind": "$false"},

    // Project out the values. This will give both a true and false key for
    // each item. This builds our arrays up with the proper endDt and
    // active values.
    {
        "$project": {
            "true": {"endDt": "$true", "active": "$_id"}, 
            "false": {"endDt": "$false", "active": "$_id"}
        }
    },

    // Project out a single value to clean up the "issue" a couple steps above.
    {
        "$project": {
            "value": {
                "$cond": [
                    {"$eq": ["$_id", true]},
                    "$true",
                    "$false"
                ]
            }
        }
    },

    // Group things up again to rebuild our arrays.
    // This adds a single "null" entry that will need to be cleaned up.
    {
        "$group": {
            "_id": null,
            "true": {
                "$addToSet": {
                     "$cond": [
                         {"$eq": ["$_id", true]},
                         "$value",
                         null
                     ]
                }
            },
            "false": {
                "$addToSet": {
                    "$cond": [
                        {"$eq": ["$_id", false]},
                        "$value",
                        null
                    ]
                }
            }
        }
    },

    // Unwind our arrays again so we can clean up one more time.
    {"$unwind": "$true"},
    {"$unwind": "$false"},

    // Match only documents where true and false are not null.
    {
        "$match": {
            "true": {"$ne": null},
            "false": {"$ne": null}
        }
    },

    // Sort our items so we can add them to the array in the correct order.
    // I'm not sure why it has to be descending order, but it works.
    {
        "$sort": {
            "true.endDt": -1,
            "false.endDt": -1
        }
    },

    // Group again to build our array.
    {
        "$group": {
            "_id": null,
            "true": {"$addToSet": "$true"},
            "false": {"$addToSet": "$false"}
        }
    },

    // Once again project out just the fields we need
    {
        "$project": {
            "true": 1,
            "false": 1,
            "_id": 0
        }
    }
])

Here are the returned results in MongoDB 2.2.x - 2.4.x:

{
    "result" : [
        {
            "true" : [
                {
                    "endDt" : ISODate("2014-04-10T06:00:00Z"),
                    "active" : true
                },
                {
                    "endDt" : ISODate("2014-07-10T06:00:00Z"),
                    "active" : true
                },
                {
                    "endDt" : ISODate("2014-09-10T06:00:00Z"),
                    "active" : true
                }
            ],
            "false" : [
                {
                    "endDt" : ISODate("2014-02-20T07:00:00Z"),
                    "active" : false
                },
                {
                    "endDt" : ISODate("2014-03-10T06:00:00Z"),
                    "active" : false
                },
                {
                    "endDt" : ISODate("2014-03-30T06:00:00Z"),
                    "active" : false
                },
                {
                    "endDt" : ISODate("2014-11-30T07:00:00Z"),
                    "active" : false
                },
                {
                    "endDt" : ISODate("2014-12-30T07:00:00Z"),
                    "active" : false
                }
            ]
        }
    ],
    "ok" : 1
}

And in the soon to be released version 2.6.0, the results look like the following:

{
    "true" : [
        {
            "endDt" : ISODate("2014-04-10T06:00:00Z"),
            "active" : true
        },
        {
            "endDt" : ISODate("2014-07-10T06:00:00Z"),
            "active" : true
        },
        {
            "endDt" : ISODate("2014-09-10T06:00:00Z"),
            "active" : true
        }
    ],
    "false" : [
        {
            "endDt" : ISODate("2014-02-20T07:00:00Z"),
            "active" : false
        },
        {
            "endDt" : ISODate("2014-03-10T06:00:00Z"),
            "active" : false
        },
        {
            "endDt" : ISODate("2014-03-30T06:00:00Z"),
            "active" : false
        },
        {
            "endDt" : ISODate("2014-11-30T07:00:00Z"),
            "active" : false
        },
        {
            "endDt" : ISODate("2014-12-30T07:00:00Z"),
            "active" : false
        }
    ]
}
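
The months in these results look one higher than in the insert statements because JavaScript's Date constructor counts months from zero, so new Date(2014, 3, 10) is 10 April 2014 at midnight in the shell's local time zone. That local-time interpretation is presumably also why the hour part of the ISODate values differs between the outputs on this page. A minimal sketch in plain JavaScript (no MongoDB needed); endLocal and endUtc are names made up for illustration:

```javascript
// Months in the Date constructor are zero-based: 3 means April, not March.
const endLocal = new Date(2014, 3, 10);   // midnight on 10 April 2014, local time
console.log(endLocal.getMonth());         // 3 (the zero-based index for April)

// Date.UTC builds the same calendar date independent of the machine's time zone.
const endUtc = new Date(Date.UTC(2014, 3, 10));
console.log(endUtc.toISOString());        // 2014-04-10T00:00:00.000Z
```

Storing dates built with Date.UTC is one way to get the same ISODate values regardless of where the shell runs.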

OTHER TIPS

The first one is simple; you can do something like this:

db.users.aggregate({$sort:{endDt:1}}, {$group:{_id:"$active", dates:{$push:"$endDt"}}})
{
    "_id" : true,
    "dates" : [
        ISODate("2014-04-10T07:00:00Z"),
        ISODate("2014-07-10T07:00:00Z"),
        ISODate("2014-09-10T07:00:00Z")
    ]
},
{
    "_id" : false,
    "dates" : [
        ISODate("2014-02-20T08:00:00Z"),
        ISODate("2014-03-10T07:00:00Z"),
        ISODate("2014-03-30T07:00:00Z"),
        ISODate("2014-11-30T08:00:00Z"),
        ISODate("2014-12-30T08:00:00Z")
    ]
}
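
To get the { endDt, active } subdocuments that Case 1 asks for rather than bare dates, the same $sort then $group idea can push a small object per record. The sketch below only builds the pipeline as a plain array so it can be inspected outside the shell. The field name items is an arbitrary choice, and the sketch assumes $push keeps documents in the order the preceding $sort delivers them, which is the behaviour the output above relies on.

```javascript
// Pipeline sketch: sort by end date first, then bucket by the boolean flag.
const pipeline = [
  { $sort: { endDt: 1 } },
  {
    $group: {
      // One bucket per distinct value of "active".
      _id: "$active",
      // Each bucket collects { endDt, active } objects in arrival order (assumed).
      items: { $push: { endDt: "$endDt", active: "$active" } }
    }
  }
];

// In the mongo shell this would be run as: db.users.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

This returns one document per value of active rather than a single document with true and false keys; if that exact shape matters, the two documents can be merged in application code.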

The second one is probably doable but you need to define more precisely what exactly you want returned.
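
One possible reading of Case 2 is: sort everything by endDt, gather all the active: true records into one nested array, and place that array where the last active: true record would fall. Under that assumption the shape can be built in application code from an ordinary sorted query result. The function name nestActive and the sample data are hypothetical, not part of any MongoDB API:

```javascript
// Build the Case 2 shape from records that are already sorted by endDt.
// Assumption: the nested array of active records sits at the position of
// the last active record; inactive records stay in sorted order around it.
function nestActive(sortedRecords) {
  const activeOnes = sortedRecords.filter(r => r.active);
  if (activeOnes.length === 0) return sortedRecords.slice();
  const lastActive = activeOnes[activeOnes.length - 1];
  const result = [];
  for (const r of sortedRecords) {
    if (!r.active) {
      result.push(r);             // inactive record: keep it in place
    } else if (r === lastActive) {
      result.push(activeOnes);    // last active record: emit the whole group here
    }
    // Earlier active records are skipped; they already live inside activeOnes.
  }
  return result;
}

// Hypothetical sample, with dates as ISO strings for brevity.
const sample = [
  { endDt: "2014-02-20", active: false },
  { endDt: "2014-04-10", active: true },
  { endDt: "2014-09-10", active: true },
  { endDt: "2014-11-30", active: false }
];
console.log(JSON.stringify(nestActive(sample)));
```

If a different placement rule is wanted, only the branch that pushes activeOnes needs to change.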

You can use the $group stage of the aggregation pipeline like this:

db.users.aggregate({'$group' : {'_id' : '$active','endDt' : {'$addToSet' : '$endDt'}}})
{
        "result" : [
                {
                        "_id" : true,
                        "endDt" : [
                                ISODate("2014-09-09T21:00:00Z"),
                                ISODate("2014-07-09T21:00:00Z"),
                                ISODate("2014-04-09T21:00:00Z")
                        ]
                },
                {
                        "_id" : false,
                        "endDt" : [
                                ISODate("2014-12-29T21:00:00Z"),
                                ISODate("2014-11-29T21:00:00Z"),
                                ISODate("2014-03-29T21:00:00Z"),
                                ISODate("2014-03-09T21:00:00Z"),
                                ISODate("2014-02-19T21:00:00Z")
                        ]
                }
        ],
        "ok" : 1
}

You can read more in the MongoDB docs.
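
Note that $addToSet treats the array as a set: duplicate values are dropped and the order of the elements is unspecified, so the descending order in the output above should not be relied on. A plain JavaScript analogy of the difference from $push follows; the helper names pushAll and addToSetAll are made up for illustration and only mimic the accumulators' behaviour:

```javascript
// Mimics $push: keeps every value, in arrival order.
function pushAll(values) {
  return values.slice();
}

// Mimics $addToSet: keeps one copy of each value. A JavaScript Set happens to
// keep insertion order, but MongoDB makes no such promise for $addToSet.
function addToSetAll(values) {
  return Array.from(new Set(values));
}

// Hypothetical end dates where two records share the same day.
const endDates = ["2014-03-10", "2014-03-30", "2014-03-10"];
console.log(pushAll(endDates).length);      // 3
console.log(addToSetAll(endDates).length);  // 2
```

With the sample data in the question every endDt is distinct, so the difference only shows up in the ordering.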

Licensed under: CC-BY-SA with attribution