Question

I am playing around with MongoDB trying to figure out how to do a simple

SELECT province, COUNT(*) FROM contest GROUP BY province

But I can't seem to figure it out using the aggregate function. I can do it using some really weird group syntax

db.user.group({
    "key": {
        "province": true
    },
    "initial": {
        "count": 0
    },
    "reduce": function(obj, prev) {
        if (true != null) if (true instanceof Array) prev.count += true.length;
        else prev.count++;
    }
});

But is there an easier/faster way using the aggregate function?

Was it helpful?

Solution

This would be the easier way to do it using aggregate:

db.contest.aggregate([
    {"$group" : {_id:"$province", count:{$sum:1}}}
])

OTHER TIPS

I need some extra operation based on the result of aggregate function. Finally I've found some solution for aggregate function and the operation based on the result in MongoDB. I've a collection Request with field request, source, status, requestDate.

Single Field Group By & Count:

db.Request.aggregate([
    {"$group" : {_id:"$source", count:{$sum:1}}}
])

Multiple Fields Group By & Count:

db.Request.aggregate([
    {"$group" : {_id:{source:"$source",status:"$status"}, count:{$sum:1}}}
])

Multiple Fields Group By & Count with Sort using Field:

db.Request.aggregate([
    {"$group" : {_id:{source:"$source",status:"$status"}, count:{$sum:1}}},
    {$sort:{"_id.source":1}}
])

Multiple Fields Group By & Count with Sort using Count:

db.Request.aggregate([
    {"$group" : {_id:{source:"$source",status:"$status"}, count:{$sum:1}}},
    {$sort:{"count":-1}}
])

If you need multiple columns to group by, follow this model. Here I am conducting a count by status and type:

  db.BusinessProcess.aggregate({
    "$group": {
        _id: {
            status: "$status",
            type: "$type"
        },
        count: {
            $sum: 1
        }
    }
   })

Starting in MongoDB 3.4, you can use the $sortByCount aggregation.

Groups incoming documents based on the value of a specified expression, then computes the count of documents in each distinct group.

https://docs.mongodb.com/manual/reference/operator/aggregation/sortByCount/

For example:

db.contest.aggregate([
    { $sortByCount: "$province" }
]);

Additionally if you need to restrict the grouping you can use:

db.events.aggregate( 
    {$match: {province: "ON"}},
    {$group: {_id: "$date", number: {$sum: 1}}}  
)

This type of query worked for me:

 db.events.aggregate({$group: {_id : "$date", number:  { $sum : 1} }} )

See http://docs.mongodb.org/manual/tutorial/aggregation-with-user-preference-data/

Starting in Mongo 5.0, we can also use { $count: { } } as an alias for { $sum : 1 }:

// { "province" : "Champagne-Ardenne" }
// { "province" : "Champagne-Ardenne" }
// { "province" : "Haute-Normandie"   }
db.collection.aggregate([
  { $group: { _id: "$province", count: { $count: {} } } }
])
// { "_id" : "Champagne-Ardenne", "count" : 2 }
// { "_id" : "Haute-Normandie",   "count" : 1 }
    db.contest.aggregate([
        { $match:{.....May be some match criteria...}},
        { $project: {"province":1,_id:0}},
        { $sortByCount: "$province" }
    ],{allowDiskUse:true});

MongoDB have 32 MB limitation of sorting operation on memory, use allowDiskUse : true this option, when you expose this query upfront of millions of data, it will sort at disk level not in memory. MongoDB aggregation pipeline has 100MB limitation, so use $project to reduce the data flowing to next pipeline. If you are using small data then no need to use allowDiskUse option.

Mongo shell command that worked for me:

db.getCollection(<collection_name>).aggregate([{"$match": {'<key>': '<value to match>'}}, {"$group": {'_id': {'<group_by_attribute>': "$group_by_attribute"}}}])
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top