There are a few rules you must follow when writing map-reduce code in MongoDB. One is that the value you emit (emit produces key/value pairs) must have the same format as the value your reduce function returns. If you emit(this.key, this.value) then reduce must return exactly the same type that this.value has. If you emit({},1) then reduce must return a number. If you emit({},{category: this.category}) then reduce must return a document of the format {category:"string"} (assuming category is a string).
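To see why the shapes have to match, here is a minimal sketch in plain JavaScript (no MongoDB involved). It assumes reduce may be run on separate batches and then run again on the partial results. The names badReduce, goodReduce and reduceInBatches are made up for illustration.

```javascript
// Hypothetical mismatched pair: map emits a document, reduce returns a number.
function badReduce(key, values) {
  return values.length; // a number, not the emitted {category: ...} shape
}

// Matching pair: map emits 1, reduce returns a number.
function goodReduce(key, values) {
  return values.reduce(function (sum, v) { return sum + v; }, 0);
}

// Simulate reducing two batches separately, then re-reducing the partial results.
function reduceInBatches(reduceFn, key, batchA, batchB) {
  return reduceFn(key, [reduceFn(key, batchA), reduceFn(key, batchB)]);
}

var docs = [{category: "a"}, {category: "b"}, {category: "a"}];
var badTotal = reduceInBatches(badReduce, {}, docs.slice(0, 2), docs.slice(2));
var goodTotal = reduceInBatches(goodReduce, {}, [1, 1], [1]);

console.log(badTotal);  // 2, although three documents were emitted
console.log(goodTotal); // 3
```

The mismatched version loses a document because the second pass counts partial results instead of adding them up.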
That clearly can't be what you want, since you want totals, so let's look at what your reduce is returning and work out from that what you should be emitting.
It looks like at the end you want to accumulate a single document with one key per category, where each value is the number of times that category occurs. Something like:
{category_name1:total, category_name2:total}
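If that's the case then the correct map function would emit({}, {<value of this.category>: 1}), in which case your reduce will need to add up the numbers for each key corresponding to a category. Note that you can't write the key literally as "this.category" inside an object literal, because it would not be evaluated, so the map has to build the object in two steps.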
Here is what the map should look like:
map=function (){
    // build {<category name>: 1}; var keeps category from leaking into the global scope
    var category = {};
    category[this.category]=1;
    emit({},category);
}
And here is the correct corresponding reduce:
reduce=function (key,values) {
    var category_count = {};
    values.forEach(function(value){
        // each value is either an emitted {name: 1} or an earlier reduce result
        for (var cat in value) {
            if( !category_count.hasOwnProperty(cat) ) category_count[cat]=0;
            category_count[cat] += value[cat];
        }
    });
    return category_count;
}
Note that it satisfies two other requirements for MapReduce - it works correctly if the reduce function is never called (which will be the case if there is only one document in your collection) and it will work correctly if the reduce function gets called multiple times (which is what's happening when you have more than 100 documents).
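You can check the second property yourself in plain JavaScript. This is only a sketch: the reduce is copied from above so the snippet runs on its own, and the emitted values (books, music) are made-up sample data.

```javascript
// Same logic as the reduce above, copied so the sketch is self-contained.
function reduce(key, values) {
  var categoryCount = {};
  values.forEach(function (value) {
    for (var cat in value) {
      if (!categoryCount.hasOwnProperty(cat)) categoryCount[cat] = 0;
      categoryCount[cat] += value[cat];
    }
  });
  return categoryCount;
}

// Hypothetical emitted values for four documents.
var emitted = [{books: 1}, {music: 1}, {books: 1}, {books: 1}];

// Reduce everything in one call...
var allAtOnce = reduce({}, emitted);
// ...or reduce two batches and then reduce the partial results again.
var inBatches = reduce({}, [
  reduce({}, emitted.slice(0, 2)),
  reduce({}, emitted.slice(2))
]);

console.log(allAtOnce); // { books: 3, music: 1 }
console.log(inBatches); // { books: 3, music: 1 }
```

Both paths give the same totals, and a single emitted value such as {books: 1} already has the final shape, so nothing breaks if reduce is skipped.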
A more conventional way to do that would be to emit category name as key and the number as value. This simplifies map and reduce:
map=function() {
emit(this.category, 1);
}
reduce=function(key,values) {
    var count=0;
    // values holds 1s from map and/or partial counts from earlier reduce calls
    values.forEach(function(val) {
        count+=val;
    });
    return count;
}
This will sum the number of times each category appears. This also satisfies requirements for MapReduce - it works correctly if the reduce function is never called (which will be the case for any category that only appears once) and it will work correctly if the reduce function gets called multiple times (which will happen if any category appears more than 100 times).
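If you want to watch reduce being skipped for single-value keys, here is a rough in-memory stand-in for mapReduce. It is not how MongoDB implements it. runMapReduce is a hypothetical helper, and unlike the shell version the map receives emit as an argument so the sketch needs no globals.

```javascript
// Hypothetical in-memory stand-in for mapReduce, for illustration only.
function runMapReduce(docs, map, reduce) {
  var groups = {};
  var reduceCalls = 0;
  // Collect emitted values per key.
  function emit(key, value) {
    if (!groups.hasOwnProperty(key)) groups[key] = [];
    groups[key].push(value);
  }
  docs.forEach(function (doc) { map.call(doc, emit); });

  var results = {};
  Object.keys(groups).forEach(function (key) {
    var values = groups[key];
    if (values.length === 1) {
      results[key] = values[0]; // reduce is skipped for a single value
    } else {
      results[key] = reduce(key, values);
      reduceCalls++;
    }
  });
  return {results: results, reduceCalls: reduceCalls};
}

// Same logic as the map and reduce above; emit is passed in (an assumption of this sketch).
function map(emit) { emit(this.category, 1); }
function reduce(key, values) {
  var count = 0;
  values.forEach(function (val) { count += val; });
  return count;
}

// Made-up sample documents.
var sampleDocs = [{category: "books"}, {category: "music"}, {category: "books"}];
var outcome = runMapReduce(sampleDocs, map, reduce);

console.log(outcome.results);     // { books: 2, music: 1 }
console.log(outcome.reduceCalls); // 1
```

Here music appears once, so its emitted 1 is used as the result directly and reduce only runs for books.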
As others pointed out, the aggregation framework makes the same exercise much simpler with:
db.collection.aggregate({$group:{_id:"$category",count:{$sum:1}}})
although that matches the format of the second mapReduce I showed, not the original format you had, which outputs category names as keys. However, the aggregation framework will always be significantly faster than MapReduce.
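If you do need category names as keys, you can reshape the aggregation output on the client. A minimal sketch, assuming the result documents have already been collected into a plain array (how you get that array depends on your shell or driver version). toCategoryTotals and the sample data are made up for illustration.

```javascript
// Hypothetical result documents, in the shape the $group stage produces.
var results = [
  {_id: "books", count: 3},
  {_id: "music", count: 1}
];

// Fold the per-category documents into one {category_name: total} object.
function toCategoryTotals(docs) {
  var totals = {};
  docs.forEach(function (doc) {
    totals[doc._id] = doc.count;
  });
  return totals;
}

console.log(toCategoryTotals(results)); // { books: 3, music: 1 }
```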