"A pipeline stage specification object must contain exactly one field" when using OrderedDict

StackOverflow https://stackoverflow.com/questions/23112221

  •  04-07-2023

Question

I am trying to run an aggregate command with pymongo:

request = collections.OrderedDict([
        ("$unwind", "$tags" ),
        ("$group", { "_id" : "$tags" , "count" : { "$sum" : 1 }  } ),
        ("$project", { "_id" : 0, "tag" : "$_id" , "count" : 1 } ),
        ("$sort", { "count" : -1 } ),
        ("$limit", 3 )])

print client.devoxx.talks.aggregate(request)

But MongoDB rejects it:

pymongo.errors.OperationFailure: command SON([('aggregate', u'talks'), ('pipeline', [OrderedDict([('$unwind', '$tags'), ('$group', {'count': {'$sum': 1}, '_id': '$tags'}), ('$project', {'count': 1, '_id': 0, 'tag': '$_id'}), ('$sort', {'count': -1}), ('$limit', 3)])])]) failed: exception: A pipeline stage specification object must contain exactly one field.

It seems to me that I have each aggregate stage in one item of the ordered dict.


Solution

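This is probably very pymongo specific, but it is also unnecessary. As the error message shows, the whole OrderedDict ends up as a single element of the pipeline, so the server sees one stage with five fields instead of five stages with one field each. The standard form of an aggregation pipeline is an array with one document per stage, so it can simply be specified as a list, for example: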

request = [{ "$unwind": "$tags"}, { "$group": { "_id": "$tags" } }]

A list always serializes in order, so the ordering of the stages presents no problem.
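Applied to the pipeline from the question, the list form could look like the sketch below. The stage contents and the `client.devoxx.talks` collection are taken from the question; the `aggregate` call is left commented out because it assumes a running MongoDB server and an existing `client`.

```python
# Each stage is its own single-key dict; the surrounding list keeps the
# stages in the order they should run.
pipeline = [
    {"$unwind": "$tags"},
    {"$group": {"_id": "$tags", "count": {"$sum": 1}}},
    {"$project": {"_id": 0, "tag": "$_id", "count": 1}},
    {"$sort": {"count": -1}},
    {"$limit": 3},
]

# Assumes `client` is a connected pymongo MongoClient, as in the question:
# result = client.devoxx.talks.aggregate(pipeline)
```

Every element of the list has exactly one field, which is what the error message asks for.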

So there is no need to use an OrderedDict.

You are perhaps thinking of the mongo shell, which (as of 2.6) allows the stages to be passed as separate arguments without wrapping them in an array. Even in that form each stage is still its own document. The shell also keeps the keys of a "dictionary/hash" in the order they were written, which a plain dict in Python 2 does not guarantee.

So the array/list syntax is still the preferred form.
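If a pipeline has already been written as an OrderedDict, as in the question, it can be converted rather than retyped. This is only a sketch: `to_pipeline` is a hypothetical helper name, not part of pymongo, and it assumes each stage name appears at most once, since a mapping cannot hold the same key twice (for example two `$match` stages).

```python
import collections


def to_pipeline(stages):
    """Split an ordered mapping of stage name -> spec into one-field stage dicts."""
    return [{name: spec} for name, spec in stages.items()]


# The OrderedDict from the question, unchanged.
request = collections.OrderedDict([
    ("$unwind", "$tags"),
    ("$group", {"_id": "$tags", "count": {"$sum": 1}}),
    ("$project", {"_id": 0, "tag": "$_id", "count": 1}),
    ("$sort", {"count": -1}),
    ("$limit", 3),
])

# A list of five single-field documents, in the original order.
pipeline = to_pipeline(request)
```

The resulting `pipeline` can be passed to `aggregate` in place of `request`.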

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow