Question

The problem I am facing is that BSON comes with ObjectId and Timestamp, which are not supported in MessagePack, and it isn't possible to define a custom serializer for MessagePack (at least as far as I know).
I wrote a piece of Python code to compare pymongo's BSON vs msgpack. Without much optimization I could achieve a 300% performance improvement. So, is there any way to convert BSON to MessagePack?

Solution

Here is how I solved the problem.
Unfortunately, since MongoDB's non-REST API doesn't come with a Strict or JS mode for document retrieval (as opposed to its REST API, in which you can specify the format you want to use to retrieve a document), we are left with no option but to do the conversion manually.

import json
from bson import json_util
from pymongo import MongoClient  # the original used the older Connection class without importing it
import msgpack

con = MongoClient()
db = con.test
col = db.collection
d = col.find().limit(1)[0]

# s is a JSON string; BSON-only types become JSON-compatible (ObjectId => {"$oid": "..."})
s = json.dumps(d, default=json_util.default)
packer = msgpack.Packer()
packed = packer.pack(s)  # MessagePack can pack s because it is now a plain JSON string
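
For what it's worth, msgpack-python's packb and Packer accept a default callable that is invoked for objects they cannot serialize, much like json.dumps does, so the intermediate JSON string may be avoidable. Below is a minimal sketch under that assumption. encode_special and pack_document are hypothetical names, the {"$oid": ...} and {"$date": ...} shapes simply mirror the extended JSON form that json_util produces, and ObjectId is matched by class name only so that the sketch does not need pymongo installed.

```python
import datetime


def encode_special(obj):
    """Hypothetical `default` hook: turn BSON-only values into plain dicts."""
    # Matching on the class name is an assumption made to keep this sketch
    # free of a pymongo import; isinstance(obj, bson.ObjectId) is more robust.
    if type(obj).__name__ == "ObjectId":
        return {"$oid": str(obj)}
    if isinstance(obj, datetime.datetime):
        return {"$date": obj.isoformat()}
    raise TypeError("cannot serialize %r" % (obj,))


def pack_document(doc):
    """Pack a document with msgpack, calling encode_special for unknown types."""
    import msgpack  # third-party; imported here so the hook is usable without it
    return msgpack.packb(doc, default=encode_special)
```

pack_document(d) would then stand in for the json.dumps call followed by packer.pack(s). Note that the result unpacks to nested dicts rather than to a JSON string.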

The awesome observation is that even with the extra json.dumps step, the MessagePack serializer is faster than BSON encoding, though not 3 times faster. For 10,000 repetitions the difference is three tenths of a second.
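
A comparison like this can be reproduced with a small timing helper. This is only a sketch, not the code used for the numbers above: time_encoder is a hypothetical name, the sample document is made up, and json.dumps is used as the encoder so the snippet runs with the standard library alone.

```python
import json
import timeit


def time_encoder(encode, doc, repetitions=10000):
    """Return the total seconds taken to call encode(doc) `repetitions` times."""
    return timeit.timeit(lambda: encode(doc), number=repetitions)


if __name__ == "__main__":
    sample = {"name": "test", "values": list(range(10))}  # made-up document
    print("json.dumps: %.3f s" % time_encoder(json.dumps, sample))
```

With pymongo and msgpack installed, the same helper could be called once with bson.BSON.encode and once with a function that does json.dumps followed by msgpack.packb, using a document fetched from the collection instead of the made-up sample.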

Licensed under: CC-BY-SA with attribution