BSON to Messagepack
14-07-2021
Question
The problem I am facing is that BSON comes with ObjectId and Timestamp types, which Messagepack does not support, and it isn't possible to define a custom serializer for Messagepack (at least as far as I know).
I wrote a piece of Python code to compare pymongo's BSON against msgpack. Without much optimization I could achieve a 300% performance improvement.
So, is there any way to convert BSON to Messagepack?
Solution
Here is how I solved the problem.
Unfortunately, since MongoDB's non-REST API doesn't come with a Strict or JS mode for document retrieval (as opposed to its REST API, in which you can specify the format you want to use to retrieve a document), we are left with no option but to do the conversion manually.
import json
from bson import json_util
from pymongo import MongoClient  # replaces Connection, which was removed in PyMongo 3
import msgpack

con = MongoClient()
db = con.test
col = db.collection
d = col.find().limit(1)[0]
# s is a JSON string; BSON-only types are rewritten (e.g. ObjectId => {"$oid": "..."})
s = json.dumps(d, default=json_util.default)
packer = msgpack.Packer()
packed = packer.pack(s)  # Messagepack can pack s because it is now a plain string
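As a variation on the code above, the intermediate JSON string can probably be skipped: msgpack-python's packb and Packer accept a default callable that is invoked for any object they cannot pack natively. The following is only a sketch under assumptions. The names bson_default and pack_document are hypothetical, and the "$oid", "$timestamp" and "$date" keys simply mirror what json_util produces; any other convention would work as long as the reader of the data knows it.

```python
import calendar
import datetime

try:
    # ObjectId and Timestamp ship with PyMongo's bson package.
    from bson import ObjectId, Timestamp
except ImportError:
    # Empty tuples make the isinstance checks below always False,
    # so the hook stays importable without PyMongo.
    ObjectId = Timestamp = ()


def bson_default(obj):
    """Hypothetical `default` hook: map BSON-only types to plain dicts."""
    if isinstance(obj, ObjectId):
        return {"$oid": str(obj)}
    if isinstance(obj, Timestamp):
        return {"$timestamp": {"t": obj.time, "i": obj.inc}}
    if isinstance(obj, datetime.datetime):
        # Naive datetimes are treated as UTC (PyMongo's default on reads).
        millis = calendar.timegm(obj.utctimetuple()) * 1000
        millis += obj.microsecond // 1000
        return {"$date": millis}
    raise TypeError("cannot serialize %r" % (obj,))


def pack_document(doc):
    """Pack a MongoDB document without the intermediate JSON string."""
    import msgpack  # third-party (msgpack-python), imported lazily
    return msgpack.packb(doc, default=bson_default)
```

With this in place, pack_document(d) would take the place of the json.dumps and packer.pack pair. Whether it is actually faster than the JSON route has not been measured here.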
The awesome observation is that even with the extra json.dumps step, the Messagepack serializer is faster than BSON encoding, though not 3 times faster. For 10000 repetitions the difference is three tenths of a second.
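A timing comparison of this kind can be reproduced with the standard timeit module. The sketch below is not the original benchmark: time_encoder and compare are hypothetical helper names, and the two json.dumps variants are only stand-ins so the snippet runs without third-party packages. For the real measurement you would pass PyMongo's BSON encoder and the json.dumps plus Messagepack path as the callables.

```python
import json
import timeit


def time_encoder(encode, document, repetitions=10000):
    """Return the total seconds spent calling encode(document)."""
    return timeit.timeit(lambda: encode(document), number=repetitions)


def compare(encoders, document, repetitions=10000):
    """Map each encoder name to its total time in seconds."""
    return {
        name: time_encoder(encode, document, repetitions)
        for name, encode in encoders.items()
    }


if __name__ == "__main__":
    sample = {"name": "example", "values": list(range(20))}
    # Stand-in encoders; swap in the BSON and Messagepack callables here.
    encoders = {
        "json": json.dumps,
        "json_sorted": lambda doc: json.dumps(doc, sort_keys=True),
    }
    for name, seconds in compare(encoders, sample).items():
        print("%s: %.3f s" % (name, seconds))
```

Timing the same document with the same number of repetitions for every encoder keeps the totals directly comparable.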