Question

I have a BSON file, xyz.bson, full of useful data, and I'd like to query and process it using Python. Is there a simple example or tutorial I can get started with?

I don't understand this one.

Solution

You could use the mongorestore command to import the data into a MongoDB server, then connect to that server from Python and query it there.
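
A minimal sketch of that workflow, assuming the PyMongo driver is installed and a mongod is running locally. The database name "mydb", the collection name "xyz", and the helper names are placeholders chosen for illustration, not anything the question specifies.

```python
# Assumed one-time import, run from a shell before this script:
#   mongorestore --db mydb --collection xyz xyz.bson


def find_documents(collection, filters, limit=10):
    """Return up to `limit` documents from `collection` that match `filters`."""
    return list(collection.find(filters, limit=limit))


def open_collection(uri="mongodb://localhost:27017", db_name="mydb", coll_name="xyz"):
    """Connect to the server and return the restored collection (names are placeholders)."""
    # Imported lazily so the helper above can be used without the driver installed.
    from pymongo import MongoClient
    return MongoClient(uri)[db_name][coll_name]


# Example usage (needs a running mongod that holds the restored data):
#   for doc in find_documents(open_collection(), {"name": "Stark"}):
#       print(doc)
```

Once the data is in a server, the usual MongoDB query operators are available, which is the main reason to accept the extra import step.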

OTHER TIPS

If you want to stream the data as though it were a flat JSON file on disk rather than loading it into a mongod, you can use this small python-bson-streaming library:

https://github.com/bauman/python-bson-streaming

from sys import argv

from bsonstream import KeyValueBSONInput

for path in argv[1:]:
    # BSON is a binary format, so open the file in binary mode
    with open(path, 'rb') as f:
        # remove fast_string_prematch if you do not need it
        stream = KeyValueBSONInput(fh=f, fast_string_prematch="something")
        for doc_id, dict_data in stream:
            if doc_id:
                print(dict_data)  # replace with your own processing
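
To illustrate why streaming works at all, here is a rough standard-library sketch that is not part of the library above. It assumes the file is a plain concatenation of BSON documents (the layout mongodump writes), each starting with a 4-byte little-endian length that counts the whole document. The function name iter_raw_bson is made up for this example.

```python
import struct


def iter_raw_bson(fh):
    """Yield each BSON document in a binary stream as raw bytes, one at a time."""
    while True:
        header = fh.read(4)
        if not header:
            return  # clean end of file
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        # The prefix is a little-endian int32 covering the whole document,
        # including the prefix itself and the trailing NUL byte.
        (size,) = struct.unpack("<i", header)
        if size < 5:
            raise ValueError("invalid document length: %d" % size)
        body = fh.read(size - 4)
        if len(body) != size - 4 or body[-1] != 0:
            raise ValueError("truncated or malformed document")
        yield header + body


# Example usage; decoding assumes the `bson` package that ships with PyMongo:
#   import bson
#   with open("xyz.bson", "rb") as f:
#       for raw in iter_raw_bson(f):
#           print(bson.decode(raw))
```

Only one document is held in memory at a time, so this approach scales to files larger than RAM.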

You can use sonq to query a .bson file directly from bash, or import it and use it as a library in Python.

A few examples:

  • Query a .bson file:

      sonq -f '{"name": "Stark"}' source.bson

  • Convert query results to a newline-separated .json file:

      sonq -f '{"name": {"$ne": "Stark"}}' -o target.json source.bson

  • Query a .bson file in Python:

      from sonq.operation import query_son
      record_list = list(query_son('source.bson', filters={"name": {"$in": ["Stark"]}}))

Licensed under: CC-BY-SA with attribution