Question

I have a large Mongo collection from which I'd like to dump a subset to copy to a staging server for testing purposes. This collection includes fields that are ObjectIDs of GridFS files. I can get the subset of the collection easily enough using mongodump's --query flag, but I can't figure out any easy way to also dump just the GridFS files and chunks referenced by the matching records in the main collection. What would be the least-painful way to accomplish this?

(It wouldn't especially surprise me if there's just not any straightforward way to do the export using only Mongo's command-line tools, so if that's the case, I'd also be interested in a way to do the export programmatically, but produce output that could be imported with standard tools like mongorestore. Python's mongo drivers are the ones with which I'm most comfortable, but I'm not picky.)

Solution

There's currently nothing built in to do this, so your best option is to write a Python script.

It's best not to use mongodump for this. Instead, write the Python script to read from the original server and insert directly into the staging server. If you are doing it at the document level, then for each GridFS file copy the chunks first and the files document second, so that a file never shows up on staging before its data is complete. If you are using the Python gridfs class, just read each file from the original server and save it to the staging server.
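A minimal sketch of the gridfs approach, under some assumptions: the function names, the query, and the field names are hypothetical, and the referenced ObjectIDs are assumed to sit in top-level fields of each document. Each file is copied before the document that references it, and the original _id is reused so the reference stays valid on staging.

```python
def copy_subset(src_coll, dst_coll, src_fs, dst_fs, query, file_fields):
    """Copy documents matching `query`, plus the GridFS files they reference.

    `file_fields` lists the (assumed top-level) fields holding GridFS file ids.
    Returns (documents copied, files copied).
    """
    copied_docs = 0
    copied_files = 0
    for doc in src_coll.find(query):
        for field in file_fields:
            file_id = doc.get(field)
            # Skip missing references and files already present on staging.
            if file_id is None or dst_fs.exists(file_id):
                continue
            grid_out = src_fs.get(file_id)
            # Reuse the original _id so the reference in `doc` stays valid.
            extra = {"_id": file_id}
            if grid_out.filename is not None:
                extra["filename"] = grid_out.filename
            if grid_out.metadata is not None:
                extra["metadata"] = grid_out.metadata
            # Reads the whole file into memory; fine for modest file sizes.
            dst_fs.put(grid_out.read(), **extra)
            copied_files += 1
        # Upsert so the script can be re-run without duplicate-key errors.
        dst_coll.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        copied_docs += 1
    return copied_docs, copied_files


def open_databases(src_uri, dst_uri, db_name):
    """Connect to both servers (the URIs and database name are placeholders)."""
    # Imported here so copy_subset can be exercised without PyMongo installed.
    import gridfs
    from pymongo import MongoClient

    src_db = MongoClient(src_uri)[db_name]
    dst_db = MongoClient(dst_uri)[db_name]
    return src_db, dst_db, gridfs.GridFS(src_db), gridfs.GridFS(dst_db)
```

With real servers, this would be called along the lines of copy_subset(src_db["records"], dst_db["records"], src_fs, dst_fs, {"project": "demo"}, ["attachment_id"]), where the collection name, query, and field name are placeholders for your own. The gridfs class writes the chunks and the files document for you, so the ordering concern above only applies if you copy the underlying collections by hand.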

See the PyMongo GridFS documentation.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow