Question

I have a local MongoDB database that I am starting to put some files into GridFS for caching purposes. What I want to know is:

Can I use db.cloneCollection() on another server to clone my fs.* collections? If I do that, will the GridFS system on that server work properly? Essentially I have to "pull" data from another machine that has the files in GridFS; I can't directly add them easily to the production box.

Edit: I was able to get onto my destination server and use the following commands from the mongo shell to pull the GridFS collections over from another MongoDB instance on our network.

use DBName
db.cloneCollection("otherserver:someport","fs.files")
db.cloneCollection("otherserver:someport","fs.chunks")

For future reference.
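A note for readers on newer versions: db.cloneCollection() was deprecated and later removed (as of MongoDB 4.2), so the same "pull" can be done from a driver instead. Below is a hedged sketch of that idea in Python; with pymongo, source and dest would be MongoClient(...)["DBName"] collections, but the copy logic is shown against any objects exposing find() and insert_many(), which pymongo collections do.

```python
def clone_collection(source, dest, batch_size=1000):
    """Copy every document from source to dest in batches.

    A sketch of what db.cloneCollection() did: read all documents
    from the remote collection and insert them locally. source and
    dest stand in for pymongo Collection objects (assumed names).
    """
    batch = []
    for doc in source.find():
        batch.append(doc)
        if len(batch) >= batch_size:
            dest.insert_many(batch)
            batch = []
    if batch:
        dest.insert_many(batch)

# Usage sketch (source_db and dest_db are hypothetical pymongo
# Database handles for the remote and local servers):
# for name in ("fs.files", "fs.chunks"):
#     clone_collection(source_db[name], dest_db[name])
```

Copying both fs.files and fs.chunks together is what keeps the GridFS data consistent, exactly as the two cloneCollection calls above do.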


Solution

The short answer is: of course you can. They are only collections, and there is nothing special about them at all. The longer answer requires explaining what GridFS actually is.

So the very first sentence on the manual page:

GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit of 16MB.

GridFS is not something that "MongoDB does". Internally to the server it is just two collections: one for the file's reference information, and one for the "chunks" that break the content up so that no individual document exceeds the 16MB limit. But the most important word here is "specification".

So the server itself does no magic at all. Storing the reference data and chunks is implemented entirely at the "driver" level, where in fact you can name the collections you wish to use rather than just accept the defaults. When reading and writing data, it is the driver that does the work: it pulls together the "chunks" referenced by the metadata document, or creates new chunks as data is sent to the server.
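To make the driver's role concrete, here is an illustrative sketch (not pymongo's actual code) of the split-and-reassemble work described above. The field names (files_id, n, data, length, chunkSize) follow the GridFS specification; the 255KB default chunk size matches current drivers.

```python
CHUNK_SIZE = 255 * 1024  # default chunk size in current drivers

def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """The documents a driver would insert into fs.chunks."""
    return [
        {"files_id": file_id, "n": n, "data": data[i:i + chunk_size]}
        for n, i in enumerate(range(0, len(data), chunk_size))
    ]

def make_files_doc(file_id, filename, data, chunk_size=CHUNK_SIZE):
    """The reference document a driver would insert into fs.files."""
    return {"_id": file_id, "filename": filename,
            "length": len(data), "chunkSize": chunk_size}

def reassemble(chunks):
    """What a driver does on read: order chunks by 'n' and concatenate."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))

payload = b"x" * (CHUNK_SIZE * 2 + 100)     # needs 3 chunks
chunks = split_into_chunks("file1", payload)
assert len(chunks) == 3
assert reassemble(chunks) == payload
```

Since the server only ever sees these ordinary inserts and queries, cloning fs.files and fs.chunks moves everything a driver needs.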

The other common misconception is that GridFS is the only method for dealing with "files" when sending content to MongoDB. As that first sentence says, it exists as a way to store content that exceeds the 16MB limit for BSON documents.

MongoDB has no problem storing binary data directly in a document, as long as the total document does not exceed the 16MB limit. So in many use cases (small image files used on websites, for example) the data is better stored in ordinary documents, avoiding the overhead of reading and writing across multiple collections.
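A minimal sketch of that decision, assuming a hypothetical helper and an assumed safety margin to leave room for the rest of the document's fields (only the 16MB BSON limit itself comes from the text above):

```python
BSON_MAX = 16 * 1024 * 1024   # the 16MB BSON document limit
SAFETY_MARGIN = 64 * 1024     # assumed headroom for _id and other fields

def should_use_gridfs(payload: bytes) -> bool:
    """True when the payload is too large to embed inline in one document."""
    return len(payload) > BSON_MAX - SAFETY_MARGIN

small_image = b"\x89PNG" + b"\x00" * 10_000   # ~10KB: store inline
huge_video = b"\x00" * (20 * 1024 * 1024)     # 20MB: needs GridFS

assert should_use_gridfs(small_image) is False
assert should_use_gridfs(huge_video) is True
```

The exact margin is a design choice; the point is simply that content under the limit can live in a normal document.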

So there is no internal server "magic". These are just ordinary collections that you can query, aggregate, mapReduce and even copy or clone.

Licensed under: CC-BY-SA with attribution