embed vs link on large mongoDB sets (ruby)
29-09-2019
Question
Embedded vs link
I'm looking for the fastest way to search a Newsletter document for a connected Email. So far I have used MongoMapper with one document for Newsletter and another for Email. This is getting really slow with 100k+ Emails.
I was thinking maybe it's faster to embed the emails in an array inside Newsletter, since I'm really only interested in the email address ('someemail@email.com') and not any logic around it.
1) Is it possible at all to embed as many as 100k-500k emails in one document?
2) Is Mongoid better/faster for this?
I add the email only if it is not already in the collection, checking like this:
email = newsletter.emails.first(:email => 'someemail@email.com')
unless email
  email = Email.new(:email => 'someemail@email.com', :newsletter_id => newsletter.id)
  email.save
end
And I think this is where it all starts to hurt.
Here is how they are connected:
class Newsletter
  include MongoMapper::Document
  many :emails
  ...
end

class Email
  include MongoMapper::Document
  key :email, String
  key :newsletter_id, ObjectId
  belongs_to :newsletter
end
Would love any help on this :)
Solution
MongoDB currently has a maximum document size of 16 MB, and using MongoMapper or Mongoid makes no difference to this limit.
See http://www.mongodb.org/display/DOCS/Documents
Embedded documents should be considerably quicker, though. Whether you can fit all the emails within the limit is another matter; it could be a squeeze.
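To get a feel for how tight the squeeze is, here is a rough back-of-the-envelope sketch in plain Ruby. It estimates the BSON size of an array of strings by hand, using the per-element layout from the BSON spec (type byte, array index as a key, length-prefixed string). The helper names and the sample addresses (24 bytes each) are made up for illustration, and the result is a lower bound, since the real document also carries the field name and the newsletter's other fields.

```ruby
# Rough BSON size of an array of strings, computed by hand (no driver needed).
# Each element is stored as: 1 type byte + the array index as a C-string key
# (decimal digits + NUL) + int32 string length + UTF-8 bytes + NUL.
def bson_string_array_bytes(strings)
  body = strings.each_with_index.sum do |s, i|
    1 + i.to_s.bytesize + 1 + 4 + s.bytesize + 1
  end
  4 + body + 1 # int32 total length + elements + trailing NUL
end

# The 16 MB document limit, in bytes.
MAX_BSON_BYTES = 16 * 1024 * 1024

# Hypothetical sample data: addresses of exactly 24 bytes each.
def sample_emails(count)
  Array.new(count) { |i| format('user%08d@example.com', i) }
end

[100_000, 500_000].each do |count|
  bytes = bson_string_array_bytes(sample_emails(count))
  verdict = bytes <= MAX_BSON_BYTES ? 'fits' : 'does not fit'
  puts "#{count} emails: #{bytes} bytes, #{verdict}"
end
```

With these assumed 24-byte addresses, 100k emails come to roughly 3.6 million bytes and fit comfortably, while 500k come to roughly 18.4 million bytes and exceed the 16,777,216 byte limit. Longer real-world addresses would hit the limit sooner.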
If storing the whole email is too much, why not store just the email addresses within the newsletter, either as a plain array or as small embedded documents, each with a reference to the full email?
You can then get the speed advantage you want, and keep the emails accessible outside of the newsletter.
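If the addresses live in an array on the newsletter, the "look it up, then insert if missing" step from the question can probably be replaced by a single update using MongoDB's `$addToSet` operator, which appends a value only when the array does not already contain it. The sketch below only builds the filter and update documents as plain Ruby hashes; the `email_addresses` field name and the helper are hypothetical, and how you hand the hashes to your driver or ODM is left out.

```ruby
# Builds the filter and update documents for an atomic "add if missing".
# `email_addresses` is a hypothetical array field on the newsletter document.
def add_address_update(newsletter_id, address)
  filter = { '_id' => newsletter_id }
  # $addToSet appends the value only if the array does not already hold it,
  # so no separate lookup is needed before writing.
  update = { '$addToSet' => { 'email_addresses' => address } }
  [filter, update]
end

filter, update = add_address_update(42, 'someemail@email.com')
# Assumption: with the official Ruby driver (2.x) these would be passed as
# collection.update_one(filter, update). That call is not made here.
puts filter.inspect
puts update.inspect
```

The 16 MB limit still applies to the newsletter document, so this only helps while the address array stays within it.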