Question

Embedded vs link

I'm looking for the fastest way to search a Newsletter document for a connected Email. So far I have used MongoMapper with one document for Newsletter and another for Email. This is getting really slow with 100k+ emails.

I was thinking maybe it's faster to embed the emails in an array inside Newsletter, since I'm really only interested in the email address ('someemail@email.com') and not any logic around it.

1) Is it possible at all to embed as many as 100k-500k emails in one document?
2) Is Mongoid better/faster for this?

I add the email only if it is not already in the collection, checking like this:

email = newsletter.emails.first(:email => 'someemail@email.com')
unless email
    email = Email.new(:email => 'someemail@email.com', :newsletter_id => newsletter.id)
    email.save
end

And I think this is where it all starts to hurt.

Here is how they are connected:

class Newsletter
   include MongoMapper::Document
   many :emails
   ...
end

class Email
   include MongoMapper::Document
   key :email, String
   key :newsletter_id, ObjectId
   belongs_to :newsletter
end

Would love any help on this :)


Solution

MongoDB currently has a maximum document size of 16 MB. Using MongoMapper or Mongoid makes no difference to this.

See http://www.mongodb.org/display/DOCS/Documents

Embedded documents should be considerably quicker, though, if you can fit all the emails within the limit. That could be a squeeze.
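To get a feel for how tight the squeeze is, here is a rough sketch that estimates the BSON size of an array of address strings. It assumes the array layout described in the BSON spec (each element stores a type byte, its index as a string key, and a length-prefixed string). `bson_string_array_size` is a hypothetical helper written for this estimate, not a driver call, and the sample data is made up.

```ruby
# Rough size of a BSON array of strings, following the BSON spec layout:
#   array   = int32 total length + elements + 0x00 terminator
#   element = type byte + index as a C-string key ("0", "1", ...) + string value
#   string  = int32 length + UTF-8 bytes + 0x00 terminator
# bson_string_array_size is a hypothetical helper, not part of any driver.
def bson_string_array_size(strings)
  elements = strings.each_with_index.sum do |str, index|
    key_bytes = index.to_s.bytesize + 1   # index digits + null terminator
    value_bytes = 4 + str.bytesize + 1    # length prefix + bytes + null terminator
    1 + key_bytes + value_bytes           # leading type byte
  end
  4 + elements + 1
end

MAX_DOCUMENT_BYTES = 16 * 1024 * 1024 # the 16 MB document limit

# Made-up sample: 500k copies of the 19-character example address.
sample = Array.new(500_000) { 'someemail@email.com' }
size = bson_string_array_size(sample)
puts "#{size} bytes, fits: #{size < MAX_DOCUMENT_BYTES}"
```

With this sample the array lands only slightly under the limit, before counting the rest of the Newsletter document, so 500k addresses of realistic length would probably not fit, while 100k should leave plenty of room.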

If storing the whole email is too much, why not store just an array of email addresses embedded within the newsletter, with a reference to the full email?

You can then get the speed advantage you want, and keep the emails accessible outside of the newsletter.
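As a minimal sketch of that shape, the snippet below uses plain in-memory Ruby objects as stand-ins for the two collections. `NewsletterDoc`, `FULL_EMAILS` and `add_email` are hypothetical names for illustration only; a real app would use its MongoMapper models and let the database do the membership check.

```ruby
require 'set'

# Hypothetical in-memory stand-ins; a real app would use its MongoMapper models.
NewsletterDoc = Struct.new(:id, :email_addresses)
FULL_EMAILS = {} # stands in for the separate collection of full email records

# Adds the address only if it is not already embedded in the newsletter.
# Returns true when the address was added, false when it was a duplicate.
def add_email(newsletter, address)
  return false unless newsletter.email_addresses.add?(address)

  # Keep the full record outside the newsletter, referenced by newsletter id.
  FULL_EMAILS[[newsletter.id, address]] = { email: address, newsletter_id: newsletter.id }
  true
end

newsletter = NewsletterDoc.new(1, Set.new)
add_email(newsletter, 'someemail@email.com') # added
add_email(newsletter, 'someemail@email.com') # duplicate, ignored
puts newsletter.email_addresses.size
```

In MongoDB itself the embedded field would be an ordinary array rather than a `Set`. The `$addToSet` update operator gives the same add-only-if-missing behaviour on the server side.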

Licensed under: CC-BY-SA with attribution