Question

I have a "feeds" collection, and each feed has comments. When someone comments on a feed, they are added to "subscribers", which is a MongoDB multikey field.

feeds: {
 _id: ...,
 text: "Text",
 comments: [{by: "A", text: "asd"},{by: "B", text: "sdf"}],
 subscribers: ["A","B"]
}

Then when I need to get all feeds with new comments for user A, I ask for feeds with {subscribers: "A"}.

Usually there are 2-5 comments, but sometimes (on hot feeds) there might be more than 100 comments and more than 100 subscribers.

I know that it's not recommended to have multikey fields with too many keys. So how much is too much?

I ask because I need to decide whether to use multikeys or to send comments directly to each user. In the latter case I have to copy the feed for each subscriber, and the collection will grow VERY fast, which I think isn't good either: 1000 users, each followed by 10 users, each making 10 actions a day = 1 000 000 records every 10 days!
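The growth estimate for the copy-per-subscriber approach can be checked with a quick sketch. The function and parameter names are made up for illustration; the numbers are the ones from the question.

```python
def fanout_records(users, followers_per_user, actions_per_user_per_day, days):
    """Records written when every action is copied to every follower."""
    return users * followers_per_user * actions_per_user_per_day * days

# Figures from the question: 1000 users, 10 followers each, 10 actions a day.
per_day = fanout_records(1000, 10, 10, 1)
per_ten_days = fanout_records(1000, 10, 10, 10)

print(per_day, per_ten_days)
```

So the estimate holds: about 100 000 copied records a day, or 1 000 000 every 10 days, and the total scales linearly with each of the four factors.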


Solution

Arrays with large numbers of values are not in and of themselves problematic in MongoDB, even when they are covered by a multikey index. As you would expect, you may run into problems with really large documents, particularly if MongoDB must scan the entire document to fulfill a query.
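To see why array length alone is not the issue, here is a deliberately simplified model of a multikey index: one index entry per array element, each pointing back at its document. This is only an illustration in plain Python (MongoDB's real index is a B-tree, and `build_multikey_index` is a made-up helper), but it shows that a query like {subscribers: "A"} becomes a lookup on the key "A" rather than a scan of every array.

```python
from collections import defaultdict

def build_multikey_index(docs, field):
    """Map each array element to the set of _id values whose array contains it."""
    index = defaultdict(set)
    for doc in docs:
        for value in doc.get(field, []):
            index[value].add(doc["_id"])
    return index

feeds = [
    {"_id": 1, "text": "Text", "subscribers": ["A", "B"]},
    # A "hot" feed: 150 other subscribers plus user A.
    {"_id": 2, "text": "Hot feed", "subscribers": [f"user{i}" for i in range(150)] + ["A"]},
    {"_id": 3, "text": "Other", "subscribers": ["B"]},
]

index = build_multikey_index(feeds, "subscribers")

# Equivalent in spirit to querying feeds with {subscribers: "A"}.
feeds_for_a = sorted(index["A"])
total_entries = sum(len(ids) for ids in index.values())
```

The cost that does grow with array length is the number of index entries (one per element per document), which affects index size and write cost rather than the lookup itself.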

There is one caveat: the index will not store keys longer than 1024 bytes (for a multikey index, each item in the array is a key). As long as the items in the array are shorter than this limit, you should be fine.
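If you want to guard against that limit on the application side, a minimal sketch might look like the following. It assumes the array items are strings and uses their UTF-8 byte length as an approximation; the real index entry also carries some encoding overhead, so the check flags values at or above the limit to stay on the safe side. `oversized_keys` is a hypothetical helper, not a MongoDB API.

```python
MAX_INDEX_KEY_BYTES = 1024  # figure quoted in the answer; check your server version

def oversized_keys(values, limit=MAX_INDEX_KEY_BYTES):
    """Return the string values whose UTF-8 encoding reaches the limit."""
    return [v for v in values if len(v.encode("utf-8")) >= limit]

subscribers = ["A", "B", "x" * 2000]
too_long = oversized_keys(subscribers)
```

Short identifiers such as user names or ObjectIds are far below this size, so for a subscribers array the limit is unlikely to matter in practice.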

Having said this, you do want to avoid data models where an array or other part of the document grows without bound. MongoDB adds a little padding on disk for every document, but if the document grows substantially after creation, the database must move it to another location on disk. However you decide to model your data, ensure that your documents do not tend to grow much after creation.
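As a rough illustration of which part of the feed document grows, the sketch below simulates comments arriving on a hot feed. It assumes each commenter is added to subscribers only once (as in the sample document) and uses the JSON length as a crude stand-in for the real BSON size; `add_comment` and `approx_size` are made-up helpers.

```python
import json

def add_comment(feed, user, text):
    """Append a comment and subscribe the commenter once."""
    feed["comments"].append({"by": user, "text": text})
    if user not in feed["subscribers"]:
        feed["subscribers"].append(user)

def approx_size(doc):
    """Rough size proxy: length of the JSON encoding, not the real BSON size."""
    return len(json.dumps(doc))

feed = {"_id": 1, "text": "Text", "comments": [], "subscribers": []}
size_at_creation = approx_size(feed)

# 300 comments written by 100 distinct users.
for i in range(300):
    add_comment(feed, f"user{i % 100}", "asd")

size_after = approx_size(feed)
print(len(feed["comments"]), len(feed["subscribers"]), size_at_creation, size_after)
```

Under these assumptions the subscribers array is capped by the number of distinct commenters, while the embedded comments array grows with every comment, so the comments are the more likely source of document growth.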


Licensed under: CC-BY-SA with attribution