Question

I am trying to store word frequency data using Mongo. Each word needs to be associated with a user so I can calculate how often an individual uses each word. Currently my words collection looks like this:

{'Hello':3, 'user_id':1}

This obviously only works on a 'One To One' basis and is no good.

I am trying to work out how best to make this a 'One To Many' relationship between the user and the words. Would I store the user relationship in my words collection like so:

{'word':"Hello", 'users':[{'id':1, 'count':4},{'id':2, 'count':10}]}

Or would I attach the word counts to the user collection instead?

{'id':1, 'username':'SomeUser', 'words':[{'Hello':4}]}

The obvious disadvantage to the second approach is that the same words will be used across different users, so having a single words collection would help keep the data size down.

Can anyone advise me as to what I should do here? Is there a method I have perhaps overlooked in the documentation?


Solution

"The obvious disadvantage to the second approach is that the same words will be used across different users, so having a single words collection would help keep the data size down."

Nope, that's the nature of using a document database. Data size is rarely the main concern in NoSQL solutions; what matters is how easily and how quickly you can access your data.

Your first approach is a typical textbook relational model. There is no advantage to using it in Mongo (though you can model data relationally in Mongo). Instead, the second approach gives you:

  • Faster reads and writes, since every word count is stored inside the user document. You don't need to perform multiple queries to get one user's word frequencies.
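
As a rough sketch of the second approach, the snippet below builds the filter and update documents that would add one piece of text to a user's counts in a single write. It is a variation on the layout in the question, not the exact one: it assumes 'words' is an embedded document such as {'hello': 2} rather than an array of one-key objects, so that MongoDB's $inc operator can reach each word with dot notation. The helper names (count_words, build_increment_update), the 'users' collection name, and the naive whitespace tokenizer are all assumptions made for illustration.

```python
from collections import Counter


def count_words(text):
    """Count lowercase, whitespace-separated words (naive tokenizer)."""
    return Counter(text.lower().split())


def build_increment_update(user_id, text):
    """Return a (filter, update) pair that adds the word counts found in
    `text` to one user's document. Hypothetical helper, not a driver API."""
    counts = count_words(text)
    filter_doc = {"id": user_id}
    # "$inc" adds to an existing field and creates it if it is missing.
    # "words.<word>" is dot notation for a key inside the embedded
    # "words" document.
    update_doc = {"$inc": {f"words.{word}": n for word, n in counts.items()}}
    return filter_doc, update_doc


filter_doc, update_doc = build_increment_update(1, "hello world hello")

# With pymongo, the pair could then be sent in one round trip, for example:
#   db.users.update_one(filter_doc, update_doc, upsert=True)
# where `db.users` is an assumed collection name.
```

Reading a user's frequencies is then a single lookup of that user's document. One caveat with this layout: words containing a '.' or starting with '$' would clash with MongoDB's field-path syntax, so they would need to be escaped or filtered out before being used as keys.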
Licensed under: CC-BY-SA with attribution