Question

I have the following code in my Rails app.

module UserItem
  class Rating
    include MongoMapper::Document
    key :user_id, Integer, :required => true
    key :item_id, Integer, :required => true
    key :rating, Float, :required => true
  end
end

I have about 10K users and 10K items, and I need to store each user's rating for each item, which is about 10^8 records. I have computed the values of all 10^8 records into an array as follows:

ratings = [
  {user_id: 1, item_id: 1, rating: 1.5},
  {user_id: 1, item_id: 2, rating: 3.5},
  # ... and so on, 10^8 records
]

Now I need to insert all these 10^8 computed records into MongoDB. I tried

UserItem::Rating.collection.insert(ratings)

and

UserItem::Rating.create(ratings)

But it takes hours to insert the 10^8 records into MongoDB. Is there a better, more efficient way to insert the records?

Context: I am using it more like a cache store that holds all the rating values. When I display a list of items, I will just read from this cache and display the rating provided by the user alongside each item.

Any help is much appreciated!


Solution

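One approach is to store one document per user, with a ratings field that is a hash mapping item ids to ratings. With about 10K users, that means roughly 10^4 documents (each holding up to 10^4 ratings) instead of 10^8 separate ones. For example: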

class UserRating
  include MongoMapper::Document
  key :ratings
  key :user_id
end

UserRating.create(:user_id => 1, :ratings => {"1" => 4, "2" => 3})

You have to use string keys for the hash, because document keys in MongoDB are strings. This approach doesn't make it easy to retrieve all the ratings for a given item; if you do that a lot, it might be easier to store a document per item instead. It's also probably not very efficient if you only ever need a small proportion of a user's ratings at a time.
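To feed the per-user layout from the flat array in the question, the records first have to be grouped by user. The following is only a minimal sketch in plain Ruby, assuming the array has the shape shown in the question; `group_ratings_by_user` is a hypothetical helper name, not part of MongoMapper.

```ruby
# Hypothetical helper: reshape the flat ratings array into one hash per user.
# Each resulting hash has the shape expected by UserRating.create above.
def group_ratings_by_user(ratings)
  ratings.group_by { |r| r[:user_id] }.map do |user_id, rows|
    {
      user_id: user_id,
      # Item ids are converted with to_s because the hash needs string keys
      ratings: rows.each_with_object({}) { |r, h| h[r[:item_id].to_s] = r[:rating] }
    }
  end
end

# Small sample in the same shape as the question's array
ratings = [
  {user_id: 1, item_id: 1, rating: 1.5},
  {user_id: 1, item_id: 2, rating: 3.5},
  {user_id: 2, item_id: 1, rating: 4.0}
]

user_docs = group_ratings_by_user(ratings)

# Looking up one user's rating for item 2 from the grouped structure
rating_for_item_2 = user_docs.first[:ratings]["2"]
```

Each element of `user_docs` could then be passed to `UserRating.create`, or the whole array handed to a bulk insert. Reading a rating back is a single hash lookup with the item id as a string.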

Obviously you can combine this with other approaches to increasing write throughput, such as batching your inserts or sharding your database.
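Batching could look roughly like the sketch below. It assumes the collection object responds to `insert` with an array of hashes, as `UserItem::Rating.collection` does in the question. `insert_in_batches`, `FakeCollection` and the batch size of 1,000 are illustrative assumptions, not tuned recommendations.

```ruby
# Hypothetical helper: send documents to the collection in fixed-size batches
# instead of one huge insert call. Returns the number of documents sent.
def insert_in_batches(collection, docs, batch_size = 1_000)
  inserted = 0
  docs.each_slice(batch_size) do |batch|
    collection.insert(batch) # assumed to accept an array of hashes
    inserted += batch.size
  end
  inserted
end

# Stand-in for a real collection so the sketch runs without a database;
# it only records the size of each batch it receives.
class FakeCollection
  attr_reader :calls

  def initialize
    @calls = []
  end

  def insert(batch)
    @calls << batch.size
  end
end

fake = FakeCollection.new
docs = (1..2_500).map { |i| {user_id: i, ratings: {}} }
total = insert_in_batches(fake, docs, 1_000)
```

In the real app the first argument would be the model's collection rather than the stand-in. Because `each_slice` works on any enumerable, the documents could also be generated lazily instead of building the whole array in memory first.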

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow