Question

We all know Microscope from Discover Meteor. The app is fine, but it works only with the number of upvotes. On the Best page it sorts posts by upvotes in descending order; the upvote count is stored on each post and is updated each time a user upvotes it.

Imagine now that we want to implement something like what Hacker News has: not only a click-based rating, but also a time-based one. From here on I will use the word 'click' to describe the user action of clicking on a post in the post list. Each 'click' increases the total number of clicks of that post by 1.

For those who do not know how the Hacker News algorithm works, I will briefly explain. In general, the total number of clicks on a link (post) is divided by:

(T+2)^g

where T is the total number of hours that have passed since the post was published, and g is a "sensitivity" parameter, let's call it that, which is just a number such as 1.6 or 1.8; the exact value doesn't matter. This decreases the influence of clicks as time goes by. You can read more [here](http://amix.dk/blog/post/19574), for example.
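To make the decay concrete, here is a minimal sketch of the formula as a plain JavaScript function. The function name `score` and its parameter names are only illustrative; g = 1.8 is one of the example values mentioned above.

```javascript
// Illustrative helper (name and parameters are assumptions):
// score = clicks / (T + 2)^g, with T in hours since publishing.
function score(clicks, hoursSincePublished, g) {
  return clicks / Math.pow(hoursSincePublished + 2, g);
}

// A one-hour-old post with 10 clicks versus a two-day-old post with 100 clicks.
var fresh = score(10, 1, 1.8);
var old = score(100, 48, 1.8);
console.log(fresh > old); // the fresh post ranks higher despite fewer clicks
```

Note that the exponent is written with `Math.pow`, since `^` in JavaScript is bitwise XOR rather than exponentiation.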

Now, we want the top 50 posts rated by clicks and time, so we need to query Mongo for all posts, sorted by the score calculated with the formula above.

I can see two major approaches, and I find both of them quite bad.

The first one (the way I do it now) is to subscribe to all posts and, in a template helper, prepare the data for rendering with

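rankedPosts: function() {
  var g = 1.8; // "sensitivity" exponent
  var rawPosts = posts.find().map(function(item) {
    var T = (Date.now() - item.createdAt) / 3600000; // hours since publishing; assumes a createdAt field
    item.score = item.clicks / Math.pow(T + 2, g); // assumes a clicks field; ^ would be bitwise XOR in JavaScript
    return item; // map must return the scored post
  });
  rawPosts = _.sortBy(rawPosts, function(item) { return -item.score; }); // sort by calculated score, descending
  return _.first(rawPosts, 50); // keep only the first 50
}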

and then use rankedPosts for rendering. The bottleneck here is that I have to run through all the posts every time.

The second one is to somehow (I do not know how, or whether it is even possible) subscribe to an already scored/sorted/filtered collection, assuming Meteor/MongoDB can apply their magic to score/sort/filter for me (and recalculate the score every new hour or on every new click).

Now, the obvious question: which would you recommend?

Thanks in advance.


Solution

Think about the numbers. On a working site you can have thousands of posts, or millions if the site is successful. Fetching all of them just to find the top 50 doesn't make sense.

I'd recommend storing the final calculated rating in a field. Then, in the subscription, you sort by that field and apply the desired limit. When a post gains a new click, you simply recalculate its value and save it to the database. Finally, in a cron job or a Meteor interval, you update the rating of all items in the database.
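A rough sketch of how these pieces might fit together, assuming the classic synchronous Meteor collection API and the `posts` collection from the question. The field names `clicks`, `createdAt` and `score`, the names `topPosts` and `clickPost`, the value g = 1.8 and the hourly interval are all assumptions made for illustration, not part of the original app.

```javascript
// Sketch only: field and method names below are assumptions.
var GRAVITY = 1.8;
var HOUR_MS = 60 * 60 * 1000;

// Pure helper: clicks / (T + 2)^g, with T in hours since publishing.
function computeScore(post, now) {
  var hours = (now - post.createdAt) / HOUR_MS;
  return post.clicks / Math.pow(hours + 2, GRAVITY);
}

// Server-side wiring; skipped entirely when Meteor is not present.
// `posts` is assumed to be the app's existing collection.
if (typeof Meteor !== 'undefined' && Meteor.isServer) {
  // Publish only the 50 best posts, sorted by the stored score.
  Meteor.publish('topPosts', function () {
    return posts.find({}, { sort: { score: -1 }, limit: 50 });
  });

  Meteor.methods({
    // Count the click, then store the fresh score for that one post.
    clickPost: function (postId) {
      posts.update(postId, { $inc: { clicks: 1 } });
      var post = posts.findOne(postId);
      posts.update(postId, { $set: { score: computeScore(post, Date.now()) } });
    }
  });

  // Re-score every post periodically so that older posts decay over time.
  Meteor.setInterval(function () {
    var now = Date.now();
    posts.find().forEach(function (post) {
      posts.update(post._id, { $set: { score: computeScore(post, now) } });
    });
  }, HOUR_MS);
}
```

With this layout the client only ever receives 50 documents, and the full pass over the collection happens once per interval on the server instead of on every render.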

Licensed under: CC-BY-SA with attribution