Question

I'm currently implementing Trending Topics in my Rails application.

What I currently have is this:

Each post has a topic attribute of 2 to 3 words describing its topic.

Then I get the top posts by their view count (likes and favorites are also available, but for the time being I'm using only views):

def trending_topics
  Post.order("COALESCE(impressions_count, 0) DESC").limit(200)
end

Then I simply pick the unique topics and display a number of them:

  <% trending_topics.select(:topic).map(&:topic).uniq.take(10).each do |topic| %>
      <li><%= topic %></li>
  <% end %>

My questions are:

  1. Is there a way to get the most frequently appearing :topic values, rank them, and pick the cream of the crop?
  2. Is this a sustainable way to keep track of popular topics? If not, is there a way to make it more efficient?
  3. Is there a better way to implement a function that finds the most popular and frequent :topic attributes in posts?

Solution

To answer your questions:

(1) Yes, you can get a hash with the frequency of each :topic like so:

array = trending_topics.select(:topic).map(&:topic)
freq = array.inject(Hash.new(0)) { |h,v| h[v] += 1; h }
# => {'topic1'=>3, 'topic2'=>3, 'topic3'=>1, ...}
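To go from that frequency hash to a ranked list, you can sort by count and take the first few. A minimal sketch in plain Ruby, assuming Ruby 2.7+ for `tally`; the helper name `top_topics` and the sample data are made up for illustration:

```ruby
# Hypothetical helper: rank topics by how often they appear.
def top_topics(topics, limit = 10)
  topics
    .compact                                         # skip posts without a topic
    .tally                                           # {"topic" => count}
    .sort_by { |topic, count| [-count, topic] }      # most frequent first, ties by name
    .first(limit)
end

sample = ["rails", "ruby", "rails", "css", "ruby", "rails"]
top_topics(sample, 2)
# => [["rails", 3], ["ruby", 2]]
```

In the view you would pass in the same array as above, e.g. `top_topics(trending_topics.select(:topic).map(&:topic))`.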

(2) This is "sustainable" in the sense that the cost of counting topics does not grow with the number of posts or topics, because you always sample from the top 200 posts. Fetching the "top 200" itself will take slightly more compute time as the number of posts grows.

(3) I don't think impressions_count is a very good way to keep track of what is trending: it holds the total number of impressions over the post's lifetime, whereas "trending" implies some temporal aspect (e.g. impressions_this_week).

So one way to do it would be to introduce an impressions_this_week column that is updated at regular intervals. Then you can choose based on that.
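How you compute that weekly number depends on how views are recorded. As a rough sketch in plain Ruby, assuming each view is stored with a timestamp (the `Impression` struct and the method below are hypothetical, not part of your schema):

```ruby
# Hypothetical record of a single view; an assumption, not the original schema.
Impression = Struct.new(:post_id, :viewed_at)

SECONDS_PER_WEEK = 7 * 24 * 60 * 60

# Count impressions per post within the week ending at `now`.
def impressions_this_week(impressions, now: Time.now)
  cutoff = now - SECONDS_PER_WEEK
  impressions
    .select { |imp| imp.viewed_at > cutoff && imp.viewed_at <= now }
    .group_by(&:post_id)
    .transform_values(&:size)
end

now = Time.utc(2024, 1, 15)
views = [
  Impression.new(1, Time.utc(2024, 1, 14)),
  Impression.new(1, Time.utc(2024, 1, 10)),
  Impression.new(2, Time.utc(2024, 1, 1)) # older than a week, ignored
]
impressions_this_week(views, now: now)
# => {1 => 2}
```

A scheduled job could run something like this at regular intervals and write the result into the impressions_this_week column.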

Another way would be to write a method that uses the overall impressions_count along with the created_at or updated_at timestamp to calculate how "hot" the post is. You could do this with a simple decay function and then tweak the constants of that function until you get the decay that you want. There is something similar to this concept shown here: http://blog.notdot.net/2009/12/Most-popular-metrics-in-App-Engine. Once you have that method written, you can just sort based on its output.
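As one possible decay function (not necessarily the one from the linked article), here is a sketch using an exponential half-life. The method name `hotness` and the constant `HALF_LIFE_HOURS` are assumptions to tweak, not anything from your code:

```ruby
# Hypothetical tuning constant: a post's score halves every 24 hours.
HALF_LIFE_HOURS = 24.0

# Score a post by its total impressions, discounted by its age.
def hotness(impressions_count, created_at, now: Time.now)
  age_hours = (now - created_at) / 3600.0
  # nil.to_i is 0, mirroring the COALESCE in the original query
  impressions_count.to_i * (0.5**(age_hours / HALF_LIFE_HOURS))
end

now = Time.utc(2024, 1, 15, 12)
hotness(1000, now - 24 * 3600, now: now) # => 500.0
hotness(1000, now - 48 * 3600, now: now) # => 250.0
```

A shorter half-life favors fresh posts more aggressively; a longer one behaves closer to the plain impressions_count ordering.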

Other tips

If you need something a bit more sophisticated than your current ranking algorithm, you should probably look at how sites like Reddit and Hacker News handle this issue. Their algorithms are fairly complicated, but you should be able to find a Ruby implementation of each if you search for it.

Licensed under: CC-BY-SA with attribution