Вопрос

I'm working on a very high throughput site with many items, am looking into implementing "trending now" type functionality, that would allow users to quickly get a prioritized list of the top N items that have been viewed recently by many people, that gradually fade away as they get fewer views.

One idea about how to do this is to give more weight to recent views of an item, something like a weight of 16 for every view of an item the past 15 minutes, a weight of 8 for every view of an item in the past 1 hour, a weight of 4 for things in the past 4 hours, etc but I do not know if this is the right way to approach it.

I'd like to do this in Redis, we've had good success with Redis in the past for other projects.

What is the best way to do this, both technologically and the determination of what is trending?

The first answer hints at a solution but I'm looking for more detail -- starting a bounty.

These are both decent ideas, but not quite detailed enough. One got half the bounty but leaving the question open.

Это было полезно?

Решение

So, I would start with a basic time ordering (zset of item_id scored by timestamp, for example), and then float things up based on interactions. So you might decided that a single interaction is worth 10 minutes of 'freshness', so each interaction adds that much time to the score of the relevant item. If all interactions are valued equally, you can do this with one zset and just increment the scores as interactions occur.

If you want to have some kind of back-off, say, scoring by the square root of the interaction count instead of the interaction count directly, you could build a second zset with your score for interactions, and use zunionstore to combine this with your timestamp index. For this, you'll probably want to pull out the existing score, do some math on it and put a new score over it (zadd will let you overwrite a score)

The zunionstore is potentially expensive, and for sufficiently large sets even the zadd/zincrby gets expensive. To this end, you might want to keep only the N highest scoring items, for N=10,000 say, depending on your application needs.

Другие советы

The Reddit Ranking algorithm does a pretty good job of what you describe. A good write up here that talks through how it works.

https://medium.com/hacking-and-gonzo/how-reddit-ranking-algorithms-work-ef111e33d0d9

consider an ordered set with the number of views as the scores. whenever an item is accessed, increment its score (http://redis.io/commands/zincrby). this way you can get top items out of the set ordered by scores.

you will need to "fade" the items too, maybe with an external process that would decrement the scores.

Лицензировано под: CC-BY-SA с атрибуция
Не связан с StackOverflow
scroll top