Question

I have been researching how to build a news feed that orders its items by relevance.

I have been reading about Facebook's EdgeRank and Etsy's Activity Feed Architecture, but I can't figure out how these companies calculate affinity.

The database stores a percentage value, for example 0.75.

Suppose I have a set of activity items like this:

User X commented on User Y
User X liked User Y

How would you get an affinity percentage out of this?

Solution

As I understand your question, you are asking why (and how) we calculate a percentage rather than an absolute affinity score.

Choosing the priority of social stories in EdgeRank is, in theory, the same problem as ranking pages for a search result in PageRank, which in turn is the same as choosing one ad out of many for a particular search query.

Take the case of PageRank. According to Wikipedia:

[Wikipedia diagram for PageRank] "Mathematical PageRanks for a simple network, expressed as percentages. (Google uses a logarithmic scale.) Page C has a higher PageRank than Page E ..."

Also see this -> http://www2007.org/posters/poster893.pdf

Similarly, I suspect the real reason for using percentages is to normalize EdgeRank scores so that all news feed stories are brought to the same scale.

Taking your example, let's say user A has liked two pages, P1 and P2, and there are now updates from both. Suppose Facebook's affinity formula for pages is (hypothetically) this:

popularity of page * frequency of user interaction

Or

number of page likes / time since last interaction from user A.

Say for Page P1 this is 100000 / 2 = 50000 and for Page P2 it is 2000 / 10 = 200.

So Page P1's news story wins because it is the more popular page, and its story will be shown before P2's.

But this also has to compete with stories from users, where the formula can be totally different, say:

Number of Mutual Friends * Number of Posts shared on their Wall

For a post from another user B this could be 1000 * 10 = 10000. When this value competes with P1's score, it loses in absolute terms. But having 1000 mutual friends is a really big thing! Arguably, B's story should appear before P1's.

A solution to this would be to normalize all affinity scores to the range 0 to 1 so that the competition is fair and relative.
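
To make the normalization idea concrete, here is a minimal Python sketch. It is only an illustration, not Facebook's actual method: the per-category ceilings in CATEGORY_CEILING are made-up assumptions, and the raw scores are the ones from the example above. Each raw score is divided by the largest value its own formula is expected to produce, so page scores and user scores end up on the same 0 to 1 scale.

```python
# Hypothetical ceilings: the largest raw score each formula is expected to
# produce. These numbers are assumptions chosen purely for illustration.
CATEGORY_CEILING = {"page": 1_000_000.0, "user": 10_000.0}


def normalize(raw_score, category):
    """Scale a raw affinity score into [0, 1] relative to its category."""
    ceiling = CATEGORY_CEILING[category]
    return max(0.0, min(raw_score / ceiling, 1.0))


# (story source, category, raw affinity score from the example above)
stories = [
    ("P1", "page", 50000.0),
    ("P2", "page", 200.0),
    ("B", "user", 10000.0),
]

# Rank by normalized score instead of by raw score.
ranked = sorted(stories, key=lambda s: normalize(s[2], s[1]), reverse=True)
ranked_names = [name for name, _, _ in ranked]
print(ranked_names)
```

Under these assumed ceilings, B's story ranks ahead of P1's even though its raw score is smaller, which is the behaviour the normalization is meant to produce.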

The second part of the EdgeRank formula is the edge weight, which is similar to the concept above. But edge weight focuses specifically on the type of interaction: like, comment, share, etc.

All of these "competition algorithms" (EdgeRank, PageRank, etc.) can be thought of as ranking or recommendation systems. They need to normalize scores to a common scale in order to show results in a relevant order!

I'll keep adding to this answer if I find something more!
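
As a rough sketch of how interaction types could feed into a 0 to 1 affinity value for the activity items in the question, the Python below sums a weight per interaction and divides by the actor's total. The weights in EDGE_WEIGHT, the affinity function, and the extra "X liked Z" activity are all hypothetical; the real weights and formula are not public.

```python
from collections import defaultdict

# Hypothetical weights per interaction type (assumed for illustration).
EDGE_WEIGHT = {"like": 1.0, "comment": 2.0, "share": 3.0}


def affinity(activities, actor):
    """Return {target: share of actor's weighted interactions}, each in [0, 1]."""
    totals = defaultdict(float)
    for who, kind, target in activities:
        if who == actor:
            totals[target] += EDGE_WEIGHT[kind]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # Dividing by the actor's total normalizes every score into [0, 1].
    return {target: score / grand_total for target, score in totals.items()}


activities = [
    ("X", "comment", "Y"),  # User X commented on User Y
    ("X", "like", "Y"),     # User X liked User Y
    ("X", "like", "Z"),     # extra activity, added only for contrast
]
scores = affinity(activities, "X")
print(scores)
```

With these assumed weights, X's affinity toward Y is 3 / 4 = 0.75 and toward Z is 1 / 4 = 0.25, so the stored value is simply each target's share of X's weighted interactions.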

Licensed under: CC-BY-SA with attribution