Question

I'll show you 2 scenarios (N.B. d=damping factor=0.5)

First scenario : suppose to have 4 nodes A, B, C, D :

  • B, C, D link on A.

PageRank is : PR(A)=0.5 + 0.5*(PR(B)+PR(C)+PR(D))

I can resolve this equation by putting 0.25 on PR(B)=PR(C)=PR(D) and i'll get 0.875as value. I don't need to resolve any system

Second scenario : suppose to have 4 nodes A, B, C, D :

  • A link on B and C
  • B link on C
  • C link on A

In this way PageRank will be :

PR(A)=0.5 + 0.5 * PR(C)

PR(B)=0.5 + 0.5 * ((PR(A))/(2))

PR(C)=0.5 + 0.5 * ((PR(A))/(2) + PR(B))

I must to resolve this system to get the result. I don't put the 1/N on PR(A), PR(B), PR(C) and PR(D)

In fact, i search on internet the solution and the values are :

$PR(A) = 14/13 = 1.07692308$

$PR(B) = 10/13 = 0.76923077$

$PR(C) = 15/13 = 1.15384615$

So why with two similar scenarios i use 2 different behaviour?

Hope someone can help me :) Cheers

Was it helpful?

Solution

The two scenarios are different because of the symmetry in the first problem: B, C and D link to and are linked from the same pages (i.e. they all point to A and nothing points to them). Therefore their page rank will be the same, this gives you the extra constraint that PR(B)=PR(C)=PR(D), enabling you to solve the problem easily.

The second problem has no symmetry and has to be solved long hand.

OTHER TIPS

Assume a small universe of four web pages: A, B, C and D. Links from a page to itself, or multiple outbound links from one single page to another single page, are ignored. PageRank is initialized to the same value for all pages. In the original form of PageRank, the sum of PageRank over all pages was the total number of pages on the web at that time, so each page in this example would have an initial PageRank of 1. However, later versions of PageRank, and the remainder of this section, assume a probability distribution between 0 and 1. Hence the initial value for each page is 0.25.

The PageRank transferred from a given page to the targets of its outbound links upon the next iteration is divided equally among all outbound links.

If the only links in the system were from pages B, C, and D to A, each link would transfer 0.25 PageRank to A upon the next iteration, for a total of 0.75.

PR(A)= PR(B) + PR(C) + PR(D)

Suppose instead that page B had a link to pages C and A, page C had a link to page A, and page D had links to all three pages. Thus, upon the next iteration, page B would transfer half of its existing value, or 0.125, to page A and the other half, or 0.125, to page C. Page C would transfer all of its existing value, 0.25, to the only page it links to, A. Since D had three outbound links, it would transfer one third of its existing value, or approximately 0.083, to A. At the completion of this iteration, page A will have a PageRank of 0.458.

PR(A)= \frac{PR(B)}{2}+ \frac{PR(C)}{1}+ \frac{PR(D)}{3}.\,

In other words, the PageRank conferred by an outbound link is equal to the document's own PageRank score divided by the number of outbound links L( ).

PR(A)= \frac{PR(B)}{L(B)}+ \frac{PR(C)}{L(C)}+ \frac{PR(D)}{L(D)}.

In the general case, the PageRank value for any page u can be expressed as:

PR(u) = \sum_{v \in B_u} \frac{PR(v)}{L(v)},

i.e. the PageRank value for a page u is dependent on the PageRank values for each page v contained in the set Bu (the set containing all pages linking to page u), divided by the number L(v) of links from page v.

For further queries visit here

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top