Question

I have two tables in my MySQL database.

table1 has all the web pages in my network:

         | table1: (pages)|
         |----------------|
         | id   | url     |
         |----------------|

table2 has two fields: the source page of a link and the destination page of that link.

          |----------------------------|
          | table2: (links)            |
          |----------------------------|
          | from_page_id | to_page_id  |
          |----------------------------|

How can I calculate the PageRank for my network?

I found an article that explains the PageRank algorithm, but I find it very difficult to write its formula in PHP, and I am not good at math.

Thanks

Update:

I have almost 5000 pages in my network.


Solution

Hi again,

I think I have figured out how to do it, but I am not sure.

I will tell you, and you can judge whether my way of calculating the PageRank is correct.

First, I added a new column to the "pages" table and called it "outgoinglinks". It holds the number of outgoing links from that page.
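As a minimal sketch of how the "outgoinglinks" value can be derived, the snippet below counts outgoing links per page from a list of (from_page_id, to_page_id) pairs. The function name `count_outgoing_links` and the sample data are hypothetical; in the database the same numbers could come from a GROUP BY on from_page_id in the "links" table.

```php
<?php
// Hypothetical helper (not part of the original post): count how many
// links leave each page. $links mirrors rows of the "links" table as
// [from_page_id, to_page_id] pairs.
function count_outgoing_links(array $links): array
{
    $outgoing = [];
    foreach ($links as [$from, $to]) {
        // Pages with no outgoing links never appear as a key here.
        $outgoing[$from] = ($outgoing[$from] ?? 0) + 1;
    }
    return $outgoing;
}

// Made-up sample network: page 1 links to 2 and 3, 2 links to 3, 3 links to 1.
$links = [[1, 2], [1, 3], [2, 3], [3, 1]];
$outgoing = count_outgoing_links($links);
```

The resulting array is keyed by page id, so each value could be written back to the "outgoinglinks" column of the matching row.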

I also added two more columns, "pagerank" and "pagerank2",

and another column called "i", which counts the number of iterations.

Now let's move on to the programming:

     // Each pass reads the ranks from one column and writes the new ranks to the other.
     $step = "pg";
     $totalpages = 5000;
     for ($i = 0; $i < 50; $i++) {
         // Flip the direction: "pg2" reads pagerank2 and writes pagerank, "pg" does the opposite.
         if ($step == "pg2") {
             $step = "pg";
         } else {
             $step = "pg2";
         }
         $sql1 = "select id from pages";
         $result1 = $DB->query($sql1);
         while ($row1 = $DB->fetch_array($result1)) {
             $page_id = $row1["id"];
             // All links that point at the current page.
             $sql = "select * from links where to_page_id = '$page_id'";
             $result = $DB->query($sql);
             $weights_of_links = 0; // sum of (pagerank / number of outgoing links) over the linking pages
             while ($row = $DB->fetch_array($result)) {
                 $from_page_id = $row["from_page_id"];
                 $row2 = get_record_select("pages", "id = '$from_page_id'");
                 $outgoinglinks = $row2["outgoinglinks"];
                 if ($step == "pg2") {
                     $from_page_id_pagerank = $row2["pagerank2"];
                 } else {
                     $from_page_id_pagerank = $row2["pagerank"];
                 }

                 $weights_of_links += ($from_page_id_pagerank / $outgoinglinks);
             }

             // Final step: the formula from Wikipedia and the paper I referred to.
             $pagerank = .15 / $totalpages + .85 * ($weights_of_links);
             // Update the rank of the current page ($page_id; the original used an undefined $url_id).
             $ii = $i + 1;
             if ($step == "pg2") {
                 update_record("pages", "id='$page_id'", "pagerank='$pagerank',i='$ii'");
             } else {
                 update_record("pages", "id='$page_id'", "pagerank2='$pagerank',i='$ii'");
             }
         }
     }

Note:

Before you start, make sure to set the rank of one of the pages (any page) to 1 and leave the other pages at 0. The first pass of the loop reads from "pagerank2", so that is the column to set.

Why two pagerank columns?

I did that because I think every iteration should be kept separate to get an accurate calculation. The script alternates between the two columns: each iteration reads the ranks from one column and saves the new results to the other.

The code above loops many times (50 here) to get accurate results; each pass gets closer to the real PageRank of each page.
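The same read-one-copy, write-the-other idea can be sketched in memory, which avoids a query per link for a network of about 5000 pages. This is only a sketch under stated assumptions, not the code above: the function name `pagerank`, the uniform starting value of 1/N, the early exit on a small change, and the even redistribution of rank from pages without outgoing links are all my own choices for illustration.

```php
<?php
// Sketch: PageRank by repeated iteration, kept entirely in PHP arrays.
// $pageIds: list of page ids; $links: list of [from_page_id, to_page_id].
function pagerank(array $pageIds, array $links, float $damping = 0.85,
                  int $maxIterations = 50, float $tolerance = 1e-8): array
{
    $n = count($pageIds);
    if ($n === 0) {
        return [];
    }

    // Out-degree of each page and the list of pages linking to each page.
    $outgoing = array_fill_keys($pageIds, 0);
    $incoming = array_fill_keys($pageIds, []);
    foreach ($links as [$from, $to]) {
        if (!isset($outgoing[$from], $incoming[$to])) {
            continue; // skip links that mention unknown pages
        }
        $outgoing[$from]++;
        $incoming[$to][] = $from;
    }

    // Assumption: start from a uniform distribution (1/N per page).
    $rank = array_fill_keys($pageIds, 1.0 / $n);

    for ($i = 0; $i < $maxIterations; $i++) {
        // Assumption: rank held by pages without outgoing links is
        // spread evenly over all pages so the total stays at 1.
        $dangling = 0.0;
        foreach ($pageIds as $id) {
            if ($outgoing[$id] === 0) {
                $dangling += $rank[$id];
            }
        }

        // $rank plays the role of the column being read,
        // $next the role of the column being written.
        $next = [];
        $change = 0.0;
        foreach ($pageIds as $id) {
            $sum = 0.0;
            foreach ($incoming[$id] as $from) {
                $sum += $rank[$from] / $outgoing[$from];
            }
            $next[$id] = (1 - $damping) / $n + $damping * ($sum + $dangling / $n);
            $change += abs($next[$id] - $rank[$id]);
        }
        $rank = $next;

        if ($change < $tolerance) {
            break; // ranks have stopped moving
        }
    }
    return $rank;
}

// Made-up sample network: 1 -> 2, 1 -> 3, 2 -> 3, 3 -> 1.
$ranks = pagerank([1, 2, 3], [[1, 2], [1, 3], [2, 3], [3, 1]]);
```

Keeping `$rank` and `$next` as separate arrays serves the same purpose as the two columns: every value in a pass is computed from the previous pass only.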

My question is: should the sum of all the PageRanks in my network equal 1? If yes, how does Google give every page a rank out of 10?

Any ideas?
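On the formula used above, the ranks form a probability distribution, so their sum should come out close to 1 as long as every page has at least one outgoing link. A score out of 10 would then be a separate display step applied to those values. As far as I know Google never published how its 0 to 10 figure was derived, so the logarithmic mapping below is purely a hypothetical illustration, and `to_ten_point_scale` is a made-up name.

```php
<?php
// Hypothetical display mapping, NOT Google's actual formula.
// Maps raw ranks onto integers 0..10 on a logarithmic scale between
// the smallest and the largest rank in the network.
// Assumption: every rank is > 0, which holds for the damped formula
// because each page gets at least 0.15 / N.
function to_ten_point_scale(array $ranks): array
{
    if ($ranks === []) {
        return [];
    }
    $logMin = log(min($ranks));
    $logMax = log(max($ranks));

    $scores = [];
    foreach ($ranks as $id => $rank) {
        $scores[$id] = ($logMax > $logMin)
            ? (int) round(10 * (log($rank) - $logMin) / ($logMax - $logMin))
            : 10; // all pages share the same rank
    }
    return $scores;
}

// Made-up ranks, each a factor of 10 apart.
$scores = to_ten_point_scale([1 => 0.5, 2 => 0.05, 3 => 0.005]);
```

With this mapping the raw values still sum to about 1; only the displayed score is rescaled.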

Thanks

OTHER TIPS

Why do you need PageRank specifically if it is your own network? Why not just count the links from unique pages to a particular page and use that number as the page's rating?
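A minimal sketch of that simpler rating, assuming the links are available as (from_page_id, to_page_id) pairs; the function name `count_unique_inbound` and the decision to ignore links from a page to itself are assumptions made for illustration.

```php
<?php
// Hypothetical helper: rate each page by the number of distinct pages
// that link to it. Repeated links from the same source count once.
function count_unique_inbound(array $links): array
{
    $sources = [];
    foreach ($links as [$from, $to]) {
        if ($from === $to) {
            continue; // assumption: a page linking to itself is ignored
        }
        $sources[$to][$from] = true; // array keys deduplicate the sources
    }
    // Pages with no inbound links do not appear in the result.
    return array_map('count', $sources);
}

// Made-up sample: page 2 links to page 3 twice, which counts once.
$links = [[1, 3], [2, 3], [2, 3], [3, 1]];
$inbound = count_unique_inbound($links);
arsort($inbound); // highest rated page first
```

Unlike PageRank, this treats every linking page as equally important, so it needs only a single pass over the links.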

Licensed under: CC-BY-SA with attribution