Question

I would like to get some experience with Hadoop and PageRank. I have completed a simple implementation of the PageRank algorithm using Hadoop. Now I plan to analyse the impact of changing a few algorithm parameters and study how that affects the resulting ranks. For now I am analyzing how dangling nodes affect the ranks. Any suggestions for other variations of PageRank I could try would greatly help me gain some deeper knowledge.

Thanks


Solution

A few variations that I know of:

  • Weighted PageRank algorithm: assigns larger rank values to more important (popular) pages instead of dividing the rank value of a page evenly among its outlink pages.
  • Topic-sensitive PageRank.

    In the original PageRank algorithm for improving the ranking of search-query results, a single PageRank vector is computed, using the link structure of the Web, to capture the relative "importance" of Web pages, independent of any particular search query. To yield more accurate search results, we propose computing a set of PageRank vectors, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic.

  • TrustRank: Z. Gyöngyi, H. Garcia-Molina, and J. Pedersen, “Combating Web Spam with TrustRank.”
  • You can also try HITS (Authoritative Sources in a Hyperlinked Environment).
  • Going further, you can try applying the PageRank idea to other domains, as in TupleRank: Ranking Relational Databases using Random Walks on Extended K-partite Graphs.
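
To experiment with some of these ideas before porting them to Hadoop, a small single-machine sketch may help. The code below is not a MapReduce job and is not taken from any of the papers above; it is a plain power iteration in which the function name `pagerank`, the `teleport` parameter, and the toy graph are all illustrative assumptions. Biasing `teleport` toward a chosen set of pages is the basic mechanism shared by topic-sensitive PageRank and TrustRank, and the rank held by dangling nodes is redistributed according to the same vector, which is one common policy among several.

```python
def pagerank(links, damping=0.85, teleport=None, iterations=100, tol=1e-10):
    """Power-iteration PageRank on a small in-memory graph (illustrative sketch).

    links:    dict mapping each node to a list of the nodes it links to.
    teleport: optional dict of node -> non-negative weight. The random jump,
              and the rank held by dangling nodes, is distributed according
              to it. None means uniform, i.e. classic PageRank.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    if teleport is None:
        teleport = {n: 1.0 for n in nodes}
    total = sum(teleport.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("teleport must give positive weight to at least one node")
    jump = {n: teleport.get(n, 0.0) / total for n in nodes}

    rank = dict(jump)  # start from the teleport distribution (sums to 1)
    for _ in range(iterations):
        # Total rank currently held by nodes with no outlinks (dangling nodes).
        dangling = sum(rank[n] for n in nodes if not links.get(n))
        # Random-jump mass plus redistributed dangling mass.
        new_rank = {n: (1.0 - damping + damping * dangling) * jump[n] for n in nodes}
        # Each node splits its damped rank evenly among its outlinks.
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy graph: "e" is a dangling node, "d" has no inlinks.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"], "e": []}

uniform = pagerank(links)                        # classic PageRank
biased = pagerank(links, teleport={"d": 1.0})    # all random jumps land on "d"
```

Comparing `uniform` and `biased`, or rerunning with different `damping` values, gives a quick way to see how each parameter shifts the ranks. Replacing the even split in the inner loop with per-link weights would be a natural starting point for a weighted variant.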

Licensed under: CC-BY-SA with attribution