Doubts about page rank

I am trying to compute the internal PageRank of Wikipedia using MapReduce. I implemented my PageRank algorithm on a small subset of 6349 Wikipedia pages and used this formula to calculate the PageRank (d = 0.85):

[Image not preserved: the first PageRank formula]

I wanted to verify whether the sum of all the PageRank values equals the total number of pages (6349).

What I found so far:

1. The total PageRank of all 6349 pages is 1001.26044.

2. According to Wikipedia, if I use the above formula, each PageRank is multiplied by N and the sum becomes N. I multiplied each PageRank by N (6349) and summed the results, which gave 6356789.5.

Is there a reason why the sum of the PageRank values is not equal to the total number of pages? Should I use the second formula to verify?

[Image not preserved: the second PageRank formula]

Note: I ran my MapReduce code for 10 iterations to get a good approximation.

yesh asked Nov 26 '12 20:11


1 Answer

I suspect you ran too few iterations. Why 10, rather than 100 or 100000? Instead of fixing the count in advance, compute the mean or maximum change in the PageRank values between the last two iterations, and use that to estimate the remaining error.
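The stopping rule described above could be sketched as follows. This is a minimal single-machine illustration, not the asker's MapReduce job: the function name `pagerank`, the tolerance `tol`, and the three-page graph `toy_links` are all made up for the example, and the update uses the (1 - d)/N variant of the formula as an assumption, since the formula images in the question are not available.

```python
def pagerank(links, d=0.85, tol=1e-8, max_iter=1000):
    """Iterate until the largest per-page change drops below tol.

    links maps each page to the list of pages it links to
    (every target is assumed to also be a key of links).
    """
    pages = list(links)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}
    max_change = float("inf")
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Every page starts the round with the "random jump" share.
        new_ranks = {p: (1.0 - d) / n for p in pages}
        for page, targets in links.items():
            if not targets:
                continue  # page without out-links: its rank is dropped in this sketch
            share = d * ranks[page] / len(targets)
            for target in targets:
                new_ranks[target] += share
        # Largest change between the last two iterations.
        max_change = max(abs(new_ranks[p] - ranks[p]) for p in pages)
        ranks = new_ranks
        if max_change < tol:
            break
    return ranks, iteration, max_change


# Hypothetical toy graph in which every page has at least one out-link.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks, iterations, max_change = pagerank(toy_links)
print(iterations, max_change, sum(ranks.values()))
```

Reporting `max_change` alongside the ranks shows how far the last step still moved the values, which is a more informative stopping criterion than a fixed count such as 10.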

Also, PR is a probability, so the values over all pages should sum to 1. The statement "sum of all the pagerank is equal to the total number of pages" is wrong.

As for the other formula, it belongs to a different model and yields a different PR. You can of course use it too, or both, but you cannot use one to check the other.
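To see how the sums behave, here is a small sketch that runs both variants on a toy graph. It assumes the two formulas in the question are the ones from the Wikipedia PageRank article, one with a base term of (1 - d)/N and one with a base term of (1 - d). The helper `iterate` and the graphs `toy_links` and `leaky_links` are hypothetical names for this illustration only.

```python
def iterate(links, base, start, d=0.85, iterations=200):
    """Run a fixed number of PageRank updates with the given base term."""
    ranks = {p: start for p in links}
    for _ in range(iterations):
        new_ranks = {p: base for p in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        ranks = new_ranks
    return ranks


d = 0.85

# Every page has at least one out-link, so no rank is lost.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
n = len(toy_links)
prob = iterate(toy_links, base=(1 - d) / n, start=1.0 / n, d=d)
scaled = iterate(toy_links, base=(1 - d), start=1.0, d=d)
sum_prob = sum(prob.values())      # close to 1
sum_scaled = sum(scaled.values())  # close to n

# Page "B" has no out-links: the rank it holds is dropped every round.
leaky_links = {"A": ["B"], "B": []}
leaky = iterate(leaky_links, base=(1 - d) / 2, start=0.5, d=d)
sum_leaky = sum(leaky.values())    # noticeably below 1

print(sum_prob, sum_scaled, sum_leaky)
```

On the first graph the two results differ only by the factor N. On the second, the total falls well short of 1, so pages without outgoing links inside the subset might be one reason a computed sum ends up far below the expected value.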

Gangnus answered Sep 19 '22 22:09