Google PageRanking Algorithm

Google Illustration:

The following Webpages (1) to (n=6) are linked in a network below:
eg.
Page (1) points to (4),
(2) & (3) points to (1)…

image

Let
a_{ij} = Probability (PageRank *) from Page ( i ) linked to Page (j).

(*) PageRank: a measure of how relevant the page’s content to the topic of your query. This value is computed by the proprietary formula designed by the 2 Google Founders Larry Page & Sergey Brin, whose Stanford Math Thesis mentor was Prof Tony Chan (who knows the ‘secret ‘ to put his name always on Google 1st search list.)

The Markov Transition Matrix (A) is :

A = \begin{pmatrix} a_{11} & a_{12}& \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} &\ldots & a_{nn} \end{pmatrix}

Assume we start surfing from Page (1).

We define PageRank Vector x
x = (1 0 0 0…0), the Probability of reaching from Page (1) to itself is 1, to other pages is 0.

First Iteration:

x.A = (1 0 0 0…0).\begin{pmatrix} a_{11} & a_{12}& \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} &\ldots & a_{nn} \end{pmatrix}
x.A = \begin{pmatrix} a_{11} & a_{12}& \ldots & a_{1n} \end{pmatrix} = x_1

2nd Iteration:
x_{1}.A = \begin{pmatrix} a'_{11} & a'_{12}& \ldots & a'_{1n} \end{pmatrix} = x_2
or,
x_{1}.A = (x.A).A = x.A^{2} = x_2

.
.

nth Iteration:
x_{n} = x.A^{n}

When n is large,
x_{n} converges to a steady-state vector, ie
x_{n-1} \approx x_{n}
That is,
x.A^{n-1} \approx x.A^{n}
“Cancel off” both sides by A^{n-1} (technically multiply both sides by A^{-(n-1)}
So we get,
x \approx x.A

We say that x is a Left EigenVector of A if
\boxed { x.A = x }

By Perron’s Theorem:
Every real square matrix with entries that are all positive has
■ a unique eigenvector “x” with all positive entries;
■ the x‘s corresponding eigenvalue ” λ” has only one associated eigenvector, and
■ this eigenvalue “λ” is the largest of the eigenvalues.

Applying to A (square matrix with positive real numbers),
=> one and only one (left) eigenvector x which satisfies x.A = x
=> x is unique and has all positive entries (PageRank values).

It guarantees that no matter how much the Web changes or what set of Web pages Google indexes, the PageRank vector (x) can always be found and will be unique !

Example in the above Illustration:
If the unique PageRank vector x is computed after n iterations as follow:

x = \begin{pmatrix} 0.25\\ 0.35\\ 0.63\\ 0.05\\ 0.12\\ 0.47 \end{pmatrix}
Then Google will list the search result in this order:
1st: Page 3 (0.63)
2nd: Page 6 (0.47)
3rd: Page 2 (0.35)
4th: Page 1 (0.25)
5th: Page 5 (0.12)
6th: Page 4 (0.05)

Knowing this trick, some hackers in 2003 initiated a “Google Bomb” attack on President George W. Bush, associated him with Google query “miserable failure“.

Revision:
Linear Algebra (Left & Right Eigenvectors and eigenvalues).

Ref:
《Math Bytes》by Tim Charter
Princeton University Press
[NLB #510 CHA]

Google Patents:
https://www.google.com/patents/US6285999

谷歌背后的数学:

Advertisements

One thought on “Google PageRanking Algorithm

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s