Learning Google AI



Big Picture: Linear Algebra 

The Big Picture of Linear Algebra 线性代数 (MIT OpenCourseWare, by the famous Prof Gilbert Strang)

A.X = B

A(m,n) = \begin{pmatrix} a_{11} &  a_{12} & \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{m1}&a_{m2} &\ldots & a_{mn} \end{pmatrix}

A(m, n)  is a matrix of m rows, n columns

The 4 fundamental subspaces of A:
Column Space {C(A)} , dim = r = rank (A) 
Row Space {C(A^{T})} , dim = r = rank (A)

Nullspace {N(A)}   {\perp C(A^{T})} , dim = n – r

Left Nullspace {N(A^{T})} {\perp C(A)} , dim = m – r
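
The four dimensions above can be checked on a small matrix. The sketch below is only an illustration: the 3 x 4 matrix and the helper names `rank` and `subspace_dimensions` are made up for this example, and the rank is found by plain Gaussian elimination with exact fractions.

```python
from fractions import Fraction

def rank(matrix):
    """Return the rank of a matrix (list of rows) by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        # look for a row at or below position r with a non-zero entry in this column
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # clear the entries below the pivot
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def subspace_dimensions(matrix):
    """Dimensions of the 4 subspaces of an m x n matrix of rank r."""
    m, n = len(matrix), len(matrix[0])
    r = rank(matrix)
    return {"C(A)": r, "C(A^T)": r, "N(A)": n - r, "N(A^T)": m - r}

# Hypothetical example with m = 3, n = 4: row 3 = row 1 + row 2, so r = 2.
A = [[1, 2, 0, 1],
     [0, 1, 1, 0],
     [1, 3, 1, 1]]
dims = subspace_dimensions(A)
print(dims)
```

Here the row space and the nullspace dimensions add up to n = 4, and the column space and the left nullspace dimensions add up to m = 3.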

Abstract Vector Spaces 向量空间

Any object satisfying the 8 vector-space axioms belongs to the algebraic structure of a Vector Space: e.g. Vectors, Polynomials, Functions, …

Note: “Vector Space” + “Linear Map” = “Category Vect_{K}”

Eigenvalues & Eigenvectors (valeurs propres et vecteurs propres) 特征值/特征向量

[ Note: “Eigen-” is German for Characteristic 特征.]

Important Trick: (see Monkeys & Coconuts Problem)
If a transformation A is linear, and the “vector” v is the same before and after the transformation (the status quo is kept), then v is an eigenvector of A with Eigenvalue \boxed {\lambda = 1}

\boxed {A.v = \lambda.v = 1.v  = v}
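
As a small worked check of the boxed equation, here is a hypothetical 2 x 2 matrix (chosen only for illustration, it is not from the Monkeys & Coconuts Problem) that leaves the vector v unchanged, while another vector w is moved off its own direction:

```latex
A = \begin{pmatrix} 2 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
v = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
w = \begin{pmatrix} 1 \\ 0 \end{pmatrix}

A.v = \begin{pmatrix} 2 \cdot 1 - 1 \cdot 1 \\ 1 \cdot 1 + 0 \cdot 1 \end{pmatrix}
    = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 1.v
\quad \Longrightarrow \quad \lambda = 1

A.w = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \neq \lambda.w \ \text{for any } \lambda
```

So v is an eigenvector with eigenvalue 1, and w is not an eigenvector at all.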

Computing a huge power of a full matrix:
{ \begin{pmatrix} a_{11} &  a_{12} & \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1}&a_{n2} &\ldots & a_{nn} \end{pmatrix} }^{1000000000}
is much more difficult than computing the same power of its diagonalized equivalent matrix, where only the diagonal entries need to be raised to the power:
{ \begin{pmatrix} b_{11} & 0 & \ldots & 0\\ 0 & b_{22} & \ldots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 &0 &\ldots & b_{nn} \end{pmatrix} }^{1000000000}

= { \begin{pmatrix} {b_{11}}^{1000000000}&0 &\ldots &0\\ 0 &{b_{22}}^{1000000000} &\ldots & 0\\ \vdots &\vdots & \ddots & \vdots\\ 0 &0 &\ldots &{b_{nn}}^{1000000000} \end{pmatrix}}
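
The saving can be seen on a tiny case. The sketch below assumes a hypothetical 2 x 2 matrix built as A = P.D.P^{-1} with D = diag(1, 1/2); the helper names `mat_mul`, `mat_pow_naive` and `diag_pow` are invented for this illustration. It compares repeated multiplication with raising only the diagonal entries to the power.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow_naive(A, k):
    """A^k by k - 1 full matrix multiplications (k >= 1)."""
    result = A
    for _ in range(k - 1):
        result = mat_mul(result, A)
    return result

def diag_pow(diagonal, k):
    """k-th power of a diagonal matrix: raise each diagonal entry to k."""
    n = len(diagonal)
    return [[diagonal[i] ** k if i == j else 0 for j in range(n)]
            for i in range(n)]

# Hypothetical example: columns of P are eigenvectors, D holds the eigenvalues.
P = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(-1)]]
P_inv = [[Fraction(1, 2), Fraction(1, 2)], [Fraction(1, 2), Fraction(-1, 2)]]
D = [Fraction(1), Fraction(1, 2)]
A = mat_mul(mat_mul(P, diag_pow(D, 1)), P_inv)

k = 20
via_diagonal = mat_mul(mat_mul(P, diag_pow(D, k)), P_inv)  # P . D^k . P^{-1}
via_naive = mat_pow_naive(A, k)                            # A . A . ... . A
print(via_diagonal == via_naive)
```

Exact fractions are used so that the two results can be compared with `==`; with floating-point numbers a small tolerance would be needed instead.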

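Note: This is the secret behind the Google computation on a matrix of billions of columns & billions of rows: high powers of the matrix are controlled by its eigenvalues (the diagonal entries b_{jj}), while the “PageRanks” (built from the web links coming into a particular webpage and the links going out from that webpage) are read off the eigenvector for the eigenvalue \lambda = 1.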

The Essence of Determinant (*): (行列式)

(*) The roots of the Determinant go back to the 13th-century Chinese Algebraists 李冶 (Li Ye) / 朱世杰 (Zhu Shijie) / 秦九韶 (Qin Jiushao) (金 Jin / 南宋 Southern Song / 元 Yuan dynasties) and their《天元术》. The Japanese “和算” (wasan) mathematician 关孝和 (Seki Takakazu) developed it further in 1683, at about the same time as the German mathematician Leibniz in Europe (late 17th century); the name “Determinant” came into use in the early 19th century. The same century also brought the theory of the Matrix 矩阵, a term coined by JJ Sylvester (reputedly a mathematics tutor of Florence Nightingale, the founder of modern nursing), closely linked to the application of the Determinant.

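[NOTE] In the wuxia novel 《神雕侠女》 by Jin Yong (金庸), set in the early Yuan dynasty, Huang Rong (黄蓉) solves the “determinant” (行列式) that Ying Gu (瑛姑), consort of the King of Dali, had long struggled with in vain. Was it perhaps a matter of finding eigenvalues & eigenvectors? 🙂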

Google PageRanking Algorithm

Google Illustration:

The following Webpages (1) to (n=6) are linked in a network:
Page (1) points to (4),
(2) & (3) point to (1)…


a_{ij} = Probability of going from Page ( i ) to Page ( j ) by following a link; the PageRank (*) is computed from these probabilities.

(*) PageRank: a measure of how important a page is, judged from the links pointing to it (it does not depend on the topic of your query). This value is computed by the proprietary formula designed by the 2 Google Founders Larry Page & Sergey Brin, whose Stanford Math Thesis mentor was reportedly Prof Tony Chan (who knows the ‘secret’ to put his name always on Google’s 1st search list).

The Markov Transition Matrix (A) is :

A = \begin{pmatrix} a_{11} & a_{12}& \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} &\ldots & a_{nn} \end{pmatrix}

Assume we start surfing from Page (1).

We define the starting PageRank Vector x
x = (1 0 0 0…0): the Probability of being on Page (1) is 1, and of being on any other page is 0.

First Iteration:

x.A = (1 0 0 0…0).\begin{pmatrix} a_{11} & a_{12}& \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} &\ldots & a_{nn} \end{pmatrix}
x.A = \begin{pmatrix} a_{11} & a_{12}& \ldots & a_{1n} \end{pmatrix} = x_1

2nd Iteration:
x_{1}.A = \begin{pmatrix} a'_{11} & a'_{12}& \ldots & a'_{1n} \end{pmatrix} = x_2
x_{1}.A = (x.A).A = x.A^{2} = x_2


nth Iteration:
x_{n} = x.A^{n}
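
The iteration x_{n} = x.A^{n} can be sketched in a few lines. This is a minimal illustration, not Google's actual implementation: the 3-page transition matrix is hypothetical (it is not the 6-page network of the illustration), and the names `vec_mat` and `power_iteration`, the tolerance and the iteration cap are assumptions made for this example.

```python
def vec_mat(x, A):
    """Row vector times matrix: returns x.A."""
    n_cols = len(A[0])
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(n_cols)]

def power_iteration(A, start, tol=1e-12, max_iter=10_000):
    """Repeat x <- x.A until two successive vectors differ by less than tol."""
    x = list(start)
    for step in range(1, max_iter + 1):
        nxt = vec_mat(x, A)
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt, step
        x = nxt
    return x, max_iter

# Hypothetical 3-page network: A[i][j] = probability of going from page i+1
# to page j+1. Every row sums to 1.
A = [[0.50, 0.25, 0.25],
     [0.50, 0.00, 0.50],
     [0.25, 0.25, 0.50]]

start = [1.0, 0.0, 0.0]          # start surfing from Page (1)
x_n, steps = power_iteration(A, start)
print([round(v, 6) for v in x_n], "after", steps, "iterations")
```

For this particular matrix the vectors settle at about (0.4, 0.2, 0.4), which can be confirmed by hand: multiplying that row vector by A gives the same vector back.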

When n is large,
x_{n} converges to a steady-state vector, i.e.
x_{n-1} \approx x_{n}
That is, since x_{n} = x_{n-1}.A,
x_{n-1} \approx x_{n-1}.A
(We cannot simply “cancel off” A^{n-1} from x.A^{n-1} \approx x.A^{n}, because A need not have an inverse.)
So, writing x for the steady-state vector from now on, we get
x \approx x.A

We say that x is a Left EigenVector of A (with eigenvalue λ = 1) if
\boxed { x.A = x }
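
A left eigenvector can be found by hand for a small case. The 2 x 2 transition matrix below is hypothetical, chosen only so that the arithmetic is easy to follow; the entries of x are required to sum to 1 because they are probabilities.

```latex
A = \begin{pmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{pmatrix}, \qquad
x = \begin{pmatrix} x_1 & x_2 \end{pmatrix}, \qquad x_1 + x_2 = 1

x.A = x \ \Longrightarrow \ 0.9\,x_1 + 0.5\,x_2 = x_1
\ \Longrightarrow \ x_1 = 5\,x_2
\ \Longrightarrow \ x = \begin{pmatrix} \tfrac{5}{6} & \tfrac{1}{6} \end{pmatrix}

x.A = \begin{pmatrix} 0.9 \cdot \tfrac{5}{6} + 0.5 \cdot \tfrac{1}{6} & 0.1 \cdot \tfrac{5}{6} + 0.5 \cdot \tfrac{1}{6} \end{pmatrix}
    = \begin{pmatrix} \tfrac{5}{6} & \tfrac{1}{6} \end{pmatrix} = x
```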

By Perron’s Theorem:
Every real square matrix with entries that are all positive has
■ a unique (up to scaling) eigenvector “x” with all positive entries;
■ the x’s corresponding eigenvalue “λ” has only one associated eigenvector (again up to scaling), and
■ this eigenvalue “λ” is the largest of the eigenvalues in absolute value.

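Applying to A (a square matrix whose entries are assumed to be all positive; unlinked pages give entries equal to 0, so in practice a small probability of jumping to any page is added to make every entry positive),
=> one and only one (left) eigenvector x, with entries summing to 1, which satisfies x.A = x
=> x is unique and has all positive entries (PageRank values).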

It guarantees that no matter how much the Web changes or what set of Web pages Google indexes, the PageRank vector (x) can always be found and will be unique !

Example in the above Illustration:
If the unique PageRank vector x is computed after n iterations as follows (illustrative values, not scaled to sum to 1):

x = \begin{pmatrix} 0.25\\ 0.35\\ 0.63\\ 0.05\\ 0.12\\ 0.47 \end{pmatrix}
Then Google will list the search results in this order:
1st: Page 3 (0.63)
2nd: Page 6 (0.47)
3rd: Page 2 (0.35)
4th: Page 1 (0.25)
5th: Page 5 (0.12)
6th: Page 4 (0.05)
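
The ordering above is simply a sort of the pages by their PageRank value, largest first. A minimal sketch, using the example vector from the text (the function name `rank_pages` is invented for this illustration, and a real search engine combines many more signals than this single vector):

```python
def rank_pages(scores):
    """Return (page number, score) pairs, highest score first; pages are numbered from 1."""
    numbered = enumerate(scores, start=1)
    return sorted(numbered, key=lambda pair: pair[1], reverse=True)

# PageRank values of Pages (1) to (6) from the example above.
x = [0.25, 0.35, 0.63, 0.05, 0.12, 0.47]

for position, (page, score) in enumerate(rank_pages(x), start=1):
    print(f"{position}: Page {page} ({score})")
```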

Knowing this trick, some hackers in 2003 initiated a “Google Bomb” attack on President George W. Bush, associating him with the Google query “miserable failure”.

Linear Algebra (Left & Right Eigenvectors and eigenvalues).

《Math Bytes》by Tim Chartier
Princeton University Press
[NLB #510 CHA]

Google Patents: