As the internet rapidly grew in the 1990s, it became increasingly difficult to find the right webpage or nugget of information. The internet was missing a homepage that could be a portal to the rest of the web.

Imagine a hypothetical random surfer of the internet. Upon visiting a page, the surfer follows, uniformly at random, one of the outgoing links on that page. Here, one should picture the internet as a graph where a node represents a webpage and a directed edge joins node $i$ to node $j$ if page $i$ has an outgoing link to page $j$.

Let $N$ be the number of webpages on the internet and $P$ be the $N\times N$ stochastic matrix (nonnegative entries and columns summing to $1$) such that $P_{ij}$ is the probability of transitioning from page $j$ to page $i$. We want to find an eigenvector $\underline{x}$ of $P$ corresponding to the eigenvalue $1$ so that $$ P\underline{x} = \underline{x}. $$ To interpret the vector $\underline{x}$ as a ranking of webpages, take page $j$ to be more important than page $i$ if $x_j>x_i$. Since all the columns of $P$ sum to $1$, we know that $P^T$ has the eigenvalue $1$ with eigenvector $(1,\ldots,1)^T$ and hence, a solution to $P\underline{x} = \underline{x}$ exists. The trouble is that there can be many such vectors $\underline{x}$, and the entries of $\underline{x}$ are not necessarily nonnegative: there can be many different rankings, and $x_j = 0$ is possible, leaving a webpage unranked. For example, $$ P = \begin{bmatrix} 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0\\0 & 0 & 0 & 1\\0 & 0 & 1 & 0\\ \end{bmatrix} $$ has eigenvalues $1$, $1$, $-1$, and $-1$ and leads to a nonunique ranking:

In [1]:

```
using LinearAlgebra
P = [0. 1. 0. 0.; 1. 0. 0. 0.; 0. 0. 0. 1.; 0. 0. 1. 0.]
Λ, V = eigen(P')
V[:, 3], V[:, 4]
```

Out[1]:

We need a way to make the page rankings unique. That's the main idea behind Google's PageRank algorithm.

One expects that Larry Page knew the following result:

**Theorem:**
If $P$ is a stochastic matrix with strictly positive entries, then $1$ is an eigenvalue of $P$, all other eigenvalues have modulus $<1$, $\dim\ker(P-I) = 1$, and the eigenvector with eigenvalue $1$ can be chosen with strictly positive entries.

**Proof:** Since $P$ is a stochastic matrix, $P^T\underline{e} = \underline{e}$, where $\underline{e} = (1,\ldots,1)^T$. Therefore, $1$ is an eigenvalue of $P^T$ and hence, also of $P$.

Suppose that there is an eigenvalue $\lambda$ of $P$ with eigenvector $\underline{v}$ such that $|\lambda|>1$. Then, $\|P^k\underline{v}\| = |\lambda|^k\|\underline{v}\|$ grows without bound as $k\rightarrow \infty$. But $P^k$ is itself a stochastic matrix, so its entries lie in $[0,1]$ and $\|P^k\underline{v}\|_\infty \leq \sum_j |v_j|$ stays bounded, a contradiction. Therefore, all eigenvalues must have modulus $\leq 1$.

The rest of this theorem follows directly from the Perron-Frobenius theorem.

In [2]:

```
using LinearAlgebra
# The eigenvalues of a stochastic matrix with strictly positive entries
# ensure there is a unique page ranking:
α = 0.85
N = 10
P = rand(N, N)
P = P ./ sum(P, dims=1)    # make the columns sum to 1
ev = ones(N)
abs.(eigvals(α*P + (1-α)*ev*ev'/N))
```

Out[2]:

Most researchers treat the problem of computing the vector $\underline{x}$ such that $$ (\alpha P + (1-\alpha)ve^T)\underline{x} = \underline{x} $$ as an eigenvalue problem. However, since the eigenvalue $1$ is known, we can also treat it as the following linear system to solve: $$ (I - \alpha P)\underline{x} = (1-\alpha)v, $$ where we used the fact that $e^T\underline{x} = 1$. Page's suggestion is equivalent to computing the vector $\underline{x}$ via the iteration: $$ \underline{x}^{(k+1)} = \alpha P \underline{x}^{(k)} + (1-\alpha)v, \qquad \underline{x}^{(0)} = (1,\ldots,1)^T/N. $$ This iteration is the power iteration (when viewed as the eigenvalue problem $(\alpha P + (1-\alpha)ve^T)\underline{x} = \underline{x}$) and Richardson's iteration (when viewed as the linear system $(I - \alpha P)\underline{x} = (1-\alpha)v$).
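For small examples, the linear-system view can be checked directly with a backslash solve. A sketch, using a random column-stochastic $P$ for illustration (the names `α`, `N`, `P`, and `v` mirror the cells above):

```
using LinearAlgebra

α = 0.85
N = 10
P = rand(N, N)
P = P ./ sum(P, dims=1)          # random column-stochastic matrix
v = ones(N) / N
# Solve (I - αP)x = (1-α)v directly:
x = (I - α*P) \ ((1 - α)*v)
# The solution is a probability vector: nonnegative entries summing to 1.
sum(x)
```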

Let $0<\alpha<1$ and consider the error $\underline{x}-\underline{x}^{(k+1)}$:

$$ \underline{x} - \underline{x}^{(k+1)} = \alpha P \underline{x} + (1-\alpha)v - \left(\alpha P \underline{x}^{(k)} + (1-\alpha)v\right) = \alpha P\left(\underline{x} - \underline{x}^{(k)}\right) = \cdots = \alpha^{k+1} P^{k+1}\left(\underline{x} - \underline{x}^{(0)}\right). $$ Since $P^{k+1}$ is a stochastic matrix, the $1$-norm of the error shrinks by at least a factor of $\alpha$ per step, so the iteration for finding $\underline{x}$ converges geometrically as $k\rightarrow\infty$. The implementation is simple too.

In [3]:

```
function PageRank(P::Matrix, α::Number, v::Vector)
    # Rank pages in the web, given by P, with a teleportation parameter of α.
    n = size(P, 1)
    x = ones(n) / n
    maxIter = ceil(Int, log(eps(Float64)) / log(α))
    for k = 1:maxIter
        x = α*(P*x) + (1-α)*v
    end
    return x
end
```

Out[3]:

In [4]:

```
α = 0.85
N = 10
P = rand(N, N)
P = P ./ sum(P, dims=1)
v = ones(N) / N
x = PageRank(P, α, v)
```

Out[4]:
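The geometric convergence can also be observed numerically by comparing the iterates to a reference solution of the linear system $(I - \alpha P)\underline{x} = (1-\alpha)v$. A sketch with a random stochastic matrix:

```
using LinearAlgebra

α = 0.85
N = 10
P = rand(N, N); P = P ./ sum(P, dims=1)
v = ones(N) / N
xstar = (I - α*P) \ ((1 - α)*v)  # reference solution
x = ones(N) / N
for k = 1:5
    x = α*(P*x) + (1 - α)*v
    println(norm(xstar - x))     # error shrinks by roughly a factor of α per step
end
```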

A dangling webpage is one without any outgoing links. If the surfer lands on such a page, then the surfer teleports to a webpage chosen uniformly at random. We write $$ (\alpha P + \alpha D + (1-\alpha)ve^T)\underline{x} = \underline{x}, \qquad v = (1,\ldots,1)^T/N, $$ where $D = ve_{\text{dangling}}^T$ and $e_{\text{dangling}}$ is the vector whose $j$th entry is $1$ if the $j$th webpage is dangling and $0$ otherwise. We modify our code below to deal with dangling webpages.

In [5]:

```
using SparseArrays, LinearAlgebra

function PageRankDangling(P::SparseMatrixCSC, α::Float64, v::Vector{Float64})
    # Rank pages in the web, given by P, with a teleportation parameter of α.
    # The matrix D handles dangling webpages (columns of P that are all zero).
    n = size(P, 1)
    s = vec(sum(P, dims=1))
    eDangling = (s .== 0)
    s[eDangling] .= 1     # avoid division by zero; those columns are all zero anyway
    P = P ./ s'           # normalize the columns to sum to 1
    x = ones(n) / n
    # Richardson's iteration to solve (αP + αD + (1-α)ve^T)x = x:
    maxIter = ceil(Int, log(eps(Float64)) / log(α))
    for k = 1:maxIter
        x = α*(P*x) + α*(dot(eDangling, x)*v) + (1-α)*v
    end
    return x
end
```

Out[5]:

In [6]:

```
using SparseArrays, Random
Random.seed!(1234)
n = 5000
α = 0.85
P = sprand(n, n, 10/n) .> 0   # Bool adjacency of outgoing links
v = ones(n) / n
@time x = PageRankDangling(P, α, v)
s = sortperm(x)
println("Top-ranking webpages: ", reverse(s[end-9:end]))
```

We note that we can store $P$ as $P = GS$, where $G$ is the $0$-$1$ adjacency matrix and $S$ is the diagonal matrix with $S_{jj} = 1/s_j$, the reciprocal of the $j$th column sum. While $P\underline{x}$ costs $\mathrm{nnz}(P)$ additions and $\mathrm{nnz}(P)$ multiplications, $G(S\underline{x})$ costs only $N$ multiplications (for $S\underline{x}$) and $\mathrm{nnz}(P)$ additions (for $G(S\underline{x})$).
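As a quick check of the factorization, the product $G(S\underline{x})$ agrees with multiplying by the explicitly normalized matrix. A small dense sketch with a made-up adjacency matrix:

```
G = Bool[1 0 1; 1 1 0; 0 1 0]    # 0-1 adjacency (no dangling columns here)
s = vec(sum(G, dims=1))          # column sums, so S = Diagonal(1 ./ s)
x = [0.2, 0.3, 0.5]
G * (x ./ s) ≈ (G ./ s') * x     # P*x computed two ways
```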

In [7]:

```
using SparseArrays, LinearAlgebra

function PageRankDangling_improvement(P::SparseMatrixCSC{Bool,Int64}, α::Float64, v::Vector{Float64})
    # Rank pages in the web, given by the 0-1 adjacency matrix P, with a
    # teleportation parameter of α. Dangling webpages are caught and handled.
    n = size(P, 1)
    s = vec(sum(P, dims=1))
    # Catch dangling webpages and avoid division by zero (their columns are empty):
    eDangling = (s .== 0)
    s[eDangling] .= 1
    x = ones(n) / n
    # Richardson's iteration to solve (αPS + αD + (1-α)ve^T)x = x, where PS = P ./ s'.
    # Note α*(dot(eDangling,x) - 1) + 1 = α*dot(eDangling,x) + (1-α), so both
    # teleportation terms combine into a single multiple of v:
    for k = 1:ceil(Int, log(1e-6) / log(α))
        x = α*(P*(x ./ s)) + (α*(dot(eDangling, x) - 1) + 1)*v
    end
    return x
end
```

Out[7]:

In [8]:

```
using SparseArrays, Random
Random.seed!(1234)
n = 100000
α = 0.85
P = sprand(n, n, 10/n) .> 0   # Bool adjacency of outgoing links
v = ones(n) / n
@time x = PageRankDangling_improvement(P, α, v)
s = sortperm(x)
println("Top-ranking webpages: ", reverse(s[end-9:end]))
```

Larry Page co-founded Google Inc. with Sergey Brin in 1998. His PhD work at Stanford was on understanding the link structure of the web, and while studying it he came up with the beginnings of Google's PageRank algorithm.