The Mathematics of Web Search

Raluca Tanase, Remus Radu

Math Explorer's Club

During the past two decades we have witnessed a technological boom. From the first 3 web sites that the baby Internet had (Microsoft, Netscape, and Amazon), to 800 million pages in 1999 and to over 50 billion pages today, the Internet has grown exponentially. Even casual browsing is enough to convince you that there is an enormous amount of information and links available online. However, all this information is useless unless we have a way of searching and sorting it. From the first search engine, Archie, in 1990 to the modern search engines we use today, deciding the relevance of the information available online has been a crucial problem.

In this module we analyze the mathematics behind one of the most popular search engines, Google. The first two lectures are self-contained and can be read independently. The subsequent two lectures, however, build on the material previously developed.

  1. Lecture #1: Linear Algebra - in a Nutshell
  2. Lecture #2: Directed Graphs - Transition Matrices
  3. Lecture #3: PageRank Algorithm - The Mathematics of Google Search
  4. Lecture #4: HITS Algorithm - Hubs and Authorities on the Internet
  5. Lecture #5: Internet Mathematics - Further Directions
  6. References

This work was made possible by a grant from the National Science Foundation.

A snapshot of the Internet graph. Image provided by Wikipedia.