High-speed Googling

June 19, 2003
Computer science researchers at Stanford University have developed several new techniques for calculating Web-page rankings (as used in the Google search engine) five times faster.

Google currently ranks and searches 3 billion Web pages using a ranking algorithm called Computing PageRank. The program, however, can take several days to search and rank a billion Web pages. To speed the process, Stanford researchers developed a trio of techniques in numerical linear algebra. One uses extrapolation techniques, which make some assumptions about the Web's link structure that aren't true, but still permit quick and easy computation of PageRank. Because the assumptions are false, PageRank isn't completely accurate. However it is close enough to be refined using the original PageRank algorithm. Researchers have shown that their extrapolation techniques can speed up PageRank by 50%.

Another method, BlockRank, relies on a feature of the Web's link structure. The Stanford team found that approximately 80% of the pages on any given Web site point to other pages on the same site. That means they can compute many single-site PageRanks, glue them together, and use that as a starting point for the original PageRank algorithm. This could make PageRank computation three times faster.

Other Stanford findings show the rankings for some pages are calculated early in the PageRank process, while others of highly rated pages take much longer to compute. A method called Adaptive PageRank eliminates redundant computations associated with those pages whose PageRanks finish early. This speeds computation by as much as 50%. "Further speed-ups are possible when we use all these methods," says one Stanford researcher. Still, researchers say other issues must be solved. "We are closer to a topic-based PageRank than to a personalized ranking," he says.

Complex personalized rankings need even greater speed-ups to the PageRank calculations. And while a faster algorithm shortens computation time, storage problems remain. The results from a single PageRank computation on a few billion Web pages need several gigabytes of storage. Saving a limited number of topic-specific PageRank calculations is more practical, say researchers. For more information, go to the Stanford Database Group's Publication Server at http://dbpubs.stanford.edu/.

Sponsored Recommendations

How to Build Better Robotics with Integrated Actuators

July 17, 2024
Reese Abouelnasr, a Mechatronics Engineer with Harmonic Drive, answers a few questions about the latest developments in actuators and the design or engineering challenges these...

Crisis averted: How our AI-powered services helped prevent a factory fire

July 10, 2024
Discover how Schneider Electric's services helped a food and beverage manufacturer avoid a factory fire with AI-powered analytics.

Pumps Push the Boundaries of Low Temperature Technology

June 14, 2024
As an integral part of cryotechnology, KNF pumps facilitate scientific advances in cryostats, allowing them to push temperature boundaries and approach absolute zero.

The entire spectrum of drive technology

June 5, 2024
Read exciting stories about all aspects of maxon drive technology in our magazine.

Voice your opinion!

To join the conversation, and become an exclusive member of Machine Design, create an account today!