High-speed Googling

June 19, 2003
Computer science researchers at Stanford University have developed several new techniques for calculating Web-page rankings (as used in the Google search engine) five times faster.

Google currently ranks and searches 3 billion Web pages using a ranking algorithm called Computing PageRank. The program, however, can take several days to search and rank a billion Web pages. To speed the process, Stanford researchers developed a trio of techniques in numerical linear algebra. One uses extrapolation techniques, which make some assumptions about the Web's link structure that aren't true, but still permit quick and easy computation of PageRank. Because the assumptions are false, PageRank isn't completely accurate. However it is close enough to be refined using the original PageRank algorithm. Researchers have shown that their extrapolation techniques can speed up PageRank by 50%.

Another method, BlockRank, relies on a feature of the Web's link structure. The Stanford team found that approximately 80% of the pages on any given Web site point to other pages on the same site. That means they can compute many single-site PageRanks, glue them together, and use that as a starting point for the original PageRank algorithm. This could make PageRank computation three times faster.

Other Stanford findings show the rankings for some pages are calculated early in the PageRank process, while others of highly rated pages take much longer to compute. A method called Adaptive PageRank eliminates redundant computations associated with those pages whose PageRanks finish early. This speeds computation by as much as 50%. "Further speed-ups are possible when we use all these methods," says one Stanford researcher. Still, researchers say other issues must be solved. "We are closer to a topic-based PageRank than to a personalized ranking," he says.

Complex personalized rankings need even greater speed-ups to the PageRank calculations. And while a faster algorithm shortens computation time, storage problems remain. The results from a single PageRank computation on a few billion Web pages need several gigabytes of storage. Saving a limited number of topic-specific PageRank calculations is more practical, say researchers. For more information, go to the Stanford Database Group's Publication Server at http://dbpubs.stanford.edu/.

Sponsored Recommendations

NEW Low Profile, Ultra Compact Power Supplies

March 13, 2024
Learn more HERE about Altech's Power supplies!

Altech's Liquid Tight Strain Relifs Catalog

March 13, 2024
With experienced Product Engineers and Customer Service personnel, Altech provides solutions to your most pressing application challenges. All with one thought in mind - to ensure...

Industrial Straight-Through Cable Gland

March 13, 2024
Learn more about Altech's cable glands and all they have to offer for your needs!

All-In-One DC-UPS Power Solutions

March 13, 2024
Introducing the All-In-One DC-UPS, a versatile solution combining multiple functionalities in a single device. Serving as a power supply, battery charger, battery care module,...

Voice your opinion!

To join the conversation, and become an exclusive member of Machine Design, create an account today!