High-speed Googling

June 19, 2003
Computer science researchers at Stanford University have developed several new techniques for calculating Web-page rankings (as used in the Google search engine) five times faster.

Google currently ranks and searches 3 billion Web pages using a ranking algorithm called Computing PageRank. The program, however, can take several days to search and rank a billion Web pages. To speed the process, Stanford researchers developed a trio of techniques in numerical linear algebra. One uses extrapolation techniques, which make some assumptions about the Web's link structure that aren't true, but still permit quick and easy computation of PageRank. Because the assumptions are false, PageRank isn't completely accurate. However it is close enough to be refined using the original PageRank algorithm. Researchers have shown that their extrapolation techniques can speed up PageRank by 50%.

Another method, BlockRank, relies on a feature of the Web's link structure. The Stanford team found that approximately 80% of the pages on any given Web site point to other pages on the same site. That means they can compute many single-site PageRanks, glue them together, and use that as a starting point for the original PageRank algorithm. This could make PageRank computation three times faster.

Other Stanford findings show the rankings for some pages are calculated early in the PageRank process, while others of highly rated pages take much longer to compute. A method called Adaptive PageRank eliminates redundant computations associated with those pages whose PageRanks finish early. This speeds computation by as much as 50%. "Further speed-ups are possible when we use all these methods," says one Stanford researcher. Still, researchers say other issues must be solved. "We are closer to a topic-based PageRank than to a personalized ranking," he says.

Complex personalized rankings need even greater speed-ups to the PageRank calculations. And while a faster algorithm shortens computation time, storage problems remain. The results from a single PageRank computation on a few billion Web pages need several gigabytes of storage. Saving a limited number of topic-specific PageRank calculations is more practical, say researchers. For more information, go to the Stanford Database Group's Publication Server at http://dbpubs.stanford.edu/.

Sponsored Recommendations

Diaphragm Pump Technology Drives Industrial Washers

Jan. 23, 2025
Discover high-performance pumps and systems built to handle various gases, liquids, and chemicals with precision.

Harmonic Drive Actuators with Integrated Drive Technology

Jan. 17, 2025
Discover the future of motion control.In this video, we explore how integrated drive technology (IDT) from Harmonic Drive is revolutionizing the precision mo...

7 factors to Consider When Choosing the Right Gear Technology

Jan. 17, 2025
Choosing a drive involves several design factors that depend greatly on the task at hand. This top 7 list will guide you, whether your task requires precise and exact movements...

What are the Benefits of Actuators with Integrated Servo Drives?

Jan. 17, 2025
Actuators with Integrated Servo Drive Technology (IDT) simplify cable management, control hardware, and commissioning while achieving outstanding performance in a compact size...

Voice your opinion!

To join the conversation, and become an exclusive member of Machine Design, create an account today!