View Full Version : Google's secret formula


Susie
09-27-2007, 03:51 AM
http://www.portfolio.com/culture-lifestyle/goods/gadgets/2007/08/13/How-Google-Works


Google's Secret Formula
by Paul Smalera, September 2007 Issue
In the past 12 months, Google doubled its staff, tinkered with its search engine to speed up results, and now answers more queries than Microsoft and Yahoo combined. But there’s one query we had to answer ourselves: How does Google work?

Blame spell-check. Ten years ago this September, so the story goes, some Stanford grad students were helping Larry Page choose a name for his search engine. “Googolplex,” said Sean Anderson. (They’d already sensed how big this could become.) “Googol,” Page replied. Anderson, checking to see if the name was taken, typed g-o-o-g-l-e into his browser and made the most famous spelling mistake since p-o-t-a-t-o-e. Page registered the name within hours, and today, Google isn’t a typo, it’s a verb, one with a market cap of about $160 billion. Here, then, is a guide to what happens during a typical Google search (now, of course, with automatic spell-check).

1. Query Box
It all starts with somebody typing in a request for information about the safest dog food, what time the D.M.V. closes, or what the prime rate is in China.

2. Domain-Name Servers
“Hello, this is your operator . . . ”
The software for Google’s domain-name servers runs on computers in leased or company-owned data centers all over the world, including one in the old Port Authority headquarters in Manhattan. Their sole purpose is to shepherd searches into one of Google’s clusters as efficiently as possible, taking into account which clusters are nearest to the searcher and which are least busy at that instant.
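To make the routing idea concrete, here is a minimal Python sketch of choosing a cluster by weighing distance against load. The article does not describe Google's actual scoring, so the `Cluster` fields, the `load_weight` parameter and the sample numbers are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    distance_km: float  # hypothetical distance from the searcher
    load: float         # hypothetical utilisation: 0.0 is idle, 1.0 is saturated


def pick_cluster(clusters, load_weight=5000.0):
    """Return the cluster with the lowest combined distance-and-load score.

    load_weight is an invented knob: it says how many kilometres of extra
    distance a fully loaded cluster is "worth".
    """
    if not clusters:
        raise ValueError("no clusters available")
    return min(clusters, key=lambda c: c.distance_km + load_weight * c.load)


# Made-up sample data: the closest cluster is nearly saturated.
clusters = [
    Cluster("nearby-busy", distance_km=50, load=0.95),
    Cluster("regional-idle", distance_km=800, load=0.10),
    Cluster("faraway-idle", distance_km=9000, load=0.05),
]
best = pick_cluster(clusters)
print(best.name)
```

With these numbers the slightly more distant but idle cluster wins; setting `load_weight=0` would fall back to picking purely by distance.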

3. The Cluster
The request continues into one of at least 200 clusters, which sit in Google-owned data centers worldwide.

4. Google Web Server
This program splits a query among hundreds or thousands of machines so that they can all work on it at the same time. It’s the difference between doing your grocery shopping all by yourself and having 100 people simultaneously find one item and toss it into your cart.
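The grocery-cart picture can be sketched as a "scatter-gather" pattern: send the same query to many workers at once, then merge what each one finds. This is only a toy, assuming three in-memory shards and a thread pool standing in for separate machines; `SHARDS`, `search_shard` and `fan_out` are hypothetical names, not anything from Google.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each one holds a slice of the documents.
SHARDS = [
    ["safest dog food brands", "dog food recall list"],
    ["dmv closing time", "prime rate in china"],
    ["homemade dog food recipes", "cat food reviews"],
]


def search_shard(shard, query):
    """Return the documents in one shard that contain every query word."""
    words = query.lower().split()
    return [doc for doc in shard if all(w in doc.split() for w in words)]


def fan_out(query, shards=SHARDS):
    """Send the query to every shard in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # One task per shard; list() collects the results before the pool closes.
        partials = list(pool.map(search_shard, shards, [query] * len(shards)))
    return sorted(hit for partial in partials for hit in partial)


print(fan_out("dog food"))
```

Each worker only looks at its own slice, so adding shards shrinks the work per worker rather than lengthening the search.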

5. Index Server
Everything Google knows is stored in a massive database. But rather than waiting for one computer to sift through those gigabytes of data, Google has hundreds of computers scan its “card catalog” at the same time to find every relevant entry. Popular searches are cached—held in memory—for a few hours rather than run all over again. That means you, Britney.
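A "card catalog" of this kind is usually built as an inverted index: a map from each word to the documents that contain it. The sketch below pairs a tiny inverted index with a time-limited cache for repeated queries. The three-hour default, the `CachedIndex` class and the sample documents are assumptions for illustration; the article says only that popular searches are held "for a few hours".

```python
import time

DOCS = {  # hypothetical document ids and text
    1: "safest dog food",
    2: "dog park hours",
    3: "prime rate china",
}


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


class CachedIndex:
    def __init__(self, docs, ttl_seconds=3 * 3600, clock=time.monotonic):
        self.index = build_index(docs)
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be demonstrated
        self.cache = {}     # query -> (expiry time, result)
        self.lookups = 0    # counts real index lookups, for illustration

    def search(self, query):
        now = self.clock()
        hit = self.cache.get(query)
        if hit and hit[0] > now:
            return hit[1]  # still fresh: skip the index entirely
        self.lookups += 1
        postings = [self.index.get(w, set()) for w in query.lower().split()]
        # A document matches only if it appears under every query word.
        result = sorted(set.intersection(*postings)) if postings else []
        self.cache[query] = (now + self.ttl, result)
        return result


idx = CachedIndex(DOCS)
print(idx.search("dog"), idx.search("dog"), idx.lookups)
```

The second identical query is answered from the cache, so the lookup counter only advances once.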

6. Document Server
After the index server compiles its results, the document server pulls all the relevant documents (the links and snippets of text) from its massive database. How does Google search the Web so quickly? It doesn’t. It keeps three copies of all the information from the internet that it has indexed in its own document servers, and all those data have already been prepped and sorted.
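As a rough illustration of serving results from stored copies rather than the live web, the Python sketch below keeps three identical replicas of a small document store and builds a snippet from whichever replica is picked. The URLs, the stored text, the `snippet` window size and the random replica choice are all assumptions; the article states only that three copies exist.

```python
import random

# Hypothetical stored copies: url -> text captured at crawl time.
STORE = {
    "example.com/dogs": "Vets say the safest dog food lists meat as the first ingredient.",
    "example.com/dmv": "The DMV office closes at five on weekdays.",
}
REPLICAS = [dict(STORE) for _ in range(3)]  # three identical copies


def snippet(text, query, width=30):
    """Return a short window of text around the first query word found."""
    lowered = text.lower()
    for word in query.lower().split():
        pos = lowered.find(word)
        if pos != -1:
            start = max(0, pos - width)
            end = min(len(text), pos + len(word) + width)
            return text[start:end]
    return text[:2 * width]  # no query word found: fall back to the opening


def fetch(urls, query, replicas=REPLICAS):
    """Build (url, snippet) pairs from any one replica, never from the live web."""
    replica = random.choice(replicas)
    return [(url, snippet(replica[url], query)) for url in urls if url in replica]


for url, text in fetch(["example.com/dogs", "example.com/dmv"], "dog food"):
    print(url, "->", text)
```

Because every replica holds the same data, any one of them can answer, which is one plausible reason to keep several copies.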