
How does Google store the index? [closed]

Tags:

indexing

Lately I have been reading about web crawling, indexing and serving. I found some information in Google Webmaster Tools (Google Basics) about how Google crawls the web and serves search results. What I am wondering is how they store all those indexes. I mean, that's a lot to store, right? How do they do it?

Thanks

Nobita asked Sep 01 '11 08:09


People also ask

How does Google store its index?

Crawling: Google downloads text, images, and videos from pages it found on the internet with automated programs called crawlers. Indexing: Google analyzes the text, images, and video files on the page, and stores the information in the Google index, which is a large database.

Where is Google index stored?

Content that's accessed every second will end up being stored in RAM or on SSDs. This represents a small part of Google's entire index. The bulk of Google's index is stored on hard drives because, in Illyes' words, hard drives are cheap, accessible, and easy to replace.

How does Google store its data?

Like most search engines, Google indexes documents by building a data structure known as an inverted index. Such an index maps a query word to the list of documents that contain it. The index is very large because of the number of documents it covers, so it is partitioned by document ID into many pieces called shards.
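
To make the idea of an inverted index split into shards by document ID more concrete, here is a minimal Python sketch. It is only an illustration, not how Google implements it: the class names `Shard` and `ShardedIndex`, the modulo routing rule, and the whitespace tokenizer are all assumptions made for this example.

```python
from collections import defaultdict


class Shard:
    """One partition of the index: maps each word to the IDs of the documents in this shard."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Naive tokenizer (assumption): lowercase and split on whitespace.
        for word in text.lower().split():
            self.postings[word].add(doc_id)

    def lookup(self, word):
        # .get avoids creating an empty entry for unknown words.
        return self.postings.get(word.lower(), set())


class ShardedIndex:
    """Hypothetical index partitioned by document ID."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def add(self, doc_id, text):
        # Routing rule (assumption): document ID modulo the number of shards.
        self.shards[doc_id % len(self.shards)].add(doc_id, text)

    def search(self, word):
        # A query word is looked up in every shard and the results are merged.
        result = set()
        for shard in self.shards:
            result |= shard.lookup(word)
        return sorted(result)


index = ShardedIndex(num_shards=3)
index.add(1, "google crawls the web")
index.add(2, "the index is sharded")
index.add(3, "crawlers download web pages")

print(index.search("web"))  # [1, 3]
print(index.search("the"))  # [1, 2]
```

Because each document lives in exactly one shard, every shard can be built and queried independently, and a search only has to merge the per-shard result lists.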

How long does it take Google to index site changes?

It takes between 4 days and 4 weeks for your brand new website to be crawled and indexed by Google. This range, however, is fairly broad and has been challenged by those who claim to have indexed sites in less than 4 days.


1 Answer

I'm answering my own question because I found some interesting material about the Google index:

  • On the Google Webmasters YouTube channel, Matt Cutts gives some references about the architecture behind the Google index: Google Webmaster YouTube Channel
  • One of those references, which in my opinion is well worth reading, is The Anatomy of a Large-Scale Hypertextual Web Search Engine

This helped me understand it better, and I hope it helps you too!

Nobita answered Oct 21 '22 06:10