
What searching algorithm/concept is used in Google?


Dhanapal asked Apr 01 '09 05:04



2 Answers

The Anatomy of a Large-Scale Hypertextual Web Search Engine

Brian Campbell answered Sep 30 '22 16:09


Indexing

If you want to get down to basics:

Google uses an inverted index of the Web. This means Google keeps, for every term, a list of all the pages it has crawled that contain that term. For instance, the term Google maps to this page, the Google home page, and the Wikipedia article for Google, amongst others.

Thus, when you go to Google and type "Google" into the search box, Google looks up the entry for the term "Google" in its index and with it gets the list of all pages that contain that term.
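As a rough illustration of the idea, here is a minimal sketch of an inverted index in Python. The page ids, page texts, and the function names build_inverted_index and lookup are made up for this example, and whitespace tokenization is a simplifying assumption; this is not how Google's index is actually built.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each lowercased term to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Assumption: naive whitespace tokenization, no stemming or punctuation handling.
        for term in text.lower().split():
            index[term].add(page_id)
    return index


def lookup(index, term):
    """Return the sorted list of page ids whose text contains the term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical toy "web" of four pages.
pages = {
    "google-home": "Google search",
    "wikipedia-google": "Google is a technology company",
    "this-page": "what searching algorithm is used in Google",
    "unrelated": "a page about cooking",
}

index = build_inverted_index(pages)
print(lookup(index, "Google"))
```

The lookup is a single dictionary access, which is why answering a one-term query does not require scanning every page.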

For veteran users:

Google's index goes beyond your simple inverted index, however. Beyond just keeping track of the terms that are on any given web page, Google's crawlers (spiders) also keep track of words that are on related pages, such as the text of links pointing at the page, and associate those with the given document.

In other words, if a page has the term Google in it and the page has a link to or is linked from another web page, the other page may be referenced in the index under the term Google as well. All this and more go into why a given page is returned for a given query.
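One way to picture this is to index a page under the text of the links that point to it, in addition to its own words. The sketch below assumes that simplified rule; the function name index_with_anchor_text, the page ids, and the (source, target, anchor text) link tuples are all hypothetical, and the real system weighs many more signals.

```python
from collections import defaultdict


def index_with_anchor_text(pages, links):
    """Index pages by their own terms and by the anchor text of inbound links."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    # Each link is (source page, target page, anchor text).
    # Assumption: anchor-text terms are credited to the page being linked to.
    for _source, target, anchor in links:
        for term in anchor.lower().split():
            index[term].add(target)
    return index


# Page "a" never uses the word "google" itself.
pages = {"a": "maps and directions", "b": "best search engines"}
links = [("b", "a", "Google Maps")]

index = index_with_anchor_text(pages, links)
print(sorted(index["google"]))
```

Here page "a" becomes findable under "google" only because another page describes it that way in a link.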

If you want to go into why pages are ordered the way they are in your search results, that gets into even more interesting stuff.

Ranking

To get down to basics:

Perhaps one of the most basic techniques a search engine can use to sort your results is known as term frequency-inverse document frequency (tf-idf). Simply put, this means that your results will be ordered by the relative importance of your search terms in each document: a term counts for more the more often it appears in a document relative to that document's length, and for less the more documents in the collection contain it. In other words, a document that has 10 pages and lists the word Google once is not nearly as strong a match as a document that has 1 page and lists the word Google ten times.
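A minimal sketch of that scoring, assuming one common tf-idf variant (term count divided by document length, times the natural log of total documents over documents containing the term). The function name tf_idf_scores and the sample documents are invented for illustration; real engines use smoothed and more elaborate variants.

```python
import math


def tf_idf_scores(term, docs):
    """Score each document that contains the term; higher means more relevant."""
    term = term.lower()
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    containing = sum(1 for words in tokenized.values() if term in words)
    if containing == 0:
        return {}
    # Inverse document frequency: terms found in fewer documents weigh more.
    idf = math.log(len(docs) / containing)
    return {
        doc_id: (words.count(term) / len(words)) * idf  # tf * idf
        for doc_id, words in tokenized.items()
        if term in words
    }


# Hypothetical documents: a short one dense in the term, a long one that mentions it once.
docs = {
    "short": "google google search",
    "long": "a long essay that mentions google once among many other words",
    "other": "cooking recipes",
}

scores = tf_idf_scores("google", docs)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The short, term-dense document ranks ahead of the long one, matching the 1-page versus 10-page example above.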

For veteran users:

Again, Google does quite a bit more than your basic search engine when it comes to ranking results. Google has implemented the patented PageRank algorithm. In short form, PageRank complements term-based scoring such as tf-idf by taking into account the popularity/importance of a given page. Popularity/importance may be judged by any number of factors that Google just won't tell us. However, at the most basic of levels, Google can tell that one page is more important than another because loads and loads of other pages link to it, and links from important pages count for more.
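The link-counting idea can be sketched as a simplified PageRank computed by repeated iteration. This follows the textbook formulation with a damping factor of 0.85 (a conventional value, used here as an assumption); the function name pagerank and the four-page link graph are hypothetical, and Google's production ranking is not public.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: three pages link to "hub", and "hub" links back to "a".
links = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}

rank = pagerank(links)
print(max(rank, key=rank.get))
```

Note that "a" ends up ranked above "b" and "c" even though all three have the same outgoing links, because "a" is linked from the highly ranked "hub".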

dustyburwell answered Sep 30 '22 18:09