
How is Amazon Faceted Search so fast?

Tags: algorithm

Search for a term on amazon.com, for example "stack overflow", and the search results come back very quickly.

On the left-hand side of the window there is a faceted search panel that shows, for certain categories, the count of products that match that term.

You can then drill into those categories. For example, there are 1094 books that match the term, which are broken down into Computers & Internet (1003), Science, etc.

Given that the search for books covers the contents of some of those books, it strikes me that this is a very impressive feat.

How does Amazon do this? Massive parallelization, e.g. with each node knowing about only a few products?

Incidentally, I saw that "stack overflow" appears in the text of "Soul of a New Machine", a book I remember from 1981.

Alan asked Feb 17 '09 01:02


1 Answer

The short answer is, a lot of indexing. The longer answer is, a lot of indexing, a lot of redundancy, a lot of caching, and smart partitioning.

The real answer is to read this book, Introduction to Information Retrieval (Manning, Raghavan, and Schütze): http://www-csli.stanford.edu/~hinrich/information-retrieval-book.html

(It's free, and it's very good.)
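To make "indexing, caching, and partitioning" a bit more concrete, here is a minimal Python sketch of the general idea. It is not Amazon's implementation, which isn't public. The catalog, the shard count, and all function names are made up for illustration. Each shard holds an inverted index for its slice of the products, a query is sent to every shard, and the per-category counts are merged and cached. Redundancy (replicating shards) is left out for brevity.

```python
from collections import Counter, defaultdict
from functools import lru_cache

# Hypothetical toy catalog: (product id, category, searchable text).
CATALOG = [
    (1, "Computers & Internet", "debugging a stack overflow in C"),
    (2, "Computers & Internet", "stack and queue data structures"),
    (3, "Science", "river overflow and flood modelling"),
    (4, "Literature", "engineers chase a stack overflow"),
]

NUM_SHARDS = 2  # assumed partition count; a real system would use far more


def build_shard(products):
    """Build one shard: an inverted index plus a product -> category map."""
    postings = defaultdict(set)  # term -> ids of products containing it
    category_of = {}
    for pid, category, text in products:
        category_of[pid] = category
        for term in text.lower().split():
            postings[term].add(pid)
    return postings, category_of


# Partition by product id, so each shard only knows about some products.
SHARDS = [
    build_shard([p for p in CATALOG if p[0] % NUM_SHARDS == s])
    for s in range(NUM_SHARDS)
]


def search_shard(shard, terms):
    """Return the ids in this shard whose text contains every query term."""
    postings, _ = shard
    hits = [postings.get(t, set()) for t in terms]
    return set.intersection(*hits) if hits else set()


@lru_cache(maxsize=1024)  # repeated queries are answered from the cache
def faceted_search(query):
    """Send the query to every shard, then merge ids and facet counts."""
    terms = tuple(query.lower().split())
    ids, facets = [], Counter()
    for shard in SHARDS:
        matched = search_shard(shard, terms)
        ids.extend(matched)
        # Count matches per category using this shard's category map.
        facets.update(shard[1][pid] for pid in matched)
    # Return immutable tuples so cached results cannot be mutated by callers.
    return tuple(sorted(ids)), tuple(sorted(facets.items()))
```

With this toy data, faceted_search("stack overflow") returns ((1, 4), (("Computers & Internet", 1), ("Literature", 1))). The shards are queried in a plain loop here; in a real deployment they would be separate machines queried in parallel, which is where the speed comes from.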

SquareCog answered Sep 28 '22 17:09