Let's say I search for "baby". Sphinx will grab all the documents that have "baby" in it, and then sort it using my own algorithm. (EXTENDED mode).
The question is, how can it sort so fast? How does it grab millions of records and then sort them within milliseconds?
Oh, you're asking about the magic. Sphinx (and Lucene, and many other search engines) are using an inverted index.
Basically, every document is chopped into tokens; The search index consists of a mapping from tokens to documents called the postings list. Processing a query includes going over the postings lists of the query terms and locating matching documents. To make this faster, the tokens are stored as a list of integers. This can be made even more efficient by compressing the index.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With