How to make inverted index search faster?

Question

I am designing the architecture of a full-text search engine. One of the requirements is processing queries over large datasets with low response time. One thing I have figured out is to split the inverted index into partitions. There are two strategies for this: term-based partitioning and document-based partitioning. But I would really like to know whether there are any other ways to make inverted index search faster over large datasets.

Felipe Hummel · Accepted Answer

This video is a talk by Shay Banon, the developer of ElasticSearch, a distributed full-text search engine. In it he discusses the pros and cons of term-based and document-based partitioning.

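Basically, term-based partitioning generates too much network traffic between processes/nodes, because the postings lists for a multi-term query live on different nodes and have to be shipped across the network to be combined. It is also harder to implement well. Document-based partitioning is much simpler to implement and to get working results from.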

Moreover, in this lecture Jeffrey Dean also explains the differences and says that Google uses document-based partitioning.

These are the two main ways to distribute a search engine. I'm not aware of any others. You may still want to search the Information Retrieval literature for novel work on the subject.
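To make the document-based scheme more concrete, here is a minimal single-process sketch, not how ElasticSearch or Google actually implement it. The names `Shard` and `DocPartitionedIndex`, the modulo routing on integer document ids, the AND-only query semantics, and the term-frequency score are all assumptions made for illustration. Threads stand in for what would be separate processes or nodes in a real deployment.

```python
import heapq
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor


class Shard:
    """One partition: a complete inverted index over a subset of the documents."""

    def __init__(self):
        # term -> {doc_id: term frequency in that document}
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        for term, tf in Counter(text.lower().split()).items():
            self.postings[term][doc_id] = tf

    def search(self, terms, k):
        """Local top-k for an AND query, scored by summed term frequency."""
        lists = [self.postings.get(t, {}) for t in terms]
        if not lists:
            return []
        # Intersect starting from the shortest postings list.
        lists.sort(key=len)
        hits = []
        for doc_id in lists[0]:
            if all(doc_id in p for p in lists[1:]):
                hits.append((sum(p[doc_id] for p in lists), doc_id))
        return heapq.nlargest(k, hits)


class DocPartitionedIndex:
    """Scatter-gather front end over several shards (hypothetical helper)."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def add(self, doc_id, text):
        # Assumption: integer doc ids, routed to a shard by modulo.
        self.shards[doc_id % len(self.shards)].add(doc_id, text)

    def search(self, query, k=10):
        terms = query.lower().split()
        # Scatter: every shard answers the whole query on its own documents.
        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = pool.map(lambda s: s.search(terms, k), self.shards)
            # Gather: only k small (score, doc_id) pairs per shard are merged.
            merged = [hit for part in partials for hit in part]
        return heapq.nlargest(k, merged)
```

Each shard resolves the full query locally and returns at most `k` small `(score, doc_id)` pairs, so no postings lists cross shard boundaries. That is the property that keeps network traffic low compared with term-based partitioning.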

Tags:

algorithm

search

full-text-search

parallel-processing

information-retrieval

Mickey Shine
