
Multiple or single index in Lucene?

Tags:

lucene

I have to index different kinds of data (text documents, forum messages, user profile data, etc.) that should be searched together (i.e., a single search would return results from all the different kinds of data).

  • What are the advantages and disadvantages of having multiple indexes, one for each type of data?

  • And the advantages and disadvantages of having a single index for all kinds of data?

Thank you.

Bruno Reis asked Apr 30 '10 17:04



1 Answer

If you want to search all types of document with one search, it's better to keep all types in one index. In that index you can define as many fields as you need and choose, per field, whether to tokenize it or store term vectors for it. With several indexes, it also takes extra time to open an IndexSearcher on each index directory.

If you want to search each type separately, it would be better to index each type in its own index. A single index keeps everything in one structure, which is simpler to manage than multiple indexes.

On the other hand, multiple indexes let you balance the load.
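To make the single-index option more concrete, here is a minimal sketch of the idea in plain Java. It is a toy model, not the Lucene API: the class `SingleIndexSketch`, its methods, and the `type` field are all hypothetical names chosen for illustration. It only shows how one index with a per-document type field can serve both a combined search and a per-type search.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy stand-in for a single search index; NOT the Lucene API. */
class SingleIndexSketch {
    // term -> ids of the documents containing it (a tiny inverted index)
    private final Map<String, List<String>> postings = new HashMap<>();
    // document id -> value of its hypothetical "type" field
    private final Map<String, String> typeOf = new HashMap<>();

    /** Index one item of any kind; the type is kept as a single untokenized value. */
    void add(String id, String type, String text) {
        typeOf.put(id, type);
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) continue;
            List<String> ids = postings.computeIfAbsent(term, t -> new ArrayList<>());
            if (!ids.contains(id)) ids.add(id);
        }
    }

    /** Search every type at once, or pass a type to restrict the results. */
    List<String> search(String term, String typeFilter) {
        List<String> hits = new ArrayList<>();
        for (String id : postings.getOrDefault(term.toLowerCase(Locale.ROOT), List.of())) {
            if (typeFilter == null || typeFilter.equals(typeOf.get(id))) hits.add(id);
        }
        return hits;
    }

    public static void main(String[] args) {
        SingleIndexSketch index = new SingleIndexSketch();
        index.add("doc-1", "document", "Tuning a Lucene index");
        index.add("msg-1", "forum", "Has anyone tried Lucene for forum search?");
        index.add("usr-1", "profile", "Search engineer");

        System.out.println(index.search("lucene", null));    // hits from all types
        System.out.println(index.search("lucene", "forum")); // forum messages only
    }
}
```

In Lucene itself the same idea would typically be an extra untokenized field on each Document plus a query clause on that field; the exact classes depend on the Lucene version you use.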

Mahdi Amrollahi answered Sep 19 '22 19:09