
Lucene's MultiSearcher vs IndexSearcher with MultiReader

I am about to write a near-real-time search application with distributed indexes, and I am wondering what the correct approach is for searching across multiple indexes.

I have read about MultiSearcher, so one approach would be:

// Option 1: one IndexSearcher per index, combined by a MultiSearcher.
IndexSearcher[] indexSearchers = new IndexSearcher[indexCount];

for (int i = 0; i < indexCount; i++) {
    // Each index lives in its own numbered subdirectory of indexdir.
    File directory = new File(indexdir, String.valueOf(i));
    IndexWriter indexWriter = new IndexWriter(FSDirectory.open(directory), analyzer, IndexWriter.MaxFieldLength.LIMITED);

    // Near-real-time reader obtained directly from the writer.
    IndexReader indexReader = indexWriter.getReader();
    indexSearchers[i] = new IndexSearcher(indexReader);
}

MultiSearcher searcher = new MultiSearcher(indexSearchers);

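But as far as I can see, this is also possible:

// Option 2: one IndexSearcher over a MultiReader that wraps all index readers.
IndexReader[] indexReaders = new IndexReader[indexCount];

for (int i = 0; i < indexCount; i++) {
    // Each index lives in its own numbered subdirectory of indexdir.
    File directory = new File(indexdir, String.valueOf(i));
    IndexWriter indexWriter = new IndexWriter(FSDirectory.open(directory), analyzer, IndexWriter.MaxFieldLength.LIMITED);

    // Near-real-time reader obtained directly from the writer.
    indexReaders[i] = indexWriter.getReader();
}

IndexSearcher searcher = new IndexSearcher(new MultiReader(indexReaders));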

Is there any significant difference between these two approaches? The second one would be easier to handle when the reader is out of date, because I could just call MultiReader.reopen() instead of iterating over all the IndexReaders, reopening them, and then creating new IndexSearchers.

woezelmann asked Mar 22 '12 10:03


1 Answer

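You should use the second option. MultiSearcher is deprecated in Lucene 3.x, and its Javadoc recommends using an IndexSearcher over a MultiReader instead: http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/all/org/apache/lucene/search/MultiSearcher.html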
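
To illustrate why the second option is easier to keep fresh, here is a minimal plain-Java model of the reopen-and-swap pattern the question describes. It deliberately does not use Lucene: ModelIndex, ModelReader, SingleReader, and CompositeModelReader are hypothetical stand-ins, and the only behavior assumed is that reopen() returns the same instance when nothing has changed and a new instance otherwise.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java model of the reopen-and-swap pattern. None of these types are
// Lucene classes; they only mimic the contract "reopen() returns the same
// instance when nothing changed, otherwise a fresh reader".
final class ReopenSketch {

    /** Hypothetical stand-in for an index whose content can change. */
    static final class ModelIndex {
        private final AtomicInteger version = new AtomicInteger();

        void commitChange() { version.incrementAndGet(); }

        int version() { return version.get(); }
    }

    /** Hypothetical reader contract: a point-in-time view that can be refreshed. */
    interface ModelReader {
        /** Returns this if the view is still current, otherwise a fresh reader. */
        ModelReader reopen();
    }

    /** Reader over a single index, pinned to the version seen when it was opened. */
    static final class SingleReader implements ModelReader {
        private final ModelIndex index;
        private final int seenVersion;

        SingleReader(ModelIndex index) {
            this.index = index;
            this.seenVersion = index.version();
        }

        @Override
        public ModelReader reopen() {
            return index.version() == seenVersion ? this : new SingleReader(index);
        }
    }

    /** Composite reader: a single reopen() call refreshes every sub-reader. */
    static final class CompositeModelReader implements ModelReader {
        private final ModelReader[] subReaders;

        CompositeModelReader(ModelReader... subReaders) {
            this.subReaders = subReaders.clone();
        }

        @Override
        public ModelReader reopen() {
            ModelReader[] refreshed = new ModelReader[subReaders.length];
            boolean changed = false;
            for (int i = 0; i < subReaders.length; i++) {
                // Unchanged sub-readers are reused as-is; stale ones are replaced.
                refreshed[i] = subReaders[i].reopen();
                changed |= refreshed[i] != subReaders[i];
            }
            return changed ? new CompositeModelReader(refreshed) : this;
        }

        ModelReader subReader(int i) { return subReaders[i]; }
    }
}
```

With a composite like this, the caller compares the result of one reopen() call with the old reader and rebuilds a single searcher only when the two differ. With one searcher per index, the same check and rebuild has to be repeated for every sub-index.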

jpountz answered Oct 13 '22 01:10