Lucene good practice and thread safety

I'm using Lucene to index documents, run a search over them, and then immediately delete them. The whole sequence can be treated as a roughly atomic action made up of the following steps:

index (writer) --> search (searcher) --> get docs by score (reader) --> delete docs (reader)

This action can be performed by multiple concurrent threads on the same index (using FSDirectory).

IMPORTANT NOTE: each thread handles a separate set of documents, so one thread will never touch another thread's documents.

With that in mind, I have a few questions:

1) Should I use a single instance (shared by all threads) of IndexWriter, IndexReader and IndexSearcher? (They're supposed to be thread safe.)

2) Can an IndexWriter modify an index while an IndexReader deletes documents? Do I need to close one for the other to do its thing? In other words, can one thread write to an index while another one deletes from it? (As I mentioned earlier, I can guarantee that they handle separate sets of data.)

3) Any other good practices and suggestions you might have will be most appreciated.

Thanks a lot!

levtatarov asked Jan 16 '12 10:01


1 Answer

IndexWriter, IndexReader and IndexSearcher are all thread-safe according to the API javadoc:

NOTE: IndexSearcher instances are completely thread safe, meaning multiple threads can call any of its methods, concurrently

NOTE: IndexReader instances are completely thread safe, meaning multiple threads can call any of its methods, concurrently.

NOTE: IndexWriter instances are completely thread safe, meaning multiple threads can call any of its methods, concurrently

Multiple read-only IndexReaders can be opened, but it's better to share one (for performance reasons).

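Only a single IndexWriter can be open on an index at a time (it creates a write lock to prevent others from being opened on the same index). Note that deleting through an IndexReader also needs that write lock, so in Lucene 3.x it fails with a LockObtainFailedException while an IndexWriter is open. Delete through the same IndexWriter instead (IndexWriter.deleteDocuments), which can run concurrently with additions from other threads. An IndexReader always sees the index as it was at the time it was opened; changes made by the writer become visible only after the writer commits them and the reader is reopened.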
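The visibility rule above (commit first, then reopen) can be illustrated with a toy model. This is only a sketch built on plain java.util collections: ToyIndex and ToyReader are hypothetical names and are not Lucene classes, and the model ignores locking on disk, segments and everything else a real index does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: NOT Lucene's API. It mimics the "commit, then reopen"
// visibility rule with in-memory lists.
class ToyIndex {
    private final List<String> pending = new ArrayList<>();   // writer's uncommitted state
    private List<String> committed = Collections.emptyList(); // last commit point

    // Writer side: buffer an added document.
    synchronized void add(String doc) { pending.add(doc); }

    // Writer side: buffer a deletion.
    synchronized void delete(String doc) { pending.remove(doc); }

    // Writer side: publish the buffered state as a new immutable commit point.
    synchronized void commit() {
        committed = Collections.unmodifiableList(new ArrayList<>(pending));
    }

    // Reader side: open a point-in-time view of the last commit.
    synchronized ToyReader openReader() { return new ToyReader(committed); }
}

// A reader keeps the snapshot it was opened on and never sees later changes.
class ToyReader {
    private final List<String> snapshot;
    ToyReader(List<String> snapshot) { this.snapshot = snapshot; }
    boolean contains(String doc) { return snapshot.contains(doc); }
    int numDocs() { return snapshot.size(); }
}

class ToyIndexDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("doc-1");
        index.commit();
        ToyReader before = index.openReader();      // opened on commit 1

        index.add("doc-2");                          // not committed yet
        ToyReader uncommitted = index.openReader(); // still sees commit 1

        index.commit();
        ToyReader after = index.openReader();       // "reopened" after commit 2

        System.out.println(before.numDocs());       // 1
        System.out.println(uncommitted.numDocs());  // 1
        System.out.println(after.numDocs());        // 2
    }
}
```

The point of the model is that a reader opened earlier keeps returning the old document count no matter what the writer does afterwards; only a reader opened after the commit sees the new state.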

Any number of IndexSearchers can be opened, but again it's better to share one. They can be used even while the index is being modified. Visibility works the same way as for IndexReader: changes are not visible until the searcher is reopened.
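Sharing one searcher while still being able to reopen it usually comes down to keeping the current instance behind a single reference that is swapped atomically. Here is a minimal sketch of that idea using only java.util.concurrent. SharedView is a hypothetical helper, not a Lucene class, and the strings stand in for real searcher instances. It deliberately leaves out the hard part, which is closing the old instance only after every thread has finished with it; Lucene ships a SearcherManager class aimed at this pattern, so check the javadoc of your version before rolling your own.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder, not a Lucene class: one shared instance for all
// threads, replaced atomically when a fresh view has been opened.
class SharedView<T> {
    private final AtomicReference<T> current;

    SharedView(T initial) { current = new AtomicReference<>(initial); }

    // Called by any search thread; never blocks.
    T get() { return current.get(); }

    // Called after a commit: install the reopened view and hand back the old
    // one so the caller can release it once no thread is still using it.
    T swap(T reopened) { return current.getAndSet(reopened); }
}

class SharedViewDemo {
    public static void main(String[] args) throws InterruptedException {
        // Strings stand in for searcher instances in this sketch.
        SharedView<String> shared = new SharedView<>("view@commit-1");
        AtomicInteger reads = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                // Every thread uses the same shared view instead of opening its own.
                if (shared.get() != null) reads.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);

        // After the writer commits, a single thread installs the reopened view.
        String old = shared.swap("view@commit-2");
        System.out.println(reads.get() + " reads, replaced " + old + " with " + shared.get());
    }
}
```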

milan answered Nov 14 '22 17:11