
Is it mandatory to optimize the Lucene index after a write?

Currently I call the optimize method of the IndexWriter after each write completes. Because my data set is huge, optimizing the index takes a long time and needs extra disk space (about twice the actual index size). This concerns me because many documents are added to the index frequently.

So

  1. Is it OK to turn off optimize?
  2. What are the performance implications, i.e. how much slower is querying when the index is not optimized?

Cheers

RameshVel asked Oct 12 '10 06:10



1 Answer

The Lucene FAQ says:

What is index optimization and when should I use it?

The IndexWriter class supports an optimize() method that compacts the index database and speeds up queries. You may want to use this method after performing a complete indexing of your document set or after incremental updates of the index. If your incremental update adds documents frequently, you want to perform the optimization only once in a while to avoid the extra overhead of the optimization.

If I decide not to optimize the index, when will the deleted documents actually get deleted?

Documents that are deleted are marked as deleted. However, the space they consume in the index does not get reclaimed until the index is optimized. That space will also eventually be reclaimed as more documents are added to the index, even if the index does not get optimized.
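The FAQ's advice to optimize "only once in a while" can be sketched as a small counter that triggers the expensive step once every N incremental updates. This is only a minimal illustration, not Lucene API: PeriodicOptimizer, afterUpdate and the interval parameter are hypothetical names, and the Runnable stands in for whatever performs the optimization (for example a call to the writer's optimize() method, wrapped with its exception handling).

```java
/**
 * Hypothetical helper (not part of Lucene): runs an expensive maintenance
 * action only once every `interval` incremental updates.
 */
final class PeriodicOptimizer {
    private final int interval;            // updates between two optimizations
    private final Runnable optimizeAction; // stand-in for the real optimize call
    private int updatesSinceOptimize = 0;

    PeriodicOptimizer(int interval, Runnable optimizeAction) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.interval = interval;
        this.optimizeAction = optimizeAction;
    }

    /** Call after each incremental update; returns true if the action ran. */
    boolean afterUpdate() {
        updatesSinceOptimize++;
        if (updatesSinceOptimize < interval) {
            return false; // not due yet, skip the expensive step
        }
        optimizeAction.run();
        updatesSinceOptimize = 0; // start counting towards the next run
        return true;
    }
}
```

With an interval of 100, for instance, the optimization would run after every 100th batch of added documents instead of after every write. The right interval depends on how much query slowdown and unreclaimed space you can tolerate between runs, so treat any value as something to measure rather than a recommendation.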

cuh answered Nov 15 '22 12:11
