
Lucene performance

Could you please suggest steps to follow for improving Lucene performance, especially with large data (around 1 TB of PDF files to be indexed)?

KP. asked May 05 '09 13:05



1 Answer

  1. Read Scaling Lucene and Solr.
  2. Define your needs from Lucene. For example, since you are indexing PDFs: do you need to store the full text, only make it searchable, or neither?
  3. Run a small-scale experiment: index a few documents and see whether retrieval is good enough.
  4. Try to index the whole collection, applying the paper's tips for quick indexing and for indexing for retrieval speed. Is retrieval good enough? Is performance good enough?
  5. Iterate.
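
One way to use the small-scale experiment from step 3 before committing to step 4 is to time the pilot run and extrapolate to the full collection. The following is only a rough sketch and is not part of Lucene: `measure_pilot`, `estimate_hours`, and the `index_one` callback (which stands in for whatever code extracts the PDF text and adds it to your index) are hypothetical names. The estimate assumes throughput stays constant as the index grows, which is optimistic because segment merging and disk I/O usually slow things down at scale.

```python
import time
from typing import Callable, Iterable, Tuple


def measure_pilot(docs: Iterable[bytes],
                  index_one: Callable[[bytes], None]) -> Tuple[int, float]:
    """Index every sample document; return (bytes processed, elapsed seconds).

    index_one is a placeholder for your real indexing call.
    """
    total_bytes = 0
    start = time.perf_counter()
    for doc in docs:
        index_one(doc)
        total_bytes += len(doc)
    return total_bytes, time.perf_counter() - start


def estimate_hours(pilot_bytes: int, pilot_seconds: float,
                   corpus_bytes: int) -> float:
    """Linearly extrapolate pilot throughput to the full corpus (assumption)."""
    if pilot_bytes <= 0 or pilot_seconds <= 0:
        raise ValueError("pilot must process some data in measurable time")
    throughput = pilot_bytes / pilot_seconds  # bytes per second
    return corpus_bytes / throughput / 3600


# Made-up numbers: a 1 GB pilot that took 100 s, scaled to 1 TB.
print(round(estimate_hours(10**9, 100.0, 10**12), 1))  # prints 27.8
```

Re-running the pilot after each change (for example, storing versus not storing the full text) gives a comparable number for the "Iterate" step, as long as the same sample documents are used each time.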
Yuval F answered Oct 11 '22 21:10
