
Lucene - Validate completeness of index

We are using Lucene 5.5.5 to provide full-text search over our database content. We build the index after a database migration and use the near-real-time indexmanager to keep it up to date. However, the server is sometimes killed before the indexmanager has committed the index entries it still holds in memory.

To avoid rebuilding the index on every server startup, which is quite slow, I was wondering whether the index could be checked for completeness. I know there's the CheckIndex utility, but as far as I understand it can only check whether an index is corrupt, not whether it is complete.

Another option could be an indexer that doesn't fully rebuild but completes an already existing index.

What would be the best way to go about this? My goal is to waste as little time as possible on startup and have a complete index.

An obvious solution would be to not use the near-real-time indexmanager anymore I guess, but for now, I'd like to not consider that option.

Marcel asked Nov 04 '19 08:11


1 Answer

Indeed, the near-real-time indexmanager buffers modifications in memory, and as far as I know it is currently not possible to verify that everything has been flushed to the index.

So the solutions could be:

  1. Switch to the directory-based indexmanager (drawback: worse performance compared to the near-real-time indexmanager).

  2. Use a health-check service that monitors the state of your app and sets a flag isServerForciblyClosed in the db; if the flag is true, rebuild the index on the next startup. This service should be turned off for a planned shutdown.

  3. Create your own CustomIndexManager implementation, either extending the built-in classes or written completely from scratch against the IndexManager interface.
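
To illustrate option 2, here is a minimal sketch of the "was the server killed?" check. It is a simplification, not the exact setup described above: instead of a health-check service writing isServerForciblyClosed to the db, it uses a marker file that exists while the server is running and is removed only on a planned shutdown. The class name UncleanShutdownMarker, its methods, and the marker file are all hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: tracks whether the previous run ended without a clean shutdown. */
final class UncleanShutdownMarker {
    private final Path marker;

    UncleanShutdownMarker(Path marker) {
        this.marker = marker;
    }

    /**
     * Call once at startup, before any indexing work begins.
     * Returns true if a marker from the previous run is still present,
     * meaning that run was killed and the index may be missing uncommitted entries.
     * In either case the marker exists afterwards, so the current run counts as "running".
     */
    boolean startAndCheckRebuildNeeded() throws IOException {
        boolean leftover = Files.exists(marker);
        if (!leftover) {
            Files.createFile(marker);
        }
        return leftover;
    }

    /** Call on planned shutdown, after the index manager has flushed and closed. */
    void markCleanShutdown() throws IOException {
        Files.deleteIfExists(marker);
    }
}
```

At startup you would call startAndCheckRebuildNeeded() and, if it returns true, run whatever rebuild routine you already use after a migration. The very first start returns false, so this assumes the index is built separately at that point. The same logic carries over to a db flag if you prefer to keep the state there.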

Mike Adamenko answered Oct 21 '22 16:10