
Indexing data in Hibernate Search

I just started integrating Hibernate Search with my Hibernate application. The data is indexed by using Hibernate Session every time I start the server.

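// Wrap the regular Hibernate Session so it can write to the Lucene index
FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();

// Load every Book and (re)index it one by one
@SuppressWarnings("unchecked")
List<Book> books = session.createQuery("from Book as book").list();
for (Book book : books) {
    fullTextSession.index(book);
}

tx.commit(); // index is written at commit time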

It is very awkward and the server takes 10 minutes to start. Am I doing this the right way?

I wrote a scheduler which will update the indexes periodically. Will this update the existing index entries automatically, or create duplicate entries?

Shashi asked Dec 22 '22 11:12


2 Answers

As detailed in the Hibernate Search guide, section 3.6.1, if you are using annotations (by now the default), the listeners that trigger indexing when an entity is stored are registered by default:

Hibernate Search is enabled out of the box when using Hibernate Annotations or Hibernate EntityManager. If, for some reason you need to disable it, set hibernate.search.autoregister_listeners to false.

An example of how to register them by hand:

 hibConfiguration.setListener("post-update", new FullTextIndexEventListener());
 hibConfiguration.setListener("post-insert", new FullTextIndexEventListener());
 hibConfiguration.setListener("post-delete", new FullTextIndexEventListener());

All you need to do is annotate the entities which you want to be indexed with the

@Indexed(index = "fulltext")

annotation, and then do the fine-grained annotation on the fields, as detailed in the user guide.

So you should neither launch indexing by hand when storing, nor relaunch indexing when the application starts, unless you have entities that were stored before indexing was enabled.

You may run into performance problems when storing an object that has, say, an "attachment", because the attachment is then indexed within the same transaction that stores the entity. See here:

Hibernate Search and offline text extraction

for a way to solve this problem.

Pietro Polsinelli answered Dec 31 '22 21:12


Provided you are using an FSDirectoryProvider (which is the default), the Lucene index is persisted on disk. This means there is no need to index on every startup. If you have an existing database, you will of course want to create an initial index using the fullTextSession.index() functionality. However, this should not happen on application startup. Consider exposing some sort of trigger URL or admin interface instead. Once you have the initial index, I recommend using automatic indexing. This means that the Lucene index is updated automatically whenever a book is created, updated, or deleted. Automatic indexing should also be enabled by default.
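For the one-off initial index, a common pattern is to work through the entities in batches, so that pending index work is flushed and the session is cleared regularly instead of everything being held until one large commit. The following is only a minimal sketch of that loop. To keep it runnable without Hibernate on the classpath, the session is hidden behind a hypothetical IndexSink interface; BatchIndexer, IndexSink and indexInBatches are illustrative names, not part of Hibernate Search. In a real application an adapter would presumably delegate to FullTextSession.index(), FullTextSession.flushToIndexes() and Session.clear(), assuming your Hibernate Search version provides them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class BatchIndexer {

    /**
     * Hypothetical stand-in for the three session calls the loop needs.
     * Assumption: a real adapter would forward to FullTextSession.index(),
     * FullTextSession.flushToIndexes() and Session.clear().
     */
    public interface IndexSink<T> {
        void index(T entity);
        void flushToIndexes();
        void clear();
    }

    /** Indexes every entity, flushing and clearing after each batch. Returns the number indexed. */
    public static <T> int indexInBatches(Iterator<T> entities, IndexSink<T> sink, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int count = 0;
        while (entities.hasNext()) {
            sink.index(entities.next());
            count++;
            if (count % batchSize == 0) {
                sink.flushToIndexes(); // push pending index work to the index
                sink.clear();          // release entities held in memory
            }
        }
        if (count % batchSize != 0) {
            // Flush whatever is left over from the last, partial batch
            sink.flushToIndexes();
            sink.clear();
        }
        return count;
    }

    public static void main(String[] args) {
        // Recording sink used here only to show the order of calls
        final List<String> log = new ArrayList<>();
        IndexSink<String> recording = new IndexSink<String>() {
            public void index(String entity) { log.add("index:" + entity); }
            public void flushToIndexes() { log.add("flush"); }
            public void clear() { log.add("clear"); }
        };
        int indexed = indexInBatches(Arrays.asList("a", "b", "c").iterator(), recording, 2);
        System.out.println(indexed + " " + log);
    }
}
```

A method like indexInBatches could be called from the admin action or trigger URL mentioned above, rather than from the startup code.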

I recommend you refer to the automatic and manual indexing sections in the online manual - http://docs.jboss.org/hibernate/stable/search/reference/en/html_single

--Hardy

Hardy answered Dec 31 '22 21:12