I am using the NHibernate.Search assembly and am looking for best practices around using this with multiple web servers.
We have plenty of space on our web servers to handle the indexes we are generating, so I am thinking the best route is to keep a copy of the indexes on each web server and add a version column to the classes I am indexing. My only question is: if I do this, will NHibernate.Search be smart enough to pull the latest record and index it if, say, Web Server A updated the record and Web Server B's index is out of date?
The other option is to store the indexes on a shared file location and pull from that network resource. This seems like a less-than-ideal solution as it does not allow for great redundancy.
How are others solving this problem with NHibernate.Search and/or Lucene.NET indexes?
The very moment you decided to put your indexes on different machines, you introduced the "distributed search" problem. Replication, redundancy, management, monitoring, and search aggregation become important and interesting problems you need to address.
That said, Solr is one of the recommended solutions to the problem. Moreover, SolrNet can help you integrate it with NHibernate.
I've used both projects in combination with NHibernate. It may be a little confusing at the beginning, but it pays off later on.
In your case, you could potentially run Solr on your web servers.
We use the Master/Slave approach that is supported by NHibernate Search and Lucene.NET.
Each web server has a slave copy of the index and does no indexing.
Every time a web server updates something, it sends a message to a backend service (we use Rhino ServiceBus with MSMQ) that does the indexing by loading the updated object and re-indexing it.
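To make the message flow concrete, here is a minimal, language-agnostic sketch in Python. It is not NHibernate Search or Rhino ServiceBus code: `queue.Queue` stands in for the message bus, and the `database` and `master_index` dicts and both function names are hypothetical. The point it illustrates is that the web server sends only the entity id, and the backend reloads the entity before re-indexing it.

```python
import queue

# Hypothetical stand-ins: a dict for the database, a dict for the master
# index, and queue.Queue in place of the real message bus.
database = {1: "first draft"}
master_index = {}
index_requests = queue.Queue()


def save_on_web_server(entity_id, text):
    """Web server: write to the database, then ask the backend to re-index."""
    database[entity_id] = text
    index_requests.put(entity_id)  # send only the id, not the data


def run_backend_indexer():
    """Backend: drain pending messages, reload each entity and re-index it."""
    processed = 0
    while True:
        try:
            entity_id = index_requests.get_nowait()
        except queue.Empty:
            return processed
        # Reload from the database so the index reflects the latest stored state.
        master_index[entity_id] = database[entity_id].lower().split()
        processed += 1


save_on_web_server(1, "Updated by Server A")
save_on_web_server(2, "Created by Server B")
processed_count = run_backend_indexer()
```

Because only the backend writes to the master index, the web servers never compete for the index write lock.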
Every 10 seconds the web server checks for a new version of the index and fetches it if needed. (We need up-to-date searches; a grace period of as much as 30 minutes is common practice.) It works pretty well because the changes are incremental, so pulling the full index is only needed after an optimization or a total re-index.
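The polling step could look roughly like the Python sketch below. It is an illustration under assumptions, not how NHibernate Search implements it: the `index.version` marker file and the `sync_slave` function are hypothetical, and the file-size comparison is only a crude change test. It relies on the index being a set of files where an update mostly adds new files, so a normal sync copies very little.

```python
import shutil
import tempfile
from pathlib import Path

VERSION_FILE = "index.version"  # hypothetical marker written by the master


def read_version(index_dir):
    marker = index_dir / VERSION_FILE
    return marker.read_text() if marker.exists() else None


def sync_slave(master_dir, slave_dir):
    """Copy new or changed files from master to slave; return the copied names."""
    master_version = read_version(master_dir)
    if master_version is None or master_version == read_version(slave_dir):
        return []  # nothing published yet, or the slave is already up to date
    copied = []
    master_files = {p.name: p for p in master_dir.iterdir() if p.is_file()}
    for name, src in sorted(master_files.items()):
        if name == VERSION_FILE:
            continue
        dst = slave_dir / name
        # Crude change test for this sketch: missing file or different size.
        if not dst.exists() or dst.stat().st_size != src.stat().st_size:
            shutil.copy2(src, dst)
            copied.append(name)
    # Drop files the master no longer has (for example after an optimize).
    for stale in slave_dir.iterdir():
        if stale.is_file() and stale.name not in master_files:
            stale.unlink()
    # Copy the marker last so an interrupted sync is retried on the next poll.
    shutil.copy2(master_dir / VERSION_FILE, slave_dir / VERSION_FILE)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    master, slave = Path(tmp, "master"), Path(tmp, "slave")
    master.mkdir()
    slave.mkdir()
    (master / "_0.cfs").write_text("segment 0")
    (master / VERSION_FILE).write_text("1")
    first_sync = sync_slave(master, slave)   # initial pull

    (master / "_1.cfs").write_text("segment 1")
    (master / VERSION_FILE).write_text("2")
    second_sync = sync_slave(master, slave)  # incremental pull
    third_sync = sync_slave(master, slave)   # no change on the master
```

In a real deployment a timer would call something like `sync_slave` at the chosen interval (10 seconds in our case), and the searcher would be reopened after each successful sync.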
If you need better speed, you could optimize it by using a RAM-based index implementation on the web servers, but with quite complex wildcard searches on a 32 MB index we are still well below 10 ms for queries.
Another optimization would be to have the web server do the indexing, but only send the incremental copy to the backend to append to the master index. This would save the DB a call from the backend service, albeit at the cost of some complexity, as you have to go deep into the bowels of NHibernate Search/Lucene in order to do that.