
Is Lucene.Net suitable as the search engine for frequently changing content?

Or more specifically, can anybody give a subjective opinion on how quickly Lucene.Net indexes can be updated? Any other approaches to searching frequently changing content would be welcome.

We’re developing a forum. Forum posts will be frequently added to the forum repository. We think we need these posts to be added to the Lucene index very quickly (<0.5s) so that they become available to search. There’ll be about 5 million posts in the repository initially. Assume the search engine is running on a non-exotic server (I know this is very vague!).

Other suggestions for searching frequently changing content are appreciated. The forum posts need to be searchable on a variable number of named tags (tag name and value must both match). A SQL-based approach (using the Toxi schema) isn’t giving us the performance we’d like.

Anthony Carroll asked Nov 07 '08 15:11


1 Answer

Our forums (http://episteme.arstechnica.com) use Lucene as the search backend, so it's doable. Posts aren't indexed quite as quickly as you'd like, but we could solve that by beefing up the indexing hardware and using a smarter caching strategy.

The general answer to this question is: it depends on your write/update pattern. Forums are relatively easy, since most content is new and existing content is updated less frequently.

For a forum, I'd recommend having an "archive" index and a "live" index. The live index might include posts from the last day, week, or year, while the archive index will include a large body of posts that probably won't ever be touched again. So when someone creates a new post, it will initially be indexed in the live index. At a later time, some batch job would reindex everything from the live index into the archive and then clear out the live index.

Lucene's very good at querying across multiple indexes. You should abuse that ability. :)
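
To make the live/archive idea concrete, here is a minimal sketch in Python. It does not use Lucene.Net at all: `ToyIndex` and `ForumSearch` are hypothetical names, and the toy in-memory inverted index only stands in for a real Lucene index. It assumes posts are plain text split on whitespace and that edits to already-archived posts are handled elsewhere.

```python
from collections import defaultdict


class ToyIndex:
    """Toy in-memory inverted index (a stand-in for a real Lucene index)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of post ids
        self.docs = {}                    # post id -> original text

    def add(self, post_id, text):
        self.docs[post_id] = text
        for term in text.lower().split():
            self.postings[term].add(post_id)

    def search(self, term):
        # .get avoids creating empty entries for unknown terms
        return set(self.postings.get(term.lower(), ()))

    def clear(self):
        self.postings.clear()
        self.docs.clear()


class ForumSearch:
    """Small 'live' index for new posts plus a large 'archive' index."""

    def __init__(self):
        self.live = ToyIndex()
        self.archive = ToyIndex()

    def index_post(self, post_id, text):
        # New posts only touch the small live index, so writes stay cheap.
        self.live.add(post_id, text)

    def search(self, term):
        # Query both indexes and merge the hits.
        return self.live.search(term) | self.archive.search(term)

    def roll_over(self):
        # Batch job: copy live posts into the archive, then empty the live index.
        for post_id, text in self.live.docs.items():
            self.archive.add(post_id, text)
        self.live.clear()


search = ForumSearch()
search.index_post(1, "Lucene indexing speed")
search.index_post(2, "Forum search with Lucene")
search.roll_over()                       # posts 1 and 2 move to the archive
search.index_post(3, "New post about Lucene")
print(sorted(search.search("lucene")))   # [1, 2, 3]
```

The point of the split is that the frequently written index stays small, while the big archive is only rewritten by the batch job. With real Lucene you'd lean on its own multi-index search support instead of merging result sets by hand.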

MrKurt answered Sep 19 '22 18:09