How to index and find numbers with Lucene.NET?

I've implemented full-text search for a web site using Lucene.NET (version 2.0). Indexing and searching work well, but I have one problem: if I use numbers (phone numbers, product numbers, etc.) as search terms, I don't get any matching documents.

I'm using the Lucene.Net.Analysis.SimpleAnalyzer class. I guess I have to change the analyzer and/or the tokenizer.

Any advice?

Thank you!

asked Nov 16 '08 17:11 by splattne



1 Answer

When you build a Lucene Document, you choose the indexing options for each field. For fields you don't want tokenized, select the Field.Index.UN_TOKENIZED option. The field value is then indexed as a single term without passing through the analyzer, which keeps your phone numbers and product numbers intact.
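To illustrate the difference, here is a minimal Python sketch of what ends up in the index for a tokenized versus an untokenized field. It is a toy model, not Lucene.NET code: `letter_tokens` and `index_terms` are hypothetical helpers, and the letters-only tokenizer is only a rough stand-in for what SimpleAnalyzer does.

```python
import re

def letter_tokens(text):
    """Rough stand-in for a letters-only tokenizer: runs of letters, lowercased."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]

def index_terms(value, tokenized):
    """Return the terms a toy index would store for one field value."""
    # Tokenized: the value is split by the analyzer.
    # Untokenized: the whole value is kept as one term.
    return letter_tokens(value) if tokenized else [value]

product_number = "AB-1234"
tokenized_terms = index_terms(product_number, tokenized=True)
untokenized_terms = index_terms(product_number, tokenized=False)

print(tokenized_terms)    # ['ab']
print(untokenized_terms)  # ['AB-1234']
```

Note that an untokenized field only matches when the query term equals the stored value exactly, so this suits identifiers more than free text.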

I would also advise using the StandardAnalyzer, as it doesn't strip numbers out the way SimpleAnalyzer does.
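The effect can be sketched in Python with two toy tokenizers. These are assumptions for illustration only: `letters_only` approximates a tokenizer that splits on anything that is not a letter, and `letters_and_digits` approximates one that keeps digits. The real StandardAnalyzer does more than this (stop-word removal, for example), so treat the output as a rough picture.

```python
import re

def letters_only(text):
    """Toy model: keep runs of letters, drop digits and punctuation."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]

def letters_and_digits(text):
    """Toy model: keep runs of letters and digits together."""
    return [t.lower() for t in re.findall(r"[^\W_]+", text)]

text = "Call 5551234 about product X200"
simple_like = letters_only(text)
standard_like = letters_and_digits(text)

print(simple_like)    # ['call', 'about', 'product', 'x']
print(standard_like)  # ['call', '5551234', 'about', 'product', 'x200']
```

With the letters-only model the phone number never reaches the index, which is why a search for it finds nothing.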

It is also important to use the same analyzer for both indexing and searching; otherwise the query terms may not match the terms stored in the index.
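A small Python sketch of an analyzer mismatch, using a toy inverted index rather than Lucene.NET itself. All names here (`build_index`, `search`, and the two tokenizers) are hypothetical and only model the idea.

```python
import re

def letters_only(text):
    """Toy model of a tokenizer that drops digits."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]

def letters_and_digits(text):
    """Toy model of a tokenizer that keeps digits."""
    return [t.lower() for t in re.findall(r"[^\W_]+", text)]

def build_index(docs, analyze):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index

def search(index, query, analyze):
    """Return ids of documents containing every analyzed query term."""
    terms = analyze(query)
    if not terms:
        return set()  # the analyzer removed the whole query
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

docs = {1: "Support line 5551234", 2: "Product X200 manual"}
index = build_index(docs, letters_and_digits)

matched = search(index, "5551234", letters_and_digits)
mismatched = search(index, "5551234", letters_only)

print(matched)     # {1}
print(mismatched)  # set()
```

The index contains the number in both cases; the second search fails only because the query-side analyzer discards it before lookup.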

answered Oct 02 '22 04:10 by Andrew Rimmer
