
Searching hyphenated words with Lucene

Tags:

lucene

I want Lucene to treat a hyphenated word, e.g. energy-efficient or "energy-efficient", as one single word when searching.

At the moment, if the input is energy-efficient, the tokenizer generates terms such as energy, efficient, energy efficient, and energy-efficient.

As a result, Lucene returns pages containing both "energy efficient" and "energy-efficient", but I want it to return only pages containing energy-efficient.

So the question is: how can I modify the StandardTokenizer so that it searches for energy-efficient as one whole word instead of breaking it into separate words?

Madhura asked Aug 31 '10 20:08


1 Answer

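Use WhitespaceAnalyzer instead of StandardAnalyzer.
It generates tokens by splitting only on whitespace, so energy-efficient stays a single term. Check what else changes as a result, though: WhitespaceAnalyzer does not lowercase terms or strip punctuation, so matching becomes case-sensitive and a token like "efficient," keeps its trailing comma.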
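
To see what the switch changes, here is a minimal plain-Java sketch that imitates the two splitting behaviours. It does not use Lucene at all: the class TokenizeSketch and the methods whitespaceTokens and wordTokens are hypothetical names made up for illustration, and wordTokens is only a rough approximation of the word-level splitting described in the question, not Lucene's actual tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; none of these names come from Lucene.
public class TokenizeSketch {

    // Imitates whitespace-only analysis: split on runs of whitespace
    // and leave case, hyphens and punctuation untouched.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Rough approximation (an assumption, not Lucene's rules) of word-level
    // analysis: split on anything that is not a letter or digit, then lowercase.
    static List<String> wordTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Energy-efficient homes, not energy efficient ones";
        // Hyphenated word survives as one token; case and the comma survive too.
        System.out.println(whitespaceTokens(text));
        // Hyphenated word is broken into two lowercase tokens.
        System.out.println(wordTokens(text));
    }
}
```

With whitespace-only splitting the hyphenated word stays intact, but so do the capital "E" and the comma after "homes", which is the kind of side effect worth checking for. Whichever analyzer you choose should normally be used both when indexing and when parsing queries, otherwise the indexed terms and the query terms will not line up.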

KaKa answered Nov 01 '22 14:11
