I want Lucene to treat hyphenated words such as energy-efficient (or "energy-efficient" in quotes) as a single term.
Currently, if the input is energy-efficient, the tokenizer generates terms like energy, efficient, energy efficient, and energy-efficient.
As a result, Lucene returns pages containing both "energy efficient" and "energy-efficient", but I want it to return only pages containing energy-efficient.
So the question is: how can I modify the StandardTokenizer so that it treats energy-efficient as one whole word instead of breaking it into separate terms?
Use WhitespaceAnalyzer instead of StandardAnalyzer. It generates tokens by splitting only on whitespace, so energy-efficient stays a single token. Be aware of what else changes, though: WhitespaceAnalyzer does no lowercasing and no stop-word removal, so matching becomes case-sensitive, and punctuation adjacent to words stays attached to the tokens.
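Here is a minimal sketch comparing the two analyzers' output, assuming a recent Lucene version (5.x or later, where the version-less constructors are available); the field name "content" and the class name HyphenTokenDemo are just placeholders for illustration:

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class HyphenTokenDemo {

    // Print the tokens an analyzer produces for the given text.
    static void printTokens(Analyzer analyzer, String text) throws Exception {
        try (TokenStream stream = analyzer.tokenStream("content", text)) {
            CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
            stream.reset();
            while (stream.incrementToken()) {
                System.out.print("[" + term.toString() + "] ");
            }
            stream.end();
            System.out.println();
        }
    }

    public static void main(String[] args) throws Exception {
        String text = "energy-efficient homes";

        // StandardAnalyzer splits on the hyphen: [energy] [efficient] [homes]
        printTokens(new StandardAnalyzer(), text);

        // WhitespaceAnalyzer splits only on whitespace: [energy-efficient] [homes]
        printTokens(new WhitespaceAnalyzer(), text);
    }
}
```

If you go this route, make sure you use the same analyzer at index time and at query time, otherwise the query terms will not match the indexed tokens.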