How do I make sure Lucene gives me back relevant search results when my input string contains terms like c++? Lucene seems to ignore the ++ characters.
Code details: When I execute this line, I get a blank search query.
queryField = multiFieldQueryParser.Parse(inpKeywords);
keywordsQuery.Add(queryField, BooleanClause.Occur.SHOULD);
And here is my custom analyzer:
public class CustomAnalyzer : Analyzer
{
    private static readonly WhitespaceAnalyzer whitespaceAnalyzer = new WhitespaceAnalyzer();

    public override TokenStream TokenStream(String fieldName, System.IO.TextReader reader)
    {
        TokenStream result = whitespaceAnalyzer.TokenStream(fieldName, reader);
        result = new StandardTokenizer(reader);
        result = new LowerCaseFilter(result);
        result = new StopFilter(result, stop_words);
        return result;
    }
}
And I'm executing the search query this way:
indexSearcher.Search(searchQuery, collector);
I did try queryField = multiFieldQueryParser.Parse(QueryParser.Escape(inpKeywords)); but it still does not work. Here is the query that gets executed and returns zero hits: "+(())"
Thanks.
Lucene supports single and multiple character wildcard searches within single terms (not within phrase queries). To perform a single character wildcard search use the "?" symbol. To perform a multiple character wildcard search use the "*" symbol.
Lucene supports escaping special characters that are part of the query syntax. To escape a special character, precede it with a backslash (\). For example, to search for the string “where?”, escape the question mark as follows: “where\?”
Lucene search is case-sensitive, but all input is usually lowercased when passing through QueryParser, so it feels like it is case-insensitive. (This is the case for the findBySimpleQuery() method.) In other words, don't lowercase your input before indexing, and don't lowercase your queries.
Since + is a special character, it needs to be escaped. The list of all characters that need to be escaped is here (see the bottom of the page).
You also need to be careful about the analyzer you use while indexing. For example, StandardAnalyzer will drop the + characters. You may need to use something like WhitespaceAnalyzer while indexing and searching, which will preserve special characters in the token stream. Keep in mind that you need to use the same analyzer while indexing and searching.
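As a rough illustration of that advice, here is a minimal sketch of an analyzer that keeps ++ intact. It assumes the same Lucene.NET Analyzer API the question's code uses (an override of TokenStream(fieldName, reader)); the class name is hypothetical. It splits on whitespace only and then lowercases, so "C++" should come out as the single token "c++". Note that in the question's CustomAnalyzer the whitespace stream is replaced by a StandardTokenizer on the next line, so the + characters are still dropped there.

```csharp
using System.IO;
using Lucene.Net.Analysis;

// Hypothetical analyzer name; a sketch, not the asker's final code.
public class WhitespaceLowerCaseAnalyzer : Analyzer
{
    public override TokenStream TokenStream(string fieldName, TextReader reader)
    {
        // Split on whitespace only, so characters such as '+' stay inside the token.
        TokenStream result = new WhitespaceTokenizer(reader);
        // Lowercase so that "C++" and "c++" produce the same token.
        result = new LowerCaseFilter(result);
        return result;
    }
}
```

The same analyzer instance (or at least the same analyzer type) would have to be passed both to the IndexWriter and to the MultiFieldQueryParser, and existing documents would need to be reindexed for the change to take effect.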
In addition to choosing the right analyzer, you can use QueryParser.Escape(string s) to ensure all special characters are properly escaped. Because this is a static function, you can use it even if you're using MultiFieldQueryParser.
For example, you can try something like this:
queryField = multiFieldQueryParser.Parse(QueryParser.Escape(inpKeywords));
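To make it clearer what escaping does to the input, here is a small stand-alone sketch that does not use Lucene at all. It is a hand-written approximation of backslash escaping, not the library's implementation; the class name and the exact set of special characters are assumptions modeled on the classic query syntax.

```csharp
using System;
using System.Text;

public static class QueryEscapeSketch
{
    // Assumed set of query-syntax characters; check the query syntax page for your version.
    private const string SpecialChars = "\\+-!():^[]\"{}~*?|&";

    // Puts a backslash in front of every character found in SpecialChars.
    public static string Escape(string input)
    {
        var sb = new StringBuilder(input.Length * 2);
        foreach (char c in input)
        {
            if (SpecialChars.IndexOf(c) >= 0)
            {
                sb.Append('\\');
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Escape("c++")); // prints c\+\+
    }
}
```

Escaping only gets the + characters past the query parser. The analyzer still runs on the resulting term afterwards, which is why an analyzer that strips + can still leave you with an empty query.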