Lucene queryparser with "/" in query criteria

When I try to search for something such as "workaround/fix" within Lucene, it throws this error:

org.apache.lucene.queryparser.classic.ParseException: Cannot parse 'workaround/fix': Lexical error at line 1, column 15.  Encountered: <EOF> after : "/fix"
    at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:131)
    at pi.lucengine.LucIndex.main(LucIndex.java:112)
Caused by: org.apache.lucene.queryparser.classic.TokenMgrError: Lexical error at line 1, column 15.  Encountered: <EOF> after : "/fix"
    at org.apache.lucene.queryparser.classic.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1133)
    at org.apache.lucene.queryparser.classic.QueryParser.jj_scan_token(QueryParser.java:599)
    at org.apache.lucene.queryparser.classic.QueryParser.jj_3R_2(QueryParser.java:482)
    at org.apache.lucene.queryparser.classic.QueryParser.jj_3_1(QueryParser.java:489)
    at org.apache.lucene.queryparser.classic.QueryParser.jj_2_1(QueryParser.java:475)
    at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:226)
    at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:181)
    at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:170)
    at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:121)

These are lines 111 and 112 of my code:

QueryParser parser = new QueryParser(Version.LUCENE_43, field, analyzer);
Query query = parser.parse(newLine);

What do I need to do to allow it to parse the "/"?

abitnew asked Jul 22 '13 22:07



2 Answers

The query parser interprets slashes as the beginning or end of a regex query (as of 4.0, see documentation here).

So, to incorporate slashes into the query, you will need to escape them by adding a backslash (\) before them.

You can handle escaping with QueryParser.escape(String).
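
As a rough illustration of the manual approach, here is a minimal sketch that uses only the Java standard library. The class name SlashEscaper and the method escapeSlashes are hypothetical names chosen for this example, and the sketch assumes the input is raw user text that has not been escaped already.

```java
public class SlashEscaper {

    // Hypothetical helper: puts a backslash in front of every '/' so the
    // classic query parser reads it as a literal character rather than as
    // a regex delimiter. Assumes the input has not been escaped already.
    public static String escapeSlashes(String raw) {
        return raw.replace("/", "\\/");
    }

    public static void main(String[] args) {
        String newLine = "workaround/fix";
        String escaped = escapeSlashes(newLine);
        System.out.println(escaped); // prints workaround\/fix

        // With Lucene on the classpath, the escaped string is what gets parsed:
        //   Query query = parser.parse(escaped);
        // Alternatively, QueryParser.escape(newLine) escapes the slash together
        // with the other reserved query-syntax characters.
    }
}
```

Escaping only the slash leaves the rest of the query syntax (wildcards, field prefixes, boolean operators) usable. QueryParser.escape(String) is the safer choice when the whole input should be treated as literal text.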

femtoRgon answered Oct 12 '22 23:10


I ran into a similar problem when using '/' in Lucene queries issued from the Elasticsearch Kibana dashboard. I escaped the '/' characters as the documentation describes and still had no success. I think this is related to the template bug reported here: https://github.com/elastic/kibana/issues/789. I'm not sure yet; I will update this answer once we have updated the Logstash components.

zayquan answered Oct 13 '22 01:10