Issue: how do we pass Hibernate Search a raw Lucene query string that covers both numeric and non-numeric fields?
Background: we recently upgraded to Hibernate Search 5.0, and many of our queries now fail because of a change in Hibernate Search's query handling (before the query reaches Lucene). The error is:
The specified query contains a string based sub query which targets the numeric encoded field(s)
In most cases, we use Lucene's text syntax along with a MultiFieldQueryParser to pass queries into Hibernate Search, because the queries we run are complex. Up until Hibernate Search 5.0, this worked quite well. After upgrading, Hibernate Search throws exceptions that prevent our app from running queries that used to work. We don't understand why the exceptions are thrown or what the best way forward is.
To track down the issue, I've reduced what works and what doesn't to the rawest form I could (this is built off Hibernate Search's QueryValidationTest).
Examples:
Given the following Entity class:
@Entity
@Indexed
public static class B {
    @Id
    @GeneratedValue
    private long id;
    @Field
    private long value;
    @Field
    private String text;
}
Test 1 (how we write queries for hibernate search: FAILURE):
QueryParser parser = new MultiFieldQueryParser(new String[]{"id","value","num"},new StandardAnalyzer());
Query query = parser.parse("+(value:1 text:test)");
FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery( query, B.class );
fullTextQuery.list();
results in:
org.hibernate.search.exception.SearchException: HSEARCH000233: The specified query '+(value:1 text:test)' contains a string based sub query which targets the numeric encoded field(s) 'value'. Check your query or try limiting the targeted entities.
at org.hibernate.search.query.engine.impl.LazyQueryState.validateQuery(LazyQueryState.java:163)
at org.hibernate.search.query.engine.impl.LazyQueryState.search(LazyQueryState.java:102)
at org.hibernate.search.query.engine.impl.QueryHits.updateTopDocs(QueryHits.java:227)
at org.hibernate.search.query.engine.impl.QueryHits.<init>(QueryHits.java:122)
at org.hibernate.search.query.engine.impl.QueryHits.<init>(QueryHits.java:94)
at org.hibernate.search.query.engine.impl.HSQueryImpl.getQueryHits(HSQueryImpl.java:436)
at org.hibernate.search.query.engine.impl.HSQueryImpl.queryEntityInfos(HSQueryImpl.java:257)
at org.hibernate.search.query.hibernate.impl.FullTextQueryImpl.list(FullTextQueryImpl.java:200)
at org.hibernate.search.test.query.validation.QueryValidationTest.testRawLuceneWithNumericValue(QueryValidationTest.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.hibernate.testing.junit4.ExtendedFrameworkMethod.invokeExplosively(ExtendedFrameworkMethod.java:62)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.hibernate.testing.junit4.FailureExpectedHandler.evaluate(FailureExpectedHandler.java:58)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.hibernate.testing.junit4.BeforeClassCallbackHandler.evaluate(BeforeClassCallbackHandler.java:43)
at org.hibernate.testing.junit4.AfterClassCallbackHandler.evaluate(AfterClassCallbackHandler.java:42)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Test 2: (Using a numeric range variation fails the same way: FAILURE):
QueryParser parser = new MultiFieldQueryParser(new String[]{"id","value","text"},new StandardAnalyzer());
Query query = parser.parse("+(value:[1 TO 1] text:test)");
FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery( query, B.class );
fullTextQuery.list();
Test 3: (using lucene Terms: SUCCESS)
TermQuery query = new TermQuery( new Term("text", "bar") );
TermQuery nq = new TermQuery( new Term("value", "1") );
BooleanQuery bq = new BooleanQuery();
bq.add(query, Occur.SHOULD);
bq.add(nq, Occur.SHOULD);
FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery( bq, B.class );
note: a full version of the test cases with tests that illustrate what we're seeing is here: https://github.com/abrin/hibernate-search/blob/3fdcc8229f0bfa00329b9d977172fd218d82cac2/orm/src/test/java/org/hibernate/search/test/query/validation/QueryValidationTest.java
thanks
First off, the reason for your problem is that, as of Search 5, numeric types are indexed as Lucene numeric fields (as opposed to string-based fields). Apart from the performance gains, this also makes it possible, for example, to sort on numeric fields without padding. The Search 5 documentation says the following:
Prior to Search 5, numeric field encoding was only chosen if explicitly requested via @NumericField. As of Search 5 this encoding is automatically chosen for numeric types. To avoid numeric encoding you can explicitly specify a non numeric field bridge via @Field.bridge or @FieldBridge. The package org.hibernate.search.bridge.builtin contains a set of bridges which encode numbers as strings, for example org.hibernate.search.bridge.builtin.IntegerBridge.
So, if you want to stick to your old behaviour, you need to make sure that your numeric values are still indexed as strings. In your example, value needs to be indexed with org.hibernate.search.bridge.builtin.LongBridge. You can achieve this with the @FieldBridge annotation (you can ignore the id case, since document ids are indexed as strings anyway):
@Field
@FieldBridge(impl = LongBridge.class)
private long value;
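One side effect of keeping the string encoding is worth checking before you rely on sorting or on range syntax such as value:[1 TO 10]: string-encoded numbers compare lexicographically, which is why padding was mentioned above. The following is a minimal, Lucene-free sketch in plain Java. It assumes the bridge writes the plain decimal form of the number (as Long.toString does), and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class: shows how numbers order once they are stored as strings.
class StringVsNumericOrder {

    // Encode each value as its plain decimal string, then sort the strings.
    static List<String> sortAsStrings(List<Long> values) {
        List<String> encoded = new ArrayList<>();
        for (Long v : values) {
            encoded.add(Long.toString(v));
        }
        Collections.sort(encoded); // lexicographic comparison
        return encoded;
    }

    // Same as above, but zero-pad to a fixed width first (non-negative values only).
    static List<String> sortAsPaddedStrings(List<Long> values, int width) {
        List<String> encoded = new ArrayList<>();
        for (Long v : values) {
            encoded.add(String.format("%0" + width + "d", v));
        }
        Collections.sort(encoded);
        return encoded;
    }

    public static void main(String[] args) {
        List<Long> values = Arrays.asList(9L, 10L, 2L);
        System.out.println(sortAsStrings(values));          // [10, 2, 9]
        System.out.println(sortAsPaddedStrings(values, 3)); // [002, 009, 010]
    }
}
```

Without padding, "10" sorts before "2", so a string range over unpadded values would not match the numeric range you probably intended.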
Some comments regarding your test scenarios:
Tests 1 and 2: the standard Lucene query parser never creates a NumericRangeQuery. If you still want to use a query parser, you need to provide your own subclass and handle numeric fields yourself. See also: How do I make the QueryParser in Lucene handle numeric ranges? Given value:[1 TO 1], the parser just creates a text/string range query.
Test 3: the query runs, but the value term is ignored. A TermQuery is string-based and won't find matches in a numerically encoded field. See also: Lucene 3.0.3 Numeric term query.
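The advice to subclass the query parser comes down to a per-field dispatch: numeric fields get a numeric range, and everything else falls through to the default string handling. The sketch below models that dispatch in plain Java so that it runs without Lucene on the classpath. The class name, the NUMERIC_FIELDS set, and the string descriptions it returns are all hypothetical stand-ins. In an actual subclass of Lucene 4.x's classic QueryParser, the same checks would likely sit in overrides of getRangeQuery and getFieldQuery and return NumericRangeQuery.newLongRange(...) for the numeric fields.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a numeric-aware parser; it returns descriptions instead of Lucene Query objects.
class NumericAwareDispatch {

    // Assumption: the set of fields that are indexed with numeric encoding.
    static final Set<String> NUMERIC_FIELDS = new HashSet<>(Arrays.asList("value"));

    // Stand-in for a getRangeQuery override: field:[lower TO upper].
    static String rangeClause(String field, String lower, String upper) {
        if (NUMERIC_FIELDS.contains(field)) {
            // Parse the bounds as numbers so the range is numeric, not lexicographic.
            long min = Long.parseLong(lower);
            long max = Long.parseLong(upper);
            return "numericRange(" + field + ", " + min + ", " + max + ")";
        }
        return "stringRange(" + field + ", " + lower + ", " + upper + ")";
    }

    // Stand-in for a getFieldQuery override: field:text.
    static String termClause(String field, String text) {
        if (NUMERIC_FIELDS.contains(field)) {
            // An exact numeric match is expressed as a range with equal bounds.
            return rangeClause(field, text, text);
        }
        return "stringTerm(" + field + ", " + text + ")";
    }

    public static void main(String[] args) {
        System.out.println(termClause("value", "1"));       // numericRange(value, 1, 1)
        System.out.println(rangeClause("value", "1", "1")); // numericRange(value, 1, 1)
        System.out.println(termClause("text", "test"));     // stringTerm(text, test)
    }
}
```

With a dispatch like this, both value:1 and value:[1 TO 1] from the tests above would end up as numeric queries, while text:test keeps going through the analyzer as before.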