Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Lucene 5 Sort problems (UninvertedReader and DocValues)

I am working on a Search Engine built in Lucene 5.2.1, but I am having trouble with Sort updated option for Search. I get an error while searching with the Sort option:

Exception in thread "main" java.lang.IllegalStateException: unexpected docvalues type NONE for field 'stars' (expected=NUMERIC). Use UninvertingReader or index with docvalues.
at org.apache.lucene.index.DocValues.checkField(DocValues.java:208)
at org.apache.lucene.index.DocValues.getNumeric(DocValues.java:227)
at org.apache.lucene.search.FieldComparator$NumericComparator.getNumericDocValues(FieldComparator.java:167)
at org.apache.lucene.search.FieldComparator$NumericComparator.doSetNextReader(FieldComparator.java:153)
at org.apache.lucene.search.SimpleFieldComparator.getLeafComparator(SimpleFieldComparator.java:36)
at org.apache.lucene.search.FieldValueHitQueue.getComparators(FieldValueHitQueue.java:183)
at org.apache.lucene.search.TopFieldCollector$NonScoringCollector.getLeafCollector(TopFieldCollector.java:141)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:762)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:485)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:694)
at org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:679)
at org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:621)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:527)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:577)
at SearchEngine.searchBusinessByCategory(SearchEngine.java:145)
at Tests.searchEngineTest(Tests.java:137)
at Main.main(Main.java:13)

I have indexed a document which contains a double field "stars" that I need to use to sort search results (the document with highest "stars" value needs to be on top).

Document doc = new Document();
doc.add(new DoubleField("stars", stars, Store.YES));

Then I implemented a Search command with Sort option on that "stars" field as follows:

SortField sortfield = new SortField("stars", SortField.Type.DOUBLE, true);
Sort sort = new Sort(sortfield);
mySearcher.search(query, maxdocs, sort);

I have found similar discussions online but they all talk about SolR (or sometimes Elastic Search) but they don't provide neither a Lucene snippet nor a more general implementative solution strategy for Lucene 5. For example:

Solr uses DocValues and falls back to wrapping with UninvertingReader, if user have not indexed them (with negative startup performance and memory effects). But in general, you should really enable DocValues for fields you want to sort on. (from http://grokbase.com/t/lucene/java-user/152q2jcgzz/lucene-4-x-5-illegalstateexception-while-sorting)

I see that I have to do something with UninvertingReader or DocValues. But what?

like image 400
mazerone Avatar asked Feb 09 '23 13:02

mazerone


2 Answers

You need to define a custom FieldType since the DoubleField constructor you are using does not store docvalues.

Ex:

private static final FieldType DOUBLE_FIELD_TYPE_STORED_SORTED = new FieldType();
static {
    DOUBLE_FIELD_TYPE_STORED_SORTED.setTokenized(true);
    DOUBLE_FIELD_TYPE_STORED_SORTED.setOmitNorms(true);
    DOUBLE_FIELD_TYPE_STORED_SORTED.setIndexOptions(IndexOptions.DOCS);
    DOUBLE_FIELD_TYPE_STORED_SORTED
        .setNumericType(FieldType.NumericType.DOUBLE);
    DOUBLE_FIELD_TYPE_STORED_SORTED.setStored(true);
    DOUBLE_FIELD_TYPE_STORED_SORTED.setDocValuesType(DocValuesType.NUMERIC);
    DOUBLE_FIELD_TYPE_STORED_SORTED.freeze();
}

And add to your doc using:

Document doc = new Document();
doc.add(new DoubleField("stars", stars, DOUBLE_FIELD_TYPE_STORED_SORTED));

When you use

new DoubleField("stars", stars, Stored.YES);

You are actually using this FieldType:

public static final FieldType TYPE_STORED = new FieldType();
static {
    TYPE_STORED.setTokenized(true);
    TYPE_STORED.setOmitNorms(true);
    TYPE_STORED.setIndexOptions(IndexOptions.DOCS);
    TYPE_STORED.setNumericType(FieldType.NumericType.DOUBLE);
    TYPE_STORED.setStored(true);
    TYPE_STORED.freeze();
}

which does not have DocValues.

like image 94
user1071777 Avatar answered Feb 23 '23 20:02

user1071777


For future reference, here is a shortcut for the answer given by @user1071777 :

private static final FieldType DOUBLE_FIELD_TYPE_STORED_SORTED = new FieldType(LongField.TYPE_STORED);

static {
    DOUBLE_FIELD_TYPE_STORED_SORTED.setDocValuesType(DocValuesType.NUMERIC);
    DOUBLE_FIELD_TYPE_STORED_SORTED.freeze();
}

The constructor which receives a fieldType will copy all of the properties already set on stored type.

like image 38
buftlica Avatar answered Feb 23 '23 20:02

buftlica