We are considering a schema with two multi-valued fields. Search is performed on the first field, but sorting should be done on the second field, using the corresponding value. E.g. if documents match because of the n-th value in the first field (where n may be different for each match), then they should be returned sorted by the n-th value in the second field.
Is that possible?
Background: each document has a list of similar documents (IDs) and a corresponding list of similarity scores (value between 0 and 1). Given ID 42, we need to return all similar documents (e.g. documents with 42 in the first field), sorted by their similarity to document 42.
Other schemas we are considering are:
This approach will not succeed: you can search on a multivalued field, but you cannot sort by one. This is pointed out in Sorting with Multivalued Field in Solr and stated in Solr's Wiki:
Sorting can be done on the "score" of the document, or on any multiValued="false" indexed="true" field provided that field is either non-tokenized (ie: has no Analyzer) or uses an Analyzer that only produces a single Term (ie: uses the KeywordTokenizer)
Update
As for the alternatives: since you need to find the similar documents for one given ID, why not create a second core with a schema like this:
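<fields>
  <!-- ID of the document that carries the similarity entry -->
  <field name="doc_id" type="int" indexed="true" stored="true" />
  <!-- ID of the document it is similar to -->
  <field name="similar_to_id" type="int" indexed="true" stored="true" />
  <!-- Score between 0 and 1; a numeric type sorts by value rather than lexicographically -->
  <field name="similarity" type="float" indexed="true" stored="true" />
</fields>
<types>
  <fieldType name="int" class="solr.TrieIntField"/>
  <fieldType name="float" class="solr.TrieFloatField"/>
</types>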
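To fill such a core, each entry of a document's two parallel lists would become its own single-valued record. The following is only a rough sketch of that flattening step in Python; the function name `flatten_similarities` and the input shape (a document ID plus two equally long lists) are assumptions made for illustration, not something prescribed by Solr.

```python
def flatten_similarities(doc_id, similar_ids, scores):
    """Turn one document's parallel lists into one record per (ID, score) pair.

    Hypothetical helper: the record keys mirror the field names of the
    second core's schema (doc_id, similar_to_id, similarity).
    """
    if len(similar_ids) != len(scores):
        raise ValueError("similar_ids and scores must have the same length")
    return [
        {"doc_id": doc_id, "similar_to_id": sid, "similarity": score}
        for sid, score in zip(similar_ids, scores)
    ]


# Example: document 7 lists documents 42 and 13 as similar.
records = flatten_similarities(7, [42, 13], [0.9, 0.4])
```

Each resulting record has exactly one value per field, which is what makes the `similarity` field sortable.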
Then you could run a second query after performing the actual search:
q=similar_to_id:42&sort=similarity desc
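As a hedged illustration, the request for that second query could be assembled in Python as sketched here. The host, port, and core name (`similarity`) in the base URL are assumptions, as is the helper name `build_similarity_query`. Note that Solr's standard query syntax matches a field with `field:value`, and the `sort` parameter takes an explicit direction (`asc` or `desc`).

```python
from urllib.parse import urlencode

# Assumed location of the second core; adjust host, port and core name.
SOLR_SELECT_URL = "http://localhost:8983/solr/similarity/select"


def build_similarity_query(doc_id, rows=10):
    """Build a select URL returning the documents most similar to doc_id."""
    params = {
        "q": f"similar_to_id:{int(doc_id)}",  # field:value match on the ID
        "sort": "similarity desc",            # most similar documents first
        "fl": "doc_id,similarity",            # only return the fields we need
        "rows": rows,
        "wt": "json",
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = build_similarity_query(42)
```

The returned `doc_id` values could then be used to fetch the full documents from the main core.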