How to filter values returned on a multivalued field in Solr

Tags: solr, solr4

I have a document with a field called uuids. This field is a list (multivalued) that can have up to 100k values per document.

I want to search for documents that match uuids that start with "5ff6115e" for instance. I can already do it successfully by using q=uuids:5ff6115e*:

http://localhost:8983/solr/test1/select?q=uuids%3A5ff6115e*&rows=1&fl=uuids&wt=json&indent=true

However, the returned document contains all 100k values for this field.

What I want is not only to filter the documents whose uuids field contains a value starting with this prefix, but also to filter the field values returned, so that I receive only the matching values in the response.

How to do that?

asked Nov 22 '25 01:11 by mvallebr


2 Answers

Use highlighting. @Jokin first mentioned it, and I feel this is the best answer that doesn't involve hacking on Solr. Try either the PostingsHighlighter or the FastVectorHighlighter, not the default/standard highlighter. Unfortunately, both of them internally execute a wildcard query against all UUIDs in this field. FVH has the opportunity internally to be smarter about that, but it's not implemented that way.
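As a rough sketch of what the highlighting request could look like, the Python below builds the query URL and pulls the matching values out of the parsed JSON response. The core URL and the uuids field come from the question; everything else is an assumption: a unique key field named id, a uuids field indexed with term vectors, positions and offsets (which FVH needs), the default `<em>` highlight markers, and the helper names build_highlight_url and matching_values, which are made up for illustration. Whether the prefix query is highlighted as expected depends on your Solr version and highlighter configuration, so treat it as a starting point.

```python
import re
from urllib.parse import urlencode

# Core URL taken from the question.
SOLR_SELECT = "http://localhost:8983/solr/test1/select"


def build_highlight_url(prefix, max_values=100):
    """Build a select URL that returns matching uuids as highlight snippets."""
    params = {
        "q": f"uuids:{prefix}*",
        "rows": 1,
        "fl": "id",               # assumption: unique key is "id"; avoids returning all uuids
        "wt": "json",
        "hl": "true",
        "hl.fl": "uuids",         # highlight only the multivalued field
        "hl.snippets": max_values,  # upper bound on returned matches per document
        "hl.useFastVectorHighlighter": "true",  # assumption: term vectors are enabled
    }
    return SOLR_SELECT + "?" + urlencode(params)


def matching_values(response, field="uuids"):
    """Collect highlighted values from a parsed response, stripping <em> markers."""
    values = []
    for per_doc in response.get("highlighting", {}).values():
        for snippet in per_doc.get(field, []):
            values.append(re.sub(r"</?em>", "", snippet))
    return values
```

The URL can be fetched with any HTTP client and the body decoded with json.loads before passing it to matching_values.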

Note: if it's within scope to write a little Java to add to Solr, the ideal answer would be to add term vectors (just the terms data in the term vector, no offsets/positions) and then write a "DocTransformer" to grab the term vector terms, seek to the prefix, then iterate on those that have that prefix. Pretty darned fast.
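The "seek to the prefix, then iterate" step is fast because the terms in a term vector are stored in sorted order. The sketch below shows only that idea on a plain sorted Python list; it is not the Lucene or Solr API, and values_with_prefix is a hypothetical helper name. A real DocTransformer would do the equivalent in Java against the document's term vector.

```python
from bisect import bisect_left


def values_with_prefix(sorted_terms, prefix):
    """Return every term starting with prefix from an already sorted list."""
    # Seek: binary search for the first term >= prefix.
    i = bisect_left(sorted_terms, prefix)
    matches = []
    # Iterate: matching terms are contiguous, so stop at the first non-match.
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches
```

With 100k values per document, this touches only the matching terms plus a logarithmic number of comparisons for the seek, rather than scanning the whole list.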

answered Nov 24 '25 21:11 by David Smiley


This is not currently possible; see this bug and this previous question.

answered Nov 24 '25 21:11 by beerbajay

