
Solr 6.6.2 Grouped Query

Tags: solr, grouping

Given the following setup on Solr 6.6.2:

A SolrCloud collection is up and running. Its documents have the fields ID, ContactId and Properties, with the unique key on id.

There can be multiple documents with the same ContactId.

Each contact document has a text field Properties containing a line of text. The Properties field is indexed with ',' as the separator, so that e.g. Properties:Green hits.

For example:

+----+-----------+--------------+
| ID | ContactId |  Properties  |
+----+-----------+--------------+
|  1 | C1        | Blue,Green   |
|  2 | C1        | Blue,Yellow  |
|  3 | C2        | Green,Yellow |
+----+-----------+--------------+

Now I need to find all ContactIds where Properties contains "Green" AND "Yellow", where the match is allowed to be spread across all documents of the same ContactId. In this case the result would be C1 and C2.

I tried grouping the results, but I am still not able to query on the grouped result.
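To pin down the expected semantics independently of Solr, here is a minimal Python sketch that collects the properties per ContactId across documents and keeps the contacts whose combined set contains every requested value. The `docs` list mirrors the example table; the function name `contacts_with_all` is made up for illustration.

```python
from collections import defaultdict

# The three example documents from the table above.
docs = [
    {"ID": 1, "ContactId": "C1", "Properties": "Blue,Green"},
    {"ID": 2, "ContactId": "C1", "Properties": "Blue,Yellow"},
    {"ID": 3, "ContactId": "C2", "Properties": "Green,Yellow"},
]


def contacts_with_all(docs, wanted):
    """Return sorted ContactIds whose documents, taken together, contain every wanted property."""
    seen = defaultdict(set)
    for doc in docs:
        # Split on ',' to mirror how the Properties field is tokenized.
        seen[doc["ContactId"]].update(p.strip() for p in doc["Properties"].split(","))
    return sorted(cid for cid, props in seen.items() if set(wanted) <= props)


print(contacts_with_all(docs, ["Green", "Yellow"]))  # ['C1', 'C2']
```

C1 qualifies only because its two documents are combined; neither document alone contains both values.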

group=true&group.field=ContactId&group.query=(Green AND Yellow)&q=(Green OR Yellow)

My idea was to use the query (q) to get all documents that have either Green OR Yellow, then group on the group.field ContactId, and afterwards apply the group.query with the AND condition Green AND Yellow. But that did not succeed.

In MySQL one would just do a

group_concat(Properties) as grouped 

and do a like over that string:

grouped LIKE '%Green%' AND grouped LIKE '%Yellow%'

How can I achieve this query on the Solr index?

So far I have tried the following as suggested, with and without quotes:

intersect(  
    search(w3, q=Properties:("Green"), fl="ContactId", sort="ContactId asc"),  
    search(w3, q=Properties:("Yellow"), fl="ContactId", sort="ContactId asc"),  
    on="ContactId" )

And this variant, derived from the Solr examples of intersect:

intersect(  
    search(w3, q=Properties:("Green" OR "Green" AND "Yellow"), fl="ContactId", sort="ContactId asc"),  
    search(w3, q=Properties:("Yellow" OR "Green" AND "Yellow"), fl="ContactId", sort="ContactId asc"),  
    on="ContactId" )

But I still only get results where both properties are inside the same document, not those where they are split over multiple documents of the same ContactId (only C2 in this case, but not C1).

gantners asked Nov 07 '22 13:11


1 Answer

You can do this by using a Streaming Expression, and fetching the documents contained in the intersection between both your queries (i.e. one query matches Yellow, one matches Green):

intersect(
  search(collection, q=Properties:Yellow, fl="ContactId", sort="ContactId asc"),
  search(collection, q=Properties:Green, fl="ContactId", sort="ContactId asc"),
  on="ContactId"
)
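As a rough mental model of what the expression above computes, the following Python sketch intersects two streams that are already sorted on the join key. It is a simplified illustration, not Solr's implementation; the function name `intersect_sorted` and the sample tuples (taken from the question's table) are assumptions, and how Solr itself treats duplicate keys is not claimed here.

```python
def intersect_sorted(left, right, on):
    """Emit tuples from `left` whose `on` value also occurs in `right`.

    Both inputs must already be sorted ascending by `on`.
    """
    result = []
    right_iter = iter(right)
    current = next(right_iter, None)
    for tup in left:
        # Advance the right stream until it catches up with the left key.
        while current is not None and current[on] < tup[on]:
            current = next(right_iter, None)
        if current is not None and current[on] == tup[on]:
            result.append(tup)
    return result


yellow = [{"ContactId": "C1"}, {"ContactId": "C2"}]  # from documents 2 and 3
green = [{"ContactId": "C1"}, {"ContactId": "C2"}]   # from documents 1 and 3
print(intersect_sorted(yellow, green, on="ContactId"))
```

Because each side is queried separately and only ContactId is compared, it does not matter which document of a contact supplied which property. The single forward pass is also why both searches need the same sort on ContactId.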

You give a Streaming Expression through the expr parameter to the /stream request handler. You can also test it directly (without expr=) under "Stream" in the Solr admin interface for your collection.
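A minimal sketch of how such a request might be assembled from Python, using only the standard library. The base URL `http://localhost:8983/solr` and the helper name `build_stream_request` are assumptions for illustration; the collection name is taken from the expression above. The function only builds the request and does not contact a server.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed values for illustration: a local Solr node and a collection named "collection".
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "collection"

EXPR = """intersect(
  search(collection, q=Properties:Yellow, fl="ContactId", sort="ContactId asc"),
  search(collection, q=Properties:Green, fl="ContactId", sort="ContactId asc"),
  on="ContactId"
)"""


def build_stream_request(expr, base=SOLR_BASE, collection=COLLECTION):
    """Build (but do not send) a POST request carrying the expression in `expr`."""
    body = urlencode({"expr": expr}).encode("utf-8")
    return Request(f"{base}/{collection}/stream", data=body, method="POST")


req = build_stream_request(EXPR)
print(req.full_url)
# To actually send it: urllib.request.urlopen(req).read()
```

Sending the expression as a form-encoded POST body avoids having to escape the quotes and parentheses by hand in a URL.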

Other than that, your MySQL example wouldn't really do the same, as it'd include any element that had the text present somewhere - so "Dark Green" would have given a false positive.
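The difference can be illustrated with a small Python sketch. The value "Dark Green,Yellow" is a hypothetical row, not one from the question's data; the substring test stands in for LIKE '%Green%', and the split-based test stands in for matching whole comma-separated values.

```python
# A hypothetical row whose only green-ish property is "Dark Green".
grouped = "Dark Green,Yellow"

# Equivalent of grouped LIKE '%Green%': plain substring containment.
substring_hit = "Green" in grouped

# Token-level check: split on ',' and compare whole values.
token_hit = "Green" in [p.strip() for p in grouped.split(",")]

print(substring_hit, token_hit)  # True False
```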

MatsLindh answered Nov 15 '22 08:11