
elasticsearch filtering by the size of a field that is an array

How can I filter documents that have a field which is an array and has more than N elements?

How can I filter documents that have a field which is an empty array?

Are facets the solution? If so, how?

eran asked Mar 21 '13 09:03



2 Answers

I would have a look at the script filter. The following filter should return only the documents that have more than 10 elements in the fieldname field, which is an array. Keep in mind that this could be expensive, since the script is evaluated against every candidate document in your index.

    "filter" : {
        "script" : {
            "script" : "doc['fieldname'].values.length > 10"
        }
    }

Regarding the second question: do you really have an empty array there? Or is it just an array field with no value? You can use the missing filter to get documents which have no value for a specific field:

    "filter" : {
        "missing" : { "field" : "user" }
    }

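The snippets in this answer are fragments, so it may help to see one placed in a complete request body. The following is a minimal Python sketch, assuming the `filtered` query of Elasticsearch 1.x as the wrapper; the wrapper and the helper name `wrap_filter` are assumptions for illustration and are not part of the answer.

```python
import json


def wrap_filter(filter_clause):
    """Wrap a filter clause in a full search body.

    Assumes the Elasticsearch 1.x `filtered` query; hypothetical helper.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": filter_clause,
            }
        }
    }


# The `missing` filter from the answer: documents with no value for `user`.
missing_user = {"missing": {"field": "user"}}
body = wrap_filter(missing_user)

# Serialize to the JSON that would be sent as the search request body.
print(json.dumps(body, indent=2))
```

Building the body as a dict and serializing it with `json.dumps` avoids hand-written JSON mistakes such as a missing comma.
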
Otherwise I guess you need to use scripting again, similarly to what I suggested above, just comparing against a different length. I'd put the length in the params section so that the script text is always the same and can be cached and reused by Elasticsearch:

    "filter" : {
        "script" : {
            "script" : "doc['fieldname'].values.length > params.param1",
            "params" : {
                "param1" : 10
            }
        }
    }
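
To illustrate why the parameter helps with caching, here is a minimal Python sketch that builds the same filter for two different thresholds. The script string is copied verbatim from the snippet above; how a parameter is referenced inside a script depends on the scripting language and Elasticsearch version, so treat that expression as an assumption. The helper name `length_filter` is hypothetical.

```python
import json

# Script text copied verbatim from the answer; it is identical for every call.
SCRIPT = "doc['fieldname'].values.length > params.param1"


def length_filter(threshold):
    """Build the script filter with the threshold passed as a parameter."""
    return {"script": {"script": SCRIPT, "params": {"param1": threshold}}}


# Only the `params` section differs between the two request fragments.
for n in (0, 10):
    print(json.dumps({"filter": length_filter(n)}))
```

Because the script text does not change between the two fragments, only the parameter value varies from request to request.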

javanna answered Nov 08 '22 06:11


javanna's answer is correct on Elasticsearch 1.3.x and earlier. Since 1.4, the default scripting language has changed from mvel to groovy, so the script needs a small adjustment.

To answer the OP's question:

On Elasticsearch 1.3.x and earlier, use this code:

    "filter" : {
        "script" : {
            "script" : "doc['fieldname'].values.length > 10"
        }
    }

On Elasticsearch 1.4.x and later, use this code:

    "filter" : {
        "script" : {
            "script" : "doc['fieldname'].values.size() > 10"
        }
    }

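The version split described above can be sketched in Python as a small helper that picks the accessor for you. This is only an illustration: the function names `size_script` and `script_filter` and the `(major, minor)` version tuple are hypothetical, and the 1.4 cut-off is taken from this answer.

```python
def size_script(field, n, version):
    """Return a script string matching documents where `field` has more than `n` values.

    `version` is a (major, minor) tuple such as (1, 3). Hypothetical helper.
    """
    # Per the answer: `.length` before 1.4 (mvel), `.size()` from 1.4 on (groovy).
    accessor = "length" if version < (1, 4) else "size()"
    return "doc['%s'].values.%s > %d" % (field, accessor, n)


def script_filter(field, n, version):
    """Wrap the script string in the filter fragment used in this thread."""
    return {"filter": {"script": {"script": size_script(field, n, version)}}}


print(size_script("fieldname", 10, (1, 3)))
print(size_script("fieldname", 10, (1, 4)))
```

Comparing version tuples rather than strings keeps the check correct for two-digit minor versions.
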
Additionally, on Elasticsearch 1.4.3 and later, dynamic scripting is disabled by default for security reasons, so you will need to enable it before these script filters will run. See: https://www.elastic.co/guide/en/elasticsearch/reference/1.4/modules-scripting.html

MicroAleX answered Nov 08 '22 06:11