
How to find all documents where the value of one field matches that of another

I have two fields within a document with the following mapping:

"field_a": {
    "type": "float"
},
"field_b": {
    "type": "float"
}

How can I find all documents where the value for field_a matches that of field_b? Is this possible with scripting disabled?

asked May 08 '15 14:05 by andrhamm


2 Answers

Basically, you need a script to do this. It may work even if scripting is disabled, because Lucene expressions are fully sandboxed:

GET /index/_search
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "script": {
                "lang": "expression",
                "script": "doc['field_a'].value == doc['field_b'].value"
              }
            }
          ]
        }
      }
    }
  }
}

answered Sep 18 '22 08:09 by Alcanzar


Is this possible with scripting disabled?

It depends on what you mean by having scripting disabled. If you are running Elasticsearch with the default settings on the latest 1.4 or 1.5 release (currently 1.4.5 and 1.5.2), then you can still use dynamic scripting, but it is limited to sandboxed languages.

Currently, the only built-in, sandboxed option is Lucene Expressions, which is not the default scripting language (Groovy is the default in both versions, but it is considered unsandboxed).

So, assuming that you have not manually disabled dynamic scripting, then you can still use Lucene Expressions for this purpose, given some caveats:

  1. Lucene Expressions only work with numeric types. In particular, they treat every value as a double.

    • This means that they cannot currently work with strings.
  2. You must explicitly specify "expression" as the script language.
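
To make the first caveat concrete for the float fields in the question: widening a 32-bit float to a double is exact, so two equal float values stay equal after the conversion. The following Python sketch illustrates only that arithmetic, not Elasticsearch itself; the helper name to_float32 is made up for illustration.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float,
    then widen the result back to a double. This mimics, as an
    assumption, a float field value being read as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

a = to_float32(0.1)
b = to_float32(0.1)

# Equal float values remain equal once both are widened to double.
same = (a == b)

# The widened float is not the same number as the double literal 0.1,
# so comparing a float field against a double constant can surprise.
differs_from_double = (a != 0.1)
```

Comparing two float fields with each other, as in the question, is therefore not affected by the conversion; comparing a float field against a literal constant could be.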

From there, it's pretty easy to create this script:

GET /my-index/my-type/_search
{
  "query" : {
    "filtered" : {
      "filter" : {
        "script" : {
          "script" : "doc[field_1].value == doc[field_2].value",
          "lang" : "expression",
          "params" : {
            "field_1" : "field_a",
            "field_2" : "field_b"
          }
        }
      }
    }
  }
}
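
As a rough sketch of how a client might assemble that request body for different pairs of fields, the Python below reproduces the script string from the query verbatim and only swaps the parameter values. The function name equal_fields_query is hypothetical; only the JSON shape comes from the query above.

```python
import json

def equal_fields_query(field_1: str, field_2: str) -> dict:
    """Build the search body shown above for one pair of fields.
    Hypothetical helper: the script text is fixed, only params vary."""
    return {
        "query": {
            "filtered": {
                "filter": {
                    "script": {
                        "script": "doc[field_1].value == doc[field_2].value",
                        "lang": "expression",
                        "params": {"field_1": field_1, "field_2": field_2},
                    }
                }
            }
        }
    }

# Same script, reused for the pair of fields from the question.
body = equal_fields_query("field_a", "field_b")
print(json.dumps(body, indent=2))
```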

I used "params" to show that you can reuse the same script for multiple fields. Moreover, you can reuse the same logic with Groovy through file-based scripting, which avoids dynamic scripting entirely. To do that, store a Groovy script (e.g., "equal_fields.groovy") as the link shows:

doc[field_1]?.value == doc[field_2]?.value

Then, you can reuse it in a very similar fashion:

GET /my-index/my-type/_search
{
  "query" : {
    "filtered" : {
      "filter" : {
        "script" : {
          "file" : "equal_fields",
          "params" : {
            "field_1" : "field_a",
            "field_2" : "field_b"
          }
        }
      }
    }
  }
}

Note: Long term, that key should be "script_file" rather than "file", but the naming is currently not consistent across the different APIs that accept scripts.
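
Because the key may be "file" or "script_file" depending on the API, a client could keep it configurable. This is a minimal Python sketch under that assumption; the helper name file_script_query and its script_key argument are hypothetical, and the JSON shape is taken from the request above.

```python
def file_script_query(script_name: str, params: dict, script_key: str = "file") -> dict:
    """Build a filtered query that refers to a stored script by name.
    script_key is the JSON key naming the script file ("file" here,
    possibly "script_file" elsewhere, per the note above)."""
    return {
        "query": {
            "filtered": {
                "filter": {
                    "script": {
                        script_key: script_name,
                        "params": dict(params),
                    }
                }
            }
        }
    }

fields = {"field_1": "field_a", "field_2": "field_b"}
current = file_script_query("equal_fields", fields)
long_term = file_script_query("equal_fields", fields, script_key="script_file")
```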

answered Sep 19 '22 08:09 by pickypg