How to specify ElasticSearch copy_to order?

Elasticsearch can copy values to other fields at index time, which lets you search multiple fields as if they were one field (Core Types: copy_to).

However, there doesn't seem to be any way to specify the order in which these values should be copied. This could be important when phrase matching:

curl -XDELETE 'http://10.11.12.13:9200/helloworld'
curl -XPUT 'http://10.11.12.13:9200/helloworld'
# copy_to is ordered alphabetically!
curl -XPUT 'http://10.11.12.13:9200/helloworld/_mapping/people' -d '
{
  "people": {
    "properties": {
      "last_name": {
        "type": "string",
        "copy_to": "full_name"
      },
      "first_name": {
        "type": "string",
        "copy_to": "full_name"
      },
      "state": {
        "type": "string"
      },
      "city": {
        "type": "string"
      },
      "full_name": {
        "type": "string"
      }
    }
  }
}
'

curl -X POST "10.11.12.13:9200/helloworld/people/dork" -d '{"first_name": "Jim", "last_name": "Bob", "state": "California", "city": "San Jose"}'
curl -X POST "10.11.12.13:9200/helloworld/people/face" -d '{"first_name": "Bob", "last_name": "Jim", "state": "California", "city": "San Jose"}'


curl "http://10.11.12.13:9200/helloworld/people/_search" -d '
{
  "query": {
    "match_phrase": {
      "full_name": {
        "query":    "Jim Bob"
      }
    }
  }
}
'

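Only the "Jim Bob" person (first_name "Jim", last_name "Bob") is returned; it seems that the values are copied in alphabetical order of the source field names, so first_name always lands before last_name in full_name.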

How would I switch the copy_to order such that the "Bob Jim" person would be returned?

Tim Harper asked Apr 09 '15 09:04


1 Answer

You can control this deterministically by registering a transform script in your mapping. The script builds full_name itself, so the word order no longer depends on copy_to.

Something like this:

"transform" : [
    {"script": "ctx._source['full_name'] = [ctx._source['first_name'] + ' ' + ctx._source['last_name'], ctx._source['last_name'] + ' ' + ctx._source['first_name']]"}
]
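
To show where the transform block sits, here is a rough sketch of the question's mapping with copy_to removed and the transform added at the type level. It assumes an Elasticsearch 1.x release that supports mapping transforms and a cluster that allows inline scripts. ES_URL, mapping_json and put_mapping are hypothetical names used only for illustration; the host and index are taken from the question.

```shell
#!/bin/sh
# Assumption: Elasticsearch 1.x with mapping transforms and inline scripts
# enabled. ES_URL is a placeholder defaulting to the host from the question.
ES_URL="${ES_URL:-http://10.11.12.13:9200}"

# Print the mapping body. There is no copy_to here: the transform script
# fills full_name with both word orders at index time.
mapping_json() {
  cat <<'EOF'
{
  "people": {
    "transform": [
      {"script": "ctx._source['full_name'] = [ctx._source['first_name'] + ' ' + ctx._source['last_name'], ctx._source['last_name'] + ' ' + ctx._source['first_name']]"}
    ],
    "properties": {
      "first_name": { "type": "string" },
      "last_name":  { "type": "string" },
      "state":      { "type": "string" },
      "city":       { "type": "string" },
      "full_name":  { "type": "string" }
    }
  }
}
EOF
}

# Send the mapping to the helloworld index (the index must already exist).
put_mapping() {
  mapping_json | curl -XPUT "$ES_URL/helloworld/_mapping/people" -d @-
}
```

Call put_mapping once the index exists, then index the two people again. With both orders present in full_name, the match_phrase query for "Jim Bob" from the question should match both documents.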

Transform scripts can also be "native", i.e. Java code. To use one, make your custom classes available on the Elasticsearch classpath of every node in the cluster and register them as native scripts with the setting:

script.native.<name>.type=<fully.qualified.class.name>

In that case, you'd register the native script as a transform in your mapping like so:

"transform" : [
    {
        "script" : "<name>",
        "params" : {
            "param1": "val1",
            "param2": "val2"
        },
        "lang": "native"
    }
],

Aditya Chadha answered Nov 13 '22 01:11