
ElasticSearch Painless script: How to iterate in an array of Nested Objects

I am trying to create a script using the script_score of the function_score. I have several documents whose rankings field is type="nested". The mapping for the field is:

"rankings": {
        "type": "nested",
        "properties": {
          "rank1": {
            "type": "long"
          },
          "rank2": {
            "type": "float"
          },
          "subject": {
            "type": "text"
          }
        }
      }

A sample document is:

"rankings": [
{
    "rank1": 1051,
    "rank2": 78.5,
    "subject": "s1"
},
{
    "rank1": 45,
    "rank2": 34.7,
    "subject": "s2"
}]

What I want to achieve is to iterate over the nested objects in rankings. Specifically, I need something like a for loop to find a particular subject and use its rank1 and rank2 values to compute something. So far I have tried the following, but it does not work (it throws a compile error):

"function_score": {
"script_score": {
    "script": {
        "lang": "painless",
        "inline": 
                 "sum = 0;"
                 "for (item in doc['rankings_cug']) {"
                     "sum = sum + doc['rankings_cug.rank1'].value;"
                 "}"
         }
    }
}

I have also tried the following options:

  1. A for loop using : instead of in: for (item:doc['rankings']), with no success.
  2. A for loop using in, but iterating over a specific field of the object, e.g. rank1: for (item in doc['rankings.rank1'].values). This compiles, but it seems to find a zero-length array for rank1.

I have read that the _source element is the one that can return JSON-like objects, but as far as I could find out, it is not supported in search queries.

Can you please give me some ideas of how to proceed with that?

Thanks a lot.

christinabo asked Feb 02 '17 18:02


2 Answers

You can access _source via params._source. The following works (note that it uses script_fields rather than function_score):

PUT /rankings/result/1?refresh
{
  "rankings": [
    {
      "rank1": 1051,
      "rank2": 78.5,
      "subject": "s1"
    },
    {
      "rank1": 45,
      "rank2": 34.7,
      "subject": "s2"
    }
  ]
}

POST rankings/_search
{
  "query": {
    "match": {
      "_id": "1"
    }
  },
  "script_fields": {
    "script_score": {
      "script": {
        "lang": "painless",
        "inline": "double sum = 0.0; for (item in params._source.rankings) { sum += item.rank2; } return sum;"
      }
    }
  }
}

DELETE rankings
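
The script above sums rank2 over every entry. The question also asks for picking out one particular subject, so the loop logic can be sketched in plain Python before translating it to Painless. This is only a model of the iteration, not Elasticsearch code: the function name score_for_subject and the rank1 + rank2 formula are assumptions, since the question does not say how the two ranks should be combined.

```python
def score_for_subject(source, subject):
    """Return rank1 + rank2 for the first ranking whose subject matches, else 0.0."""
    for item in source.get("rankings", []):
        if item["subject"] == subject:
            # Assumed formula: the question does not say how the ranks combine.
            return item["rank1"] + item["rank2"]
    # No entry with that subject was found.
    return 0.0


# Same sample document as in the question.
doc_source = {
    "rankings": [
        {"rank1": 1051, "rank2": 78.5, "subject": "s1"},
        {"rank1": 45, "rank2": 34.7, "subject": "s2"},
    ]
}

print(score_for_subject(doc_source, "s2"))
```

In a Painless version, the subject would presumably be passed in through the script's params rather than hard-coded.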

Rahul Singhai answered Sep 19 '22 18:09


Unfortunately, ElasticSearch scripting in general (including Painless) does not support accessing nested documents in this way. If you need to iterate across the rankings like this, consider a different structure for your mappings in which they are stored in multi-valued fields. Ultimately, the nested data will need to be denormalized and put into the parent documents in order to compute scores in the way described here.
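
One possible way to denormalize before indexing is sketched below in Python. It is an illustration under assumptions, not a prescribed layout: the helper name denormalize_rankings and the rank1_<subject> / rank2_<subject> field names are hypothetical. Per-subject fields are used instead of parallel arrays on the assumption that multi-valued fields may not keep each rank aligned with its subject when read back in a script.

```python
def denormalize_rankings(doc):
    """Copy each nested ranking into top-level, per-subject fields."""
    # Keep every field except the nested rankings array.
    flat = {key: value for key, value in doc.items() if key != "rankings"}
    for entry in doc.get("rankings", []):
        subject = entry["subject"]
        # Hypothetical field names: one pair of fields per subject.
        flat[f"rank1_{subject}"] = entry["rank1"]
        flat[f"rank2_{subject}"] = entry["rank2"]
    return flat


sample = {
    "rankings": [
        {"rank1": 1051, "rank2": 78.5, "subject": "s1"},
        {"rank1": 45, "rank2": 34.7, "subject": "s2"},
    ]
}

print(denormalize_rankings(sample))
```

A scoring script could then read a field such as rank1_s1 directly from the parent document, with no nested iteration.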

jdconrad answered Sep 19 '22 18:09