
Elasticsearch Prefix query not working on nested documents

I'm using a prefix query in Elasticsearch. It works fine on top-level fields, but when applied to nested data it returns no results. The data I'm trying to query looks like the document in the result below.

On a top-level field the prefix query works fine. Query:

{ "query": { "prefix" : { "duration": "7"} } }

Result:

{
   "took": 25, ... },
   "hits": {
      "total": 6,
      "max_score": 1,
      "hits": [
         {
        "_index": "itemresults",
        "_type": "itemresult",
        "_id": "ITEM_RESULT_7c8649c2-6cb0-487e-bb3c-c4bf0ad28a90_8bce0a3f-f951-4a01-94b5-b55dea1a2752_7c965241-ad0a-4a83-a400-0be84daab0a9_61",
        "_score": 1,
        "_source": {
           "score": 1,
           "studentId": "61",
           "timestamp": 1377399320017,
           "groupIdentifiers": {},
           "assessmentItemId": "7c965241-ad0a-4a83-a400-0be84daab0a9",
           "answered": true,
           "duration": "7.078",
           "metadata": {
              "Korrektur": "a",
              "Matrize12_13": "MA.1.B.1.d.1",
              "Kompetenz": "ZuV",
              "Zyklus": "Z2",
              "Schwierigkeit": "H",
              "Handlungsaspekt": "AuE",
              "Fach": "MA",
              "Aufgabentyp": "L"
           },
           "assessmentSessionId": "7c8649c2-6cb0-487e-bb3c-c4bf0ad28a90",
           "assessmentId": "8bce0a3f-f951-4a01-94b5-b55dea1a2752"
        }
     },

Applying the same prefix query to a field inside the nested structure 'metadata' doesn't return any results:

{ "query": { "prefix" : { "metadata.Fach": "M"} } }

Result:

{
   "took": 18,
   "timed_out": false,
   "_shards": {
      "total": 15,
      "successful": 15,
      "failed": 0
   },
   "hits": {
      "total": 0,
      "max_score": null,
      "hits": []
   }
}

What am I doing wrong? Is it possible at all to apply a prefix query to nested data?

asked Feb 16 '23 11:02 by paweloque


1 Answer

It doesn't depend on whether the field is nested or not. It depends on your mapping: whether the string is analyzed at index time or not.

Here is an example:

I've created an index with the following mapping:

curl -XPUT 'http://localhost:9200/test/' -d '
{
  "mappings": {
    "test": {
      "properties": {
        "text_1": {
          "type": "string",
          "index": "analyzed"
        },
        "text_2": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'

Basically 2 text fields, one analyzed and the other not_analyzed. Now I index the following document:

curl -XPUT 'http://localhost:9200/test/test/1' -d '
{
"text_1" : "Hello world",
"text_2" : "Hello world"
}'

text_1 query

Because text_1 is analyzed, Elasticsearch lowercases its terms at index time, among other things. The prefix query itself is not analyzed, so the uppercase prefix is compared against the lowercased terms and the following query finds no documents:

curl -XGET 'http://localhost:9200/test/test/_search?pretty=true' -d '
{ "query": { "prefix" : { "text_1": "H"} } }
'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

But if I use a lower-case prefix in the query, it matches:

curl -XGET 'http://localhost:9200/test/test/_search?pretty=true' -d '
{ "query": { "prefix" : { "text_1": "h"} } }
'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "test",
      "_type" : "test",
      "_id" : "1",
      "_score" : 1.0, "_source" :
{
"text_1" : "Hello world",
"text_2" : "Hello world"
}
    } ]
  }
}
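
The mismatch can be sketched outside Elasticsearch. The following Python snippet is only a rough stand-in, assuming an analyzer that splits on non-word characters and lowercases each token; the function names analyze and prefix_matches are hypothetical and not part of any Elasticsearch API.

```python
import re

def analyze(text):
    # Assumption: a simplified analyzer that splits on non-word
    # characters and lowercases every token.
    return [token.lower() for token in re.findall(r"\w+", text)]

def prefix_matches(terms, prefix):
    # The prefix is not analyzed: it is compared to the indexed
    # terms exactly as typed.
    return any(term.startswith(prefix) for term in terms)

terms = analyze("Hello world")
print(terms)                       # ['hello', 'world']
print(prefix_matches(terms, "H"))  # False
print(prefix_matches(terms, "h"))  # True
```

The indexed terms never contain an upper-case "H", which is why only the lower-case prefix finds the document.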

text_2 query

Because text_2 is not analyzed, its value is indexed unchanged, so the original query with the upper-case prefix matches:

curl -XGET 'http://localhost:9200/test/test/_search?pretty=true' -d '
{ "query": { "prefix" : { "text_2": "H"} } }
'
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "test",
      "_type" : "test",
      "_id" : "1",
      "_score" : 1.0, "_source" :
{
"text_1" : "Hello world",
"text_2" : "Hello world"
}
    } ]
  }
}
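
Applied to the question, this suggests lowercasing the prefix for metadata.Fach. The question does not show the mapping, so the sketch below assumes that field is analyzed with a lowercasing analyzer; prefix_query is a hypothetical helper that only builds the request body and does not talk to Elasticsearch.

```python
import json

def prefix_query(field, prefix, analyzed=True):
    # Assumption: analyzed fields hold lowercased terms, so the
    # prefix is lowercased to match them; not_analyzed fields keep
    # the original casing, so the prefix is left untouched.
    value = prefix.lower() if analyzed else prefix
    return {"query": {"prefix": {field: value}}}

print(json.dumps(prefix_query("metadata.Fach", "M")))
# {"query": {"prefix": {"metadata.Fach": "m"}}}
print(json.dumps(prefix_query("text_2", "H", analyzed=False)))
# {"query": {"prefix": {"text_2": "H"}}}
```

The first body can be sent with the same curl command used above. If the field turns out to be not_analyzed, the original upper-case prefix is the one to keep.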

answered Feb 18 '23 20:02 by moliware