Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Querystring search on array elements in Elastic Search

I'm trying to learn elasticsearch with a simple example application, that lists quotations associated with people. The example mapping might look like:

{ 
  "people" : {
    "properties" : {
      "name" : { "type" : "string"},
      "quotations" : { "type" : "string" }
    }
  }
}

Some example data might look like:

{ "name" : "Mr A",
  "quotations" : [ "quotation one, this and that and these"
                 , "quotation two, those and that"]
}

{ "name" : "Mr B",
  "quotations" : [ "quotation three, this and that"
                 , "quotation four, those and these"]
}

I would like to be able to use the querystring api on individual quotations, and return the people who match. For instance, I might want to find people who have a quotation that contains (this AND these) - which should return "Mr A" but not "Mr B", and so on. How can I achieve this?

EDIT1:

Andrei's answer below seems to work, with data values now looking like:

{"name":"Mr A","quotations":[{"value" : "quotation one, this and that and these"}, {"value" : "quotation two, those and that"}]}

However, I can't seem to get a query_string query to work. The following produces no results:

{
  "query": {
    "nested": {
      "path": "quotations",
      "query": {
        "query_string": {
            "default_field": "quotations",
            "query": "quotations.value:this AND these"
        }
      }
    }
  }
}

Is there a way to get a query_string query working with a nested object?

Edit2: Yes it is, see Andrei's answer.

like image 209
oneway Avatar asked Oct 08 '14 13:10

oneway


People also ask

How do you search in Elasticsearch?

You can use the search API to search and aggregate data stored in Elasticsearch data streams or indices. The API's query request body parameter accepts queries written in Query DSL. The following request searches my-index-000001 using a match query. This query matches documents with a user.id value of kimchy .

How do I search multiple fields in Elasticsearch?

One of the most common queries in elasticsearch is the match query, which works on a single field. And there's another query with the very same options that works also on multiple fields, called multi_match. These queries support text analysis and work really well.

What is bool query in Elasticsearch?

Boolean, or a bool query in Elasticsearch, is a type of search that allows you to combine conditions using Boolean conditions. Elasticsearch will search the document in the specified index and return all the records matching the combination of Boolean clauses.

Can we store an array in Elasticsearch?

In Elasticsearch, there is no dedicated array data type. Any field can contain zero or more values by default, however, all values in the array must be of the same data type.


2 Answers

For that requirement to be achieved, you need to look at nested objects, not to query a flattened list of values but individual values from that nested object. For example:

{
  "mappings": {
    "people": {
      "properties": {
        "name": {
          "type": "string"
        },
        "quotations": {
          "type": "nested",
          "properties": {
            "value": {
              "type": "string"
            }
          }
        }
      }
    }
  }
}

Values:

{"name":"Mr A","quotations":[{"value": "quotation one, this and that and these"}, {"value": "quotation two, those and that"}]}
{"name":"Mr B","quotations":[{"value": "quotation three, this and that"}, {"value": "quotation four, those and these"}]}

Query:

{
  "query": {
    "nested": {
      "path": "quotations",
      "query": {
        "bool": {
          "must": [
            { "match": {"quotations.value": "this"}},
            { "match": {"quotations.value": "these"}}
          ]
        }
      }
    }
  }
}
like image 70
Andrei Stefan Avatar answered Oct 16 '22 06:10

Andrei Stefan


Unfortunately there is no good way to do that. https://web.archive.org/web/20141021073225/http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/complex-core-fields.html

When you get a document back from Elasticsearch, any arrays will be in the same order as when you indexed the document. The _source field that you get back contains exactly the same JSON document that you indexed.

However, arrays are indexed — made searchable — as multi-value fields, which are unordered. At search time you can’t refer to “the first element” or “the last element”. Rather think of an array as a bag of values.

In other words, it is always considering all values in the array.

This will return only Mr A

{
  "query": {
    "match": {
      "quotations": {
        "query": "quotation one",
        "operator": "AND"
      }
    }
  }
}

But this will return both Mr A & Mr B:

{
  "query": {
    "match": {
      "quotations": {
        "query": "this these",
        "operator": "AND"
      }
    }
  }
}
like image 33
evanwong Avatar answered Oct 16 '22 05:10

evanwong