
ElasticSearch: Querying a field that's an array of objects

I've got data indexed using ElasticSearch, and I'm having trouble querying a particular field. A snippet of the JSON is as follows:

 {
   "_index": "indexName",
   "_type": "type",
   "_id": "00001",
   "color": "red",
   "place": "london",
   "person": [
     {
       "name": "john",
       "friends": [
         "mary",
         "jane"
       ]
     },
     {
       "name": "jack",
       "friends": [
         "lisa",
         "alex"
       ]
     }
   ]
 }

I need to query the index and pick out all records where one of the names inside person is "john".

I'm using Client.Search to do this, and I've had no trouble querying the fields that aren't nested (like color) by using:

 var searchResults = client.Search<People>(s => s
            .Index("indexName")
            .Type("type")
            .Query(q => q
                .Bool(b => b
                    .Must(
                        x => x.Match(m => m.OnField(p => p.color).Query("red")),
                        x => x.Match(m => m.OnField(p => p.place).Query("london"))))));

I've got People defined as follows:

public class People
{
    public string color {get; set; }
    public string place {get; set; }
    public List<Person> person {get; set; }
}
public class Person
{
    public string name {get; set; }
    // "friends" isn't here as I don't pull data from it
}

I'm unsure as to how to query on name as it's "inside" people - any help is greatly appreciated.

helencrump asked Sep 02 '15 16:09



2 Answers

You need to wrap the query in a nested query to get access to nested fields.

{
    "nested" : {
        "path" : "person",
        "query" : {
             "match" : {"person.name" : "john"}
        }
    }
}

Excerpt from the documentation:

The query is executed against the nested objects / docs as if they were indexed as separate docs (they are, internally) and resulting in the root parent doc (or parent nested mapping).

Basically, nested fields are stored internally as separate documents next to the original document (so they are quick to fetch). By default Elasticsearch doesn't load them, so you need to tell it explicitly that you want to access them. You could say nested fields are lazy ;)
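To make the difference concrete, here is a toy Python model of the two behaviours. It is only a sketch, not Elasticsearch code: the function names are made up for illustration, and the matching is a plain exact-value comparison. It assumes the documented behaviour that an array of objects mapped as a regular object field is flattened into multi-value fields, while a nested mapping keeps each object separate.

```python
# Toy model of how an array of objects is matched. Illustrative only.

doc = {
    "color": "red",
    "place": "london",
    "person": [
        {"name": "john", "friends": ["mary", "jane"]},
        {"name": "jack", "friends": ["lisa", "alex"]},
    ],
}


def flatten_object_field(doc, field):
    """Regular object mapping: merge the values of every object per sub-field."""
    flat = {}
    for obj in doc[field]:
        for key, value in obj.items():
            values = value if isinstance(value, list) else [value]
            flat.setdefault(f"{field}.{key}", []).extend(values)
    return flat


def matches_flattened(doc, field, conditions):
    """Each condition is checked against the merged values on its own."""
    flat = flatten_object_field(doc, field)
    return all(
        value in flat.get(f"{field}.{key}", [])
        for key, value in conditions.items()
    )


def matches_nested(doc, field, conditions):
    """Nested mapping: every condition must hold inside one and the same object."""
    def holds(obj, key, value):
        stored = obj.get(key)
        return value in stored if isinstance(stored, list) else stored == value

    return any(
        all(holds(obj, key, value) for key, value in conditions.items())
        for obj in doc[field]
    )


# A single condition on the name matches either way.
print(matches_flattened(doc, "person", {"name": "john"}))
print(matches_nested(doc, "person", {"name": "john"}))

# "john" and "lisa" live in different objects, so only the flattened
# model reports a match for the combined condition.
print(matches_flattened(doc, "person", {"name": "john", "friends": "lisa"}))
print(matches_nested(doc, "person", {"name": "john", "friends": "lisa"}))
```

For a query on the name alone the two models agree. The nested form only starts to matter once several conditions have to hold for the same person.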

Sorry, it's been a long time since I worked with .NET and LINQ, so I don't know the API, but you need to build something equivalent to the nested query above.

Edit. From github source and your code I think you need to:

var s = new SearchDescriptor<People>()
                .Query(ff=>ff
                    .Nested(n=>n
                        .Path(f=>f.person[0])
                        .Query(q=>q.Term(f=>f.person[0].name,"john"))
                    )
                );

Edit2. Did you try sending the query directly to the server with curl, or running it in the head plugin? Something like:

curl -XPOST 'http://localhost:9202/indexName/_search' -d '
{
  "query": {
    "nested": {
      "path": "person",
      "query": {
        "query_string": {
          "query": "person.name: john"
        }
      }
    }
  }
}'

This works on my cluster (with changed column names).

slawek answered Nov 14 '22 22:11


After a long while, I finally figured out that my data wasn't actually indexed as nested in the first place, and so simply adding

.Term("person.name", "john")

to my query worked perfectly.
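For anyone hitting the same thing, the choice between the two query shapes depends on the mapping. Below is a rough Python sketch that builds the JSON query body either way. The helper `name_query` and the two sample mappings are hypothetical; it assumes you pass in the "properties" section of your own index mapping, and the `string` type is just what a 2015-era mapping would typically show.

```python
import json


def name_query(properties, path="person", field="name", value="john"):
    """Build a term query on path.field, wrapped in "nested" only when needed.

    `properties` is assumed to be the "properties" section of the index mapping.
    """
    leaf = {"term": {f"{path}.{field}": value}}
    if properties.get(path, {}).get("type") == "nested":
        return {"nested": {"path": path, "query": leaf}}
    return leaf


# Hypothetical mappings: a regular object field versus a nested field.
object_mapping = {
    "person": {"properties": {"name": {"type": "string"}}},
}
nested_mapping = {
    "person": {"type": "nested", "properties": {"name": {"type": "string"}}},
}

print(json.dumps({"query": name_query(object_mapping)}, indent=2))
print(json.dumps({"query": name_query(nested_mapping)}, indent=2))
```

With the first mapping this produces the plain term query on person.name, which is the case I was in. With the second it produces the nested wrapper from the other answer.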

helencrump answered Nov 14 '22 22:11