Elasticsearch NEST 2 How to correctly map and use nested classes and bulk index

I have three main questions I need help answering.

  1. How do you correctly map and store a nested map?
  2. How do you search a nested part of a document?
  3. How do you bulk index?

I'm using NEST version 2 and have been looking over the new documentation. It has been useful for writing individual parts of the code, but unfortunately it doesn't explain how they fit together.

Here is the class I'm trying to map.

[ElasticsearchType(Name = "elasticsearchproduct", IdProperty = "ID")]
public class esProduct
{
    public int ID { get; set; }
    [Nested]
    public List<PriceList> PriceList { get; set; }
}

[ElasticsearchType(Name = "PriceList")]
public class PriceList
{
    public int ID { get; set; }
    public decimal Price { get; set; }
}

and my mapping code

var node = new Uri(HOST);
var settings = new ConnectionSettings(node).DefaultIndex("my-application");

var client = new ElasticClient(settings);
var map = new CreateIndexDescriptor("my-application")
    .Mappings(ms => ms
        .Map<esProduct>(m => m
            .AutoMap()
            .Properties(ps => ps
                .Nested<PriceList>(n => n
                    .Name(c => c.PriceList)
                    .AutoMap()
                )
            )
        )
    );

var response = client.Index(map);

This is the response I get:

Valid NEST response built from a succesful low level call on POST: /my-application/createindexdescriptor

So that seems to work. Next, indexing:

foreach (DataRow dr in ProductTest.Tables[0].Rows)
{
    int id = Convert.ToInt32(dr["ID"].ToString());
    List<PriceList> PriceList = new List<PriceList>();
    DataRow[] resultPrice = ProductPriceTest.Tables[0].Select("ID = " + id);

    foreach (DataRow drPrice in resultPrice)
    {
        PriceList.Add(new PriceList
        {
            ID = Convert.ToInt32(drPrice["ID"].ToString()),
            Price = Convert.ToDecimal(drPrice["Price"].ToString())
        });

        esProduct product = new esProduct
        {
            ID = id,
            PriceList = PriceList
        };

       var updateResponse = client.Update<esProduct>(DocumentPath<esProduct>.Id(id), descriptor => descriptor
                                    .Doc(product)
                                    .RetryOnConflict(3)
                                    .Refresh()
                );

       var index = client.Index(product);
    }
}

Again this seems to work, but when I come to search it doesn't seem to work as expected.

var searchResults = client.Search<esProduct>(s => s
                            .From(0)
                            .Size(10)
                                .Query(q => q
                                       .Nested(n => n
                                           .Path(p => p.PriceList)
                                            .Query(qq => qq
                                                .Term(t => t.PriceList.First().Price, 100)
                                                )
                                            )
                                      ));

It does return results but I was expecting

.Term(t => t.PriceList.First().Price, 100)

to look more like

.Term(t => t.Price, 100)

and to know that it was searching the nested PriceList class. Is this not the case?

In the new version 2 documentation I can't find the bulk index section. I tried using this code

var descriptor = new BulkDescriptor();

// Inside the foreach loop
descriptor.Index<esProduct>(op => op
    .Document(product)
    .Id(id)
);

// Outside the foreach loop
var result = client.Bulk(descriptor);

which does return a success response but when I search I get no results.

Any help would be appreciated.

UPDATE

After a bit more investigation on @Russ's advice, I think the error must be with my bulk indexing of a class with a nested object.

When I use

var index = client.Index(product);

to index each product I can use

var searchResults = client.Search<esProduct>(s => s
                    .From(0)
                    .Size(10)
                        .Query(q => q
                        .Nested(n => n
                            .Path(p => p.PriceList)
                            .Query(qq => qq
                                    .Term(t => t.PriceList.First().Price, 100)
                                )
                            )
                        )
                     );

to search and return results, but when I bulk index this no longer works, whereas

var searchResults = client.Search<esProduct>(s => s
                    .From(0)
                    .Size(10)
                    .Query(q => q
                            .Term(t => t.PriceList.First().Price, 100)
                          )
                     );

will work. That second query doesn't return results when I index each product individually. Does anyone know why this happens?

UPDATE 2

Following @Russ's suggestion, I have taken a look at the mapping.

The code I'm using to create the mapping is

var map = new CreateIndexDescriptor(defaultIndex)
    .Mappings(ms => ms
        .Map<esProduct>(m => m
            .AutoMap()
            .Properties(ps => ps
                .Nested<PriceList>(n => n
                    .Name(c => c.PriceList)
                    .AutoMap()
                )
            )
        )
    );

var response = client.Index(map);

Which is posting

http://HOST/fresh-application2/createindexdescriptor
{"mappings":{"elasticsearchproduct":{"properties":{"ID":{"type":"integer"},"priceList":{"type":"nested","properties":{"ID":{"type":"integer"},"Price":{"type":"double"}}}}}}}

and on the call to http://HOST/fresh-application2/_all/_mapping?pretty I'm getting

{
  "fresh-application2" : {
    "mappings" : {
      "createindexdescriptor" : {
        "properties" : {
          "mappings" : {
            "properties" : {
              "elasticsearchproduct" : {
                "properties" : {
                  "properties" : {
                    "properties" : {
                      "priceList" : {
                        "properties" : {
                          "properties" : {
                            "properties" : {
                              "ID" : {
                                "properties" : {
                                  "type" : {
                                    "type" : "string"
                                  }
                                }
                              },
                              "Price" : {
                                "properties" : {
                                  "type" : {
                                    "type" : "string"
                                  }
                                }
                              }
                            }
                          },
                          "type" : {
                            "type" : "string"
                          }
                        }
                      },
                      "ID" : {
                        "properties" : {
                          "type" : {
                            "type" : "string"
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

The mapping returned for fresh-application2 doesn't mention the nested type at all, which I'm guessing is the issue.

The mapping that my working nested query runs against looks more like this

{
  "my-application2" : {
    "mappings" : {
      "elasticsearchproduct" : {
        "properties" : {
          "priceList" : {
            "type" : "nested",
            "properties" : {
              "ID" : {
                "type" : "integer"
              },
              "Price" : {
                "type" : "double"
              }
            }
          },
          "ID" : {
            "type" : "integer"
          }
        }
      }
    }
  }
}

This one has the nested type returned. I think the mapping stopped returning nested as a type when I started using .AutoMap(). Am I using it correctly?

UPDATE 3

I have fixed my mapping problem. I have changed my mapping code to

var responseMap = client.Map<esProduct>(ms => ms
    .AutoMap()
    .Properties(ps => ps
        .Nested<PriceList>(n => n
            .Name(c => c.PriceList)
            .AutoMap()
        )
    )
);

Ben Close asked Mar 31 '16 10:03


1 Answer

Whilst you're developing, I would recommend logging the requests and responses to Elasticsearch so you can see what is being sent when using NEST. This makes it easier to relate your code to the main Elasticsearch documentation and lets you check that the bodies of the requests and responses match your expectations (useful for mappings, queries, etc.).
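
A minimal sketch of one way to set up that logging in NEST 2.x, using DisableDirectStreaming() and OnRequestCompleted() on ConnectionSettings. The class name LoggingClientSketch, the URI http://localhost:9200 and the index name are placeholders for illustration, not values from the question.

```csharp
using System;
using System.Text;
using Nest;

public static class LoggingClientSketch
{
    // The node URI and default index below are placeholder values.
    public static ElasticClient Create()
    {
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("my-application")
            // Buffer request and response bodies so the callback can read them.
            // This costs memory, so it is best kept to development.
            .DisableDirectStreaming()
            .PrettyJson()
            .OnRequestCompleted(details =>
            {
                // Print the HTTP method and URL, then the bodies when present.
                Console.WriteLine($"{details.HttpMethod} {details.Uri}");
                if (details.RequestBodyInBytes != null)
                    Console.WriteLine(Encoding.UTF8.GetString(details.RequestBodyInBytes));
                if (details.ResponseBodyInBytes != null)
                    Console.WriteLine(Encoding.UTF8.GetString(details.ResponseBodyInBytes));
            });

        return new ElasticClient(settings);
    }
}
```

With a client built this way, every mapping, index, bulk and search call prints the URL it hit and the JSON it sent, which can be compared directly with the Elasticsearch documentation.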

The mappings that you have look fine, although you can forgo the attributes since you are using fluent mapping. There's no harm in having them there, but they are largely superfluous in this case (the type name for esProduct is the only part that will apply), because .Properties() will override the inferred or attribute-based mapping that is applied by calling .AutoMap().
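
One thing worth checking is how the mapping is sent. The createindexdescriptor type in the mapping output under UPDATE 2 suggests that client.Index(map) stored the descriptor itself as a document instead of creating the index. A hedged sketch of passing the same fluent mapping to CreateIndex in NEST 2.x follows; the class CreateIndexSketch and the indexName parameter are illustrative names, and the two POCOs are copied from the question.

```csharp
using System.Collections.Generic;
using Nest;

public class PriceList
{
    public int ID { get; set; }
    public decimal Price { get; set; }
}

[ElasticsearchType(Name = "elasticsearchproduct", IdProperty = "ID")]
public class esProduct
{
    public int ID { get; set; }
    public List<PriceList> PriceList { get; set; }
}

public static class CreateIndexSketch
{
    // Creates the index and its mapping in one request.
    // Assumes the index does not exist yet; otherwise the response is not valid.
    public static ICreateIndexResponse Run(IElasticClient client, string indexName)
    {
        return client.CreateIndex(indexName, c => c
            .Mappings(ms => ms
                .Map<esProduct>(m => m
                    .AutoMap()
                    .Properties(ps => ps
                        // Explicitly map PriceList as a nested type.
                        .Nested<PriceList>(n => n
                            .Name(p => p.PriceList)
                            .AutoMap())))));
    }
}
```

For an index that already exists, client.Map<esProduct>(...) as shown in the question's last update is the equivalent call for adding the mapping.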

In your indexing code, you update the esProduct and then immediately index the same document again. I'm not sure what the intention is here, but the update call looks superfluous to me: the index call will overwrite the document with the given id straight after the update (and the result will be visible in search results after the refresh interval). The .RetryOnConflict(3) on the update uses optimistic concurrency control; an update is effectively a get followed by an index operation inside the cluster, and it will be retried up to 3 times if the version of the document changes between the get and the index. If you're replacing the whole document with an update, i.e. not performing a partial update, then retry on conflict is not really necessary.

The nested query looks correct. You specify the path to the nested type, and the query against a field on the nested type then also includes the path; that is why the expression is t => t.PriceList.First().Price rather than t => t.Price. I'll update the NEST nested query usage documentation to better demonstrate this.

The bulk call looks fine; if you need to index a lot of documents, you may want to send them in batches, e.g. bulk index 500 documents at a time. How many to send in one bulk call depends on a number of factors, including the document size, how it is analyzed and the performance of the cluster, so you will need to experiment to find a good bulk size for your circumstances.
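
A rough sketch of batching along those lines, assuming NEST 2.x and using IndexMany inside a bulk call. BulkBatchSketch and IndexInBatches are hypothetical names, TDocument stands in for esProduct, and 500 is only the example figure from above, not a tuned value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Nest;

public static class BulkBatchSketch
{
    // Sends the documents in slices of batchSize, one bulk request per slice.
    // Ids are inferred by NEST (for esProduct, from IdProperty = "ID").
    public static void IndexInBatches<TDocument>(
        IElasticClient client,
        IReadOnlyList<TDocument> documents,
        int batchSize = 500)
        where TDocument : class
    {
        for (var offset = 0; offset < documents.Count; offset += batchSize)
        {
            var batch = documents.Skip(offset).Take(batchSize).ToList();
            var response = client.Bulk(b => b.IndexMany(batch));

            // A bulk request can succeed overall while individual items fail,
            // so check the per-item errors as well as the response validity.
            if (!response.IsValid || response.Errors)
            {
                var failed = response.ItemsWithErrors.Count();
                Console.WriteLine($"Batch at offset {offset}: {failed} item(s) failed");
            }
        }
    }
}
```

Checking Errors on each response matters here because a bulk call that returns a success status does not guarantee that every document in it was indexed.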

I'd check that you are hitting the right index and that the index contains the number of documents you expect, then find a document that you know has a PriceList.Price of 100 and see what is indexed for it. It might be quicker to do this using Sense while you're getting up and running.
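
The same checks can also be sketched from NEST 2.x itself, assuming the client's default index is the one that was indexed into. IndexCheckSketch and Check are illustrative names, and the id passed in is whichever document you expect to exist.

```csharp
using System;
using Nest;

public static class IndexCheckSketch
{
    // Prints the document count for TDocument's type in the default index
    // and whether a document with the given id exists.
    public static void Check<TDocument>(IElasticClient client, string id)
        where TDocument : class
    {
        var countResponse = client.Count<TDocument>();
        Console.WriteLine($"Documents in index: {countResponse.Count}");

        var getResponse = client.Get<TDocument>(DocumentPath<TDocument>.Id(id));
        Console.WriteLine(getResponse.Found
            ? $"Document {id} found"
            : $"Document {id} not found");
    }
}
```

If the count is lower than expected or the document is not found, the problem is on the indexing side and not in the query.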

Russ Cam answered Oct 02 '22 09:10