
How to query for a specific document by _id using the Elasticsearch NEST client

I have a specific document that I want to retrieve. The id value was assigned by Elasticsearch and therefore does not appear in the _source section of the document.

I believe there should be an Ids function but I can't find it in the NEST documentation. The following results in: Cannot convert lambda expression to type 'Id' because it is not a delegate type

var queryResponse = 
  client.Search<Dictionary<string, object>>(
    s => s.Query(
      q => q.Ids( 
        i => i.Values(v => "_id_assigned_by_elastic")
      )
    )
  ).Hits.FirstOrDefault();

Dictionary<string, object> doc = queryResponse.Source;

The REST API docs show this example:

{
  "query": {
    "ids" : {
      "values" : ["1", "4", "100"]
    }
  }
}

There is no corresponding example for C# and the NEST client.

asked Sep 16 '20 21:09 by schmidlop


1 Answer

When no id is specified when indexing a document into Elasticsearch, Elasticsearch will autogenerate an id for the document. This id will be returned in the index response and is part of the document's metadata. In contrast, the JSON document sent to Elasticsearch will be persisted as the document's _source.

Assuming that JSON documents are modelled with the following POCO

public class MyDocument
{
    public string Property1 { get; set; }
}

To get the id of the document when indexing into Elasticsearch using NEST:

var client = new ElasticClient();

var document = new MyDocument
{
    Property1 = "foo"
};

var indexResponse = client.Index(document, i => i.Index("my_documents"));

var id = indexResponse.Id;

With the id, a document can be retrieved with the Get API

var getResponse = client.Get<MyDocument>(id, g => g.Index("my_documents"));
    
var fetchedDocument = getResponse.Source;

getResponse contains the document metadata such as index, sequence number, routing, etc. in addition to the source.
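As a rough sketch of reading that metadata, the snippet below checks whether the document exists before touching the source. It assumes NEST 7.x, a cluster reachable on the client's default address, and a placeholder id string; substitute the id returned from the index response.

```csharp
using System;
using Nest;

var client = new ElasticClient();

// Placeholder: in practice this is the value of indexResponse.Id.
var id = "_id_assigned_by_elastic";

var getResponse = client.Get<MyDocument>(id, g => g.Index("my_documents"));

if (getResponse.Found)
{
    // Metadata sits on the response itself, next to Source.
    Console.WriteLine($"{getResponse.Index}/{getResponse.Id} (version {getResponse.Version})");
    Console.WriteLine(getResponse.Source.Property1);
}
else
{
    // Source is null when no document has this id.
    Console.WriteLine("Document not found");
}

public class MyDocument
{
    public string Property1 { get; set; }
}
```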

There's also the Source API that can be used to retrieve just the document _source

var sourceResponse = client.Source<MyDocument>(id, g => g.Index("my_documents"));
    
var fetchedDocument = sourceResponse.Body;

If you want to retrieve a number of documents by id, you can use the MultiGet API

var ids = new long[] { 1, 2, 3 };

var multiGetResponse = client.MultiGet(m => m
    .Index("my_documents")
    .GetMany<MyDocument>(ids, (g, id) => g.Index(null))
);


var fetchedDocuments = multiGetResponse.GetMany<MyDocument>(ids).Select(h => h.Source);

The Multi Get API can target documents across different indices, which may map to different POCOs in your application.
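A sketch of what targeting two indices in one request might look like, assuming NEST 7.x. `MyOtherDocument`, the index name `my_other_documents` and the ids are hypothetical and only there for illustration.

```csharp
using System;
using Nest;

var client = new ElasticClient();

// One request, two indices, each mapped to its own POCO.
var multiGetResponse = client.MultiGet(m => m
    .Get<MyDocument>(g => g.Index("my_documents").Id("1"))
    .Get<MyOtherDocument>(g => g.Index("my_other_documents").Id("2"))
);

// Pull each hit back out by type and id.
var first = multiGetResponse.Get<MyDocument>("1");
var second = multiGetResponse.Get<MyOtherDocument>("2");

Console.WriteLine(first?.Source?.Property1);
Console.WriteLine(second?.Source?.Title);

public class MyDocument
{
    public string Property1 { get; set; }
}

// Hypothetical second document type, not part of the original answer.
public class MyOtherDocument
{
    public string Title { get; set; }
}
```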

Finally, if you want to filter by a subset of document ids whilst searching, you can use the Ids query

var ids = new long[] { 1, 2, 3 };

var searchResponse = client.Search<MyDocument>(s => s
    .Index("my_documents")
    .Query(q => q
        .Ids(i => i
            .Values(ids)
        )
    )
);
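For the auto-generated string id in the question, the same query could look roughly like the sketch below. It assumes NEST 7.x; the index name and the id string are placeholders. `Values` takes the id values themselves rather than a lambda, which is what the compiler error in the question is pointing at.

```csharp
using System.Collections.Generic;
using System.Linq;
using Nest;

var client = new ElasticClient();

// Pass the id directly; Values accepts one or more ids.
var searchResponse = client.Search<Dictionary<string, object>>(s => s
    .Index("my_documents")
    .Query(q => q
        .Ids(i => i
            .Values("_id_assigned_by_elastic")
        )
    )
);

// FirstOrDefault returns null when nothing matched the id.
var hit = searchResponse.Hits.FirstOrDefault();
Dictionary<string, object> doc = hit?.Source;
```

If all you need is the one document, the Get API is usually the simpler choice; the Ids query is mostly useful when it is combined with other query clauses.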

Note that Get, Source and MultiGet APIs can retrieve documents immediately after they're indexed. In contrast, an indexed document will show up in search results only after the index has been refreshed.
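If a search has to see a document right after it is indexed, one option is to request a refresh as part of the index call, or to refresh the index explicitly. The following is a minimal sketch assuming NEST 7.x; refreshing on every write has a cost, so treat it as something for tests or low-volume writes rather than a default.

```csharp
using Nest;

var client = new ElasticClient();

var document = new MyDocument { Property1 = "foo" };

// Option 1: have the index call wait until a refresh has made
// the document visible to search.
var indexResponse = client.Index(document, i => i
    .Index("my_documents")
    .Refresh(Elasticsearch.Net.Refresh.WaitFor)
);

// Option 2: refresh the index explicitly before searching.
var refreshResponse = client.Indices.Refresh("my_documents");

public class MyDocument
{
    public string Property1 { get; set; }
}
```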

answered Nov 02 '22 21:11 by Russ Cam