
NEST: how are you supposed to deal with highlights in C#?

I'm trying to search "everything" in an index for a search term, and display the context with the terms highlighted. I get an appropriate set of documents returned, but cannot figure out how I'm supposed to handle the highlighting in code.

At this point I'm just trying to dump it into a Literal, and the code below "kinda sorta" works, but it doesn't seem to have highlights for every document, and it just doesn't feel right. I have found many examples of how to run a query with highlights, but I haven't found any example of how to display the results. Any suggestions? Thanks!

    var searchResults = client.Search<Document>(s => s.Query(qs => qs.QueryString(q => q.Query(stringsearch))).Highlight(h => h
            .PreTags("<b>")
            .PostTags("</b>")
            .OnFields(
              f => f
                .OnField("*")
                .PreTags("<em>")
                .PostTags("</em>")
            )
        ));

    Literal1.Text = "";

    foreach(var h in searchResults.Hits)
    {
        foreach(var hh in h.Highlights)
        {
            foreach(var hhh in hh.Value.Highlights)
            {
                Literal1.Text += hhh+@"<br>";
            }
        }
    }
Bill French asked Mar 15 '15 02:03


1 Answer

Edit: The solution below has only been tested on Elasticsearch 2.x, not Elasticsearch 5.x/6.x.

The highlights can be accessed either through searchResults.Highlights (all highlights for the response) or through IHit<T>.Highlights (the highlights for a single hit).

Is this along the lines of what you're trying to achieve?

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using Elasticsearch.Net.ConnectionPool;
using Nest;

namespace ESTester
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            const string indexName = "testindex";
            var connectionSettings = new ConnectionSettings(new SingleNodeConnectionPool(new Uri("http://127.0.0.1:9200")));
            var client = new ElasticClient(connectionSettings);

            var existResponse = client.IndexExists(descriptor => descriptor.Index(indexName));
            if (existResponse.Exists)
                client.DeleteIndex(descriptor => descriptor.Index(indexName));

            // Making sure the refresh interval is low, since it's boring to have to wait for things to catch up
            client.PutTemplate("", descriptor => descriptor.Name("testindex").Template("testindex").Settings(objects => objects.Add("index.refresh_interval", "1s")));

            client.CreateIndex(descriptor => descriptor.Index(indexName));

            var docs = new List<Document>
            {
                new Document{Text = "This is the first document" },
                new Document{Text = "This is the second document" },
                new Document{Text = "This is the third document" }
            };

            var bulkDescriptor = new BulkDescriptor().IndexMany(docs, (descriptor, document) => descriptor.Index(indexName));
            client.Bulk(bulkDescriptor);

            // Making sure ES has indexed the documents
            Thread.Sleep(TimeSpan.FromSeconds(2));

            var searchDescriptor = new SearchDescriptor<Document>()
                .Index(indexName)
                .Query(q => q
                    .Match(m => m
                        .OnField(d => d.Text)
                        .Query("the second")))
                .Highlight(h => h
                    .OnFields(f => f
                        .OnField(d => d.Text)
                        .PreTags("<em>")
                        .PostTags("</em>")));

            var result = client.Search<Document>(searchDescriptor);

            if (result.Hits.Any())
            {
                foreach (var hit in result.Hits)
                {
                    Console.WriteLine("Found match: {0}", hit.Source.Text);
                    if (!hit.Highlights.Any()) continue;

                    foreach (var highlight in hit.Highlights.SelectMany(highlight => highlight.Value.Highlights))
                    {
                        Console.WriteLine("Found highlight: {0}", highlight);
                    }
                }
            }

            Console.WriteLine("Press any key to exit!");
            Console.ReadLine();
        }


    }

    internal class Document
    {
        public string Text { get; set; }
    }
}

Edit for comments: In this example there's no real need for the if (!hit.Highlights.Any()) continue; check, other than being safe. But if you ran the following query instead, you could end up with hits that have no highlights:

    var docs = new List<Document>
    {
        new Document{Text = "This is the first document", Number = 1 },
        new Document{Text = "This is the second document", Number =500 },
        new Document{Text = "This is the third document", Number = 1000 }
    };

    var searchDescriptor = new SearchDescriptor<Document>()
        .Index(indexName)
        .Query(q => q
            .Bool(b => b
                .Should(s1 => s1
                    .Match(m => m
                        .Query("second")
                        .OnField(f => f.Text)),
                    s2 => s2
                        .Range(r => r
                            .OnField(f => f.Number)
                            .Greater(750)))
                 .MinimumShouldMatch(1)))
        .Highlight(h => h
            .OnFields(f => f
                .OnField(d => d.Text)
                .PreTags("<em>")
                .PostTags("</em>")));

    internal class Document
    {
        public string Text { get; set; }
        public int Number { get; set; }
    }

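In this case you could get a hit from the range clause alone (the third document, with Number = 1000, doesn't contain "second"), and that hit wouldn't have any highlights, because highlighting was only requested on the Text field.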
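One way to cope with hits that come back without highlights is to fall back to a plain snippet of the source text. The sketch below is only an illustration: HitDisplay, SnippetsFor and maxLength are hypothetical names, not part of NEST, and the helper works on plain strings so it doesn't depend on any particular NEST version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HitDisplay
{
    // Hypothetical helper (not part of NEST): returns the highlight fragments
    // when there are any, otherwise a single truncated snippet of the source text.
    public static IReadOnlyList<string> SnippetsFor(
        string sourceText, IEnumerable<string> fragments, int maxLength = 80)
    {
        var list = (fragments ?? Enumerable.Empty<string>()).ToList();
        if (list.Count > 0)
            return list;

        // No highlights for this hit: fall back to the start of the source text.
        var text = sourceText ?? string.Empty;
        var snippet = text.Length <= maxLength
            ? text
            : text.Substring(0, maxLength) + "...";
        return new[] { snippet };
    }
}
```

With the loop from the example above, it could be called as HitDisplay.SnippetsFor(hit.Source.Text, hit.Highlights.SelectMany(h => h.Value.Highlights)), so every hit produces at least one line of output.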

As for number 2, I just explored the object I got back from the search, using Quick Watch, the Object Browser and IntelliSense in Visual Studio.
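Since the question writes the fragments straight into a Literal, it may be worth encoding them first, because a fragment is raw document text with the highlight tags spliced in. A minimal sketch, assuming the pre/post tags are the <em> and </em> used above; HighlightRenderer and its methods are hypothetical names, and the only library call is WebUtility.HtmlEncode from System.Net:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

public static class HighlightRenderer
{
    // Encodes the whole fragment, then restores only the known highlight tags,
    // so any other markup in the indexed text is shown as text, not rendered.
    public static string ToSafeHtml(
        string fragment, string preTag = "<em>", string postTag = "</em>")
    {
        var encoded = WebUtility.HtmlEncode(fragment);
        return encoded
            .Replace(WebUtility.HtmlEncode(preTag), preTag)
            .Replace(WebUtility.HtmlEncode(postTag), postTag);
    }

    // Joins the encoded fragments with <br>, like the loop in the question.
    public static string RenderAll(IEnumerable<string> fragments)
    {
        return string.Join("<br>", fragments.Select(f => ToSafeHtml(f)));
    }
}
```

The result could then be assigned once, e.g. Literal1.Text = HighlightRenderer.RenderAll(fragments), instead of appending inside nested loops. One limitation of this approach: if the indexed text itself contains a literal <em>, it gets restored as a tag too, so tags that are unlikely to appear in the data are a safer choice.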

anderso answered Sep 20 '22 19:09