
Elasticsearch highlight with nested objects

I have a question about highlighting nested object fields.

Consider a record like this:

"_source": {
    "id": 286,
    "translations": [
        {
            "id": 568,
            "language": "lang1",
            "value": "foo1 bar1"
        },
        {
            "id": 569,
            "language": "lang2",
            "value": "foo2 bar2"
        }
    ]
}

If translations.value uses an ngram filter, is it possible to highlight matches in a nested object such as this one? And what would the highlight query look like?

Thanks a lot for any response.

smolnar asked Mar 05 '13 17:03



1 Answer

Same problem over here. It seems that there is no way to do this in Elasticsearch, and there won't be in the near future.

Developer Shay Banon wrote:

In order to do highlighting based on the nested query, the nested documents needs to be extracted as well in order to highlight it, which is more problematic (and less performant).

Also:

His explanation was that this would take a good amount of memory as there can be a large number of children. And it looks genuine to me as adding this feature will violate the basic concept of processing only N number of feeds at a time.

So the only way is to process the query result manually in your own program and add the highlights there.
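A minimal sketch of what such manual post-processing could look like, assuming the client already knows the search terms and has the hit's _source at hand. The function names (escapeRegExp, highlightValue, highlightTranslations) and the `<em>` wrapper are made up for illustration. Plain substring matching only approximates what an ngram filter would match, and the values are not HTML-escaped here.

```javascript
// Hypothetical helpers, not part of any Elasticsearch client.

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every case-insensitive occurrence of any term in <em>...</em>.
function highlightValue(value, terms) {
  var usable = terms
    .filter(function (t) { return t.length > 0; })
    // Longer terms first, so "foo1" wins over "foo" at the same position.
    .sort(function (a, b) { return b.length - a.length; });
  if (usable.length === 0) return value;
  var pattern = new RegExp('(' + usable.map(escapeRegExp).join('|') + ')', 'gi');
  return value.replace(pattern, '<em>$1</em>');
}

// Apply the highlighting to each nested translation of one hit's _source.
function highlightTranslations(source, terms) {
  return source.translations.map(function (t) {
    return { id: t.id, language: t.language, highlight: highlightValue(t.value, terms) };
  });
}

// The record from the question.
var source = {
  id: 286,
  translations: [
    { id: 568, language: 'lang1', value: 'foo1 bar1' },
    { id: 569, language: 'lang2', value: 'foo2 bar2' }
  ]
};
var result = highlightTranslations(source, ['foo1']);
```

Here only the first translation contains the term, so only its value gets the `<em>` markup; the second one is returned unchanged.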

Update

I don't know about Tire or ngram filters, but I found a way to retrieve all nested documents that match a filter by using nested facets and facet filters. You need a separate query for highlighting, but it is much faster than browsing through _source, in my case at least.

{"query":
    {"match_all":{}},
    "facets":{
        "matching_translations":{
            "nested":"translations",
            "terms":{"field":"translations.value"},
            "facet_filter":{
                "bool":{"must":[{"terms":{"translations.value":["foo1"]}}]}
            }
        }
    }
}

You can use the resulting facet terms for highlighting in your program.
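As a rough sketch, the terms could be pulled out of the response like this. The helper name facetTerms and the sample response are hypothetical. The response shape (a `terms` array of `{term, count}` entries under the facet name) is assumed from the legacy terms facet format.

```javascript
// Hypothetical helper: return the list of terms of one named terms facet,
// or an empty array if the facet is missing from the response.
function facetTerms(response, facetName) {
  var facet = response && response.facets && response.facets[facetName];
  if (!facet || !Array.isArray(facet.terms)) return [];
  return facet.terms.map(function (entry) { return entry.term; });
}

// Made-up response for the "matching_translations" facet of the query above.
var sampleResponse = {
  facets: {
    matching_translations: {
      terms: [{ term: 'foo1', count: 1 }]
    }
  }
};

var matchedTerms = facetTerms(sampleResponse, 'matching_translations');
```

Returning an empty array for a missing facet keeps the calling code free of the nested existence checks.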

For example, I want to highlight links to nested documents (in jQuery):

 // 'docIDs' is the name of the terms facet in this example; its terms are
 // the ids of the matching nested documents.
 var setHighlights = function(sdata){
        var highlightDocs = [];
        var facet = sdata['facets'] && sdata['facets']['docIDs'];
        if(facet && facet['terms'] && facet['terms'].length > 0){
            for(var i = 0; i < facet['terms'].length; i++){
                highlightDocs.push(facet['terms'][i]['term']);
            }
        }
        // Mark every link whose id attribute is one of the collected terms.
        $('li.document_link').each(function(){
            if($.inArray($(this).attr('id'), highlightDocs) != -1) {
                $(this).addClass('document_selected');
            }
        });
 };

I hope that helps a little.

teano answered Nov 18 '22 22:11