
Elasticsearch highlight on array, how to return entire array

I'm trying to do some highlighting in Elasticsearch and am having difficulty getting the output I'd like for multi-value fields. Here's what I'm doing now:

{ "query" : { "match": { "nameSet": "test" } }, 
  "highlight" : { "fields" : { "*": {"number_of_fragments": 0 } } }
}

This gives me (omitted unnecessary fields):

"hits" : [ {
    ...
    "_source" : {
        "nameSet" : ["TEST", "NAME"]
    },
    "highlight" : {
       "nameSet" : [ "<em>TEST</em>" ]
    }
 }, 
 ...

What I would like to have is the full array, and not just the item that matched. In this example, I would like "TEST" to be emphasized and "NAME" to be present but not emphasized.

"hits" : [ {
    ...
    "_source" : {
        "nameSet" : ["TEST", "NAME"]
    },
    "highlight" : {
       "nameSet" : [ "<em>TEST</em>", "NAME" ] 
    }
 }, 
 ...

Any way to do this purely in ES?

Thanks.

Jonathan Rile asked Mar 02 '26 18:03



1 Answer

There are currently two open threads about this, the first from over a decade ago:

Highlighting array field - Also return non-matching entries

Use path to indicate highlighted index of array items

The general approach for now seems to be to strip the highlight tags from each returned highlight, compare the result against the source array elements, and replace each matching element with its highlighted version. This is not ideal, but it should work as long as each highlight is a complete array element, which is what "number_of_fragments": 0 in the query above asks for.

// We can use this to generate a function that strips whichever highlight
// tags we chose to use. The returned function takes the highlighted string.
function stripHighlights(open, close) {
    return function(highlight) {
        return highlight.replaceAll(open, "").replaceAll(close, "");
    }
}

// In this case, we use "em" because they are the ones from the example
let stripEmTags = stripHighlights("<em>", "</em>");

// This does the actual work of replacing the source array element
// with the highlighted one, if it was highlighted
function highlightedArrayField(sourceField, highlightField) {
    let strippedHighlights = highlightField.map(stripEmTags);

    return sourceField.map(elem => {
        let matchedIdx = strippedHighlights.indexOf(elem);
        if(matchedIdx === -1) {
            return elem;
        }

        return highlightField[matchedIdx];
    });
}

// This is the general function we can call to go through the hits and
// determine their replaced source values.
// We could use an Object.assign to update the properties of the hit if needed.
function highlightedHit(hit) {
    return Object.keys(hit.highlight).reduce((agg, field) => {
        let highlightField = hit.highlight[field];
        let sourceField = hit._source[field];

        if(Array.isArray(sourceField)) {
            agg[field] = highlightedArrayField(sourceField, highlightField);
        } else {
            agg[field] = highlightField;
        }
        return agg;
    }, {});
}
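
As a rough, self-contained sketch of the Object.assign idea mentioned in the comment above, the following merges the highlighted values back into a copy of the hit. The names stripTags and mergeHighlightsIntoSource are hypothetical helpers made up for this illustration, and the default "<em>" / "</em>" tags are an assumption taken from the example in the question. It also assumes that each highlight entry is a complete array element.

```javascript
// Hypothetical helper: removes every occurrence of the given tags.
function stripTags(text, open, close) {
    return text.split(open).join("").split(close).join("");
}

// Hypothetical helper: returns a copy of the hit whose _source fields
// carry the highlighted values. The original hit is not modified.
function mergeHighlightsIntoSource(hit, open = "<em>", close = "</em>") {
    const merged = Object.assign({}, hit._source);

    for (const [field, highlights] of Object.entries(hit.highlight || {})) {
        const sourceValue = hit._source[field];

        if (!Array.isArray(sourceValue)) {
            // Non-array fields simply take the highlight as returned.
            merged[field] = highlights;
            continue;
        }

        // Look up each highlighted string by its tag-free text.
        const byPlainText = new Map(
            highlights.map(h => [stripTags(h, open, close), h])
        );

        merged[field] = sourceValue.map(elem =>
            byPlainText.has(elem) ? byPlainText.get(elem) : elem
        );
    }

    return Object.assign({}, hit, { _source: merged });
}

// Usage with the hit from the question:
const exampleHit = {
    _source: { nameSet: ["TEST", "NAME"] },
    highlight: { nameSet: ["<em>TEST</em>"] }
};
const mergedHit = mergeHighlightsIntoSource(exampleHit);
// mergedHit._source.nameSet is ["<em>TEST</em>", "NAME"]
```

Working on a copy keeps the raw _source available in case the untouched values are still needed elsewhere.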
MirroredFate answered Mar 04 '26 21:03



