Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can i get unique suggestions without duplicates when i use completion suggester?

I am using elastic 5.1.1 in my environment. I have chosen completion suggester on a field name post_hashtags with an array of strings to have suggestion on it. I am getting response as below for prefix "inv"

Req:

POST hashtag/_search?pretty&&filter_path=suggest.hash-suggest.options.text,suggest.hash-suggest.options._source
{"_source":["post_hashtags" ],

"suggest": {
    "hash-suggest" : {
        "prefix" : "inv",
        "completion" : {
            "field" : "post_hashtags"
        }
    }
}

Response :

{
  "suggest": {
    "hash-suggest": [
      {
        "options": [
          {
            "text": "invalid",
            "_source": {
              "post_hashtags": [
                "invalid"
              ]
            }
          },
          {
            "text": "invalid",
            "_source": {
              "post_hashtags": [
                "invalid",
                "coment_me",
                "daya"
              ]
            }
          }
        ]
      }
    ]
  }

Here "invalid" is returned twice because it is also a input string for same field "post_hashtags" in other document.

Problems is if same "invalid" input string present in 1000 documents in same index then i would get 1000 duplicated suggestions which is huge and not needed.

Can I apply an aggregation on a field of type completion ?

Is there any way I can get unique suggestion instead of duplicated text field, even though if i have same input string given to a particular field in multiple documents of same index ?

like image 446
Adinarayana Immidisetti Avatar asked Feb 22 '17 12:02

Adinarayana Immidisetti


2 Answers

ElasticSearch 6.1 has introduced the skip_duplicates operator. Example usage:

{
  "suggest": {
    "autocomplete": {
      "prefix": "MySearchTerm",
      "completion": {
        "field": "name",
        "skip_duplicates": true
      }
    }
  }
}
like image 172
Mathew Avatar answered Oct 19 '22 20:10

Mathew


Edit: This answer only applies to Elasticsearch 5

No, you cannot de-duplicate suggestion results. The autocomplete suggester is document-oriented in Elasticsearch 5 and will thus return suggestions for all documents that match.

In Elasticsearch 1 and 2, the autocomplete suggester automatically de-duplicated suggestions. There is an open Github ticket to bring back this functionality, and it looks like it is possible to do so in a future version.

For now, you have two options:

  1. Use Elasticsearch version 1 or 2.
  2. Use a different suggestion implementation not based on the autocomplete suggester. The only semi-official suggestion I have seen so far involve putting your suggestion strings in a separate index.
like image 1
dlebech Avatar answered Oct 19 '22 18:10

dlebech