
Elasticsearch - filter and sort by name with Priority exceptions

I'm trying to filter and sort by name with priority exceptions: even when the results are sorted alphabetically, I want a specific name to appear first.

For example, this is my base query:

{
  "from": 0,
  "size": 500,
  "min_score": 0.15,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "exists": {
                "field": "brand.id"
              }
            }
          ]
        }
      }
    }
  },
  "sort": [
    {
      "brand.names.1.raw": "asc"
    }
  ]
}

In short, I want the array ["pepsi", "rc-cola", "coca-cola"] to be sorted with top priority given to "rc-cola", so that the result is ["rc-cola", "coca-cola", "pepsi"].

Right now it sorts alphabetically. I thought of a few ideas that could work:

  1. Add a "should" clause that boosts a "match". But then I had a problem with sorting by "_score": it breaks my alphabetical sorting, even though I sort first by "_score" and then by the brand name. For example, I added this to the "bool": "should": [{"match": {"brand.id": {"query": 34709, "boost": 20}}}]

  2. I tried aggregations, so that the first query (bucket) would "match" the specific brand name and sort alphabetically inside it, and the second query would only sort alphabetically. But I totally messed it up.

I have to use filtered -> filter; I can't use script queries. Thanks.

UPDATE: Here is an example of the documents and how they are sorted right now. I want the "ccc" brand to be prioritized first. Please help me update my query.

{
  "_index": "retailer1",
  "_type": "product",
  "_id": "1",
  "_score": null,
  "_source": {
    "id": 1,
    "brand": {
      "names": {
        "1": "aaa"
      },
      "id": 405
    }
  },
  "sort": [
    "aaa"
  ]
},
{
  "_index": "retailer1",
  "_type": "product",
  "_id": "2",
  "_score": null,
  "_source": {
    "id": 2,
    "brand": {
      "names": {
        "1": "bbb"
      },
      "id": 406
    }
  },
  "sort": [
    "bbb"
  ]
},
{
  "_index": "retailer1",
  "_type": "product",
  "_id": "3",
  "_score": null,
  "_source": {
    "id": 3,
    "brand": {
      "names": {
        "1": "ccc"
      },
      "id": 407
    }
  },
  "sort": [
    "ccc"
  ]
},
Beni Gazala asked Mar 07 '17 15:03


3 Answers

If you are using Elasticsearch 1.x, the following query should give you the expected result (you might have to adapt it a bit to use the raw fields if needed):

{
  "from": 0,
  "size": 500,
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            {
              "term": {
                "brand.names.1": {
                  "value": "ccc",
                  "boost": 10
                }
              }
            },
            {
              "exists": {
                "field": "brand.id"
              }
            }
          ]
        }
      },
      "filter": {
        "exists": {
          "field": "brand.id"
        }
      }
    }
  },
  "sort": [
    "_score",
    {
      "brand.names.1": {
        "order": "asc"
      }
    }
  ]
}

In later versions of Elasticsearch, the filtered query is replaced by the bool query. The following query should do the job (with the same adaptation as the previous one if you need the raw fields):

{
  "from": 0,
  "size": 500,
  "query": {
    "bool": {
      "filter": {
        "exists": {
          "field": "brand.id"
        }
      },
      "should": [
        {
          "term": {
            "brand.names.1": "ccc"
          }
        }
      ]
    }
  },
  "sort": [
    "_score",
    {
      "brand.names.1": {
        "order": "asc"
      }
    }
  ]
}
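
To see why sorting on "_score" first and on the name second gives the requested order, here is a minimal Python sketch that emulates the sort locally on the three sample documents. It assumes every document matching the "should" clause gets the same positive score and every other document scores 0. Real Elasticsearch scores depend on the similarity model, so the numbers are placeholders.

```python
# Local emulation of: "sort": ["_score", {"brand.names.1": {"order": "asc"}}]
# Assumption: matching the should clause yields one fixed positive score,
# while documents that only pass the filter score 0.
docs = [
    {"id": 1, "name": "aaa"},
    {"id": 2, "name": "bbb"},
    {"id": 3, "name": "ccc"},
]
PREFERRED = "ccc"  # the value used in the term clause


def emulated_score(doc):
    """Placeholder score: 1.0 for the preferred brand, 0.0 otherwise."""
    return 1.0 if doc["name"] == PREFERRED else 0.0


# _score sorts descending by default, the name field ascending.
ranked = sorted(docs, key=lambda d: (-emulated_score(d), d["name"]))
ranked_names = [d["name"] for d in ranked]
print(ranked_names)  # ['ccc', 'aaa', 'bbb']
```

The alphabetical order survives only among documents whose scores are exactly equal, which is why the name sort acts as a tie-breaker here.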

In both cases, you can use the boost parameter if you want more than one preferred match at the top, in a given order.
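
As a rough illustration of several preferred values with different boosts, the Python sketch below builds such a request body for the bool-query form. The helper name `build_priority_query` is hypothetical and not part of any Elasticsearch client. It also wraps each term in a constant_score clause, which is a substitution for the bare term clauses shown above, made on the assumption that documents matching the same value should then share one identical score.

```python
import json


def build_priority_query(field, preferred, filter_field="brand.id"):
    """Build a bool query that ranks `preferred` values first, in list order.

    Hypothetical helper for illustration only. Each preferred value gets a
    constant_score clause; earlier values receive larger boosts.
    """
    should = [
        {
            "constant_score": {
                "filter": {"term": {field: value}},
                # First value gets the largest boost, the last one gets 1.
                "boost": len(preferred) - position,
            }
        }
        for position, value in enumerate(preferred)
    ]
    return {
        "query": {
            "bool": {
                "filter": {"exists": {"field": filter_field}},
                "should": should,
            }
        },
        # Score first (descending by default), then the name as tie-breaker.
        "sort": ["_score", {field: {"order": "asc"}}],
    }


body = build_priority_query("brand.names.1", ["ccc", "bbb"])
print(json.dumps(body, indent=2))
```

With the list ["ccc", "bbb"], the "ccc" clause gets boost 2 and the "bbb" clause gets boost 1, so "ccc" should come first, then "bbb", then everything else alphabetically.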

Olivier answered Nov 04 '22 16:11


I have tested this locally; here it is. To simplify the query, match on brand IDs rather than names, since a brand can have many names. If you still want to sort on names, you can modify the script accordingly.

POST stack/_search
{
  "query": {
    "function_score": {
      "boost_mode": "replace",
      "query": {
        "bool": {
          "must": [
            {
              "exists": {
                "field": "brand.id"
              }
            }
          ]
        }
      },
      "script_score": {
        "script": {
          "params": {
            "ids": [
              406,
              405
            ]
          },
          "inline": "return params.ids.indexOf(doc['brand.id'].value) > -1 ? 1000 - params.ids.indexOf(doc['brand.id'].value) : _score;"
        }
      }
    }
  }
}
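
To make the scoring rule in the script easier to follow, here is a small Python sketch that mirrors it outside Elasticsearch. It assumes the exists query gives every hit the same score (1.0 is a placeholder) and reuses the brand ids from the question's sample documents.

```python
IDS = [406, 405]  # same priority list as params.ids in the query


def script_score(brand_id, query_score):
    """Mirror of the inline script: listed ids get 1000 minus their index."""
    if brand_id in IDS:
        return 1000 - IDS.index(brand_id)
    return query_score


docs = [
    {"name": "aaa", "brand_id": 405},
    {"name": "bbb", "brand_id": 406},
    {"name": "ccc", "brand_id": 407},
]
# Assumption: every hit of the exists query has the same score, 1.0 here.
scores = {d["name"]: script_score(d["brand_id"], 1.0) for d in docs}
print(scores)  # {'aaa': 999, 'bbb': 1000, 'ccc': 1.0}

# Results come back by descending score when no sort clause is given.
ranked_names = [d["name"] for d in sorted(docs, key=lambda d: -scores[d["name"]])]
```

Brands outside the list all tie on the query score, so an explicit secondary sort on the name field would still be needed to keep them in alphabetical order.
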
Vova Bilyachat answered Nov 04 '22 16:11


If the priority of a brand is known at index time, you can index it directly in your document, like this:

"brand": {
      "name": "ccc",
      "priority":1000,
      "id": 407
    }

The brands to be shown on top can have a high priority value, whereas the rest can be assigned a lower value.

Indexed this way, you can sort directly, using brand.priority as the primary sort and brand.name as the secondary sort:

"sort" : [
        { "brand.priority" : {"order" : "desc"}},
        { "brand.name" : {"order" : "asc" }}
    ]
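
A minimal Python sketch of the same idea, run locally rather than against Elasticsearch: it adds a priority to each brand before indexing and then emulates the two-level sort. The table `BRAND_PRIORITY`, the default of 0, and the helper `with_priority` are assumptions made for illustration.

```python
# Hypothetical priority table: brand name -> priority stored at index time.
BRAND_PRIORITY = {"ccc": 1000}
DEFAULT_PRIORITY = 0  # assumed value for every non-prioritized brand


def with_priority(brand):
    """Return a copy of the brand object with a `priority` field added."""
    enriched = dict(brand)
    enriched["priority"] = BRAND_PRIORITY.get(brand["name"], DEFAULT_PRIORITY)
    return enriched


brands = [
    {"name": "aaa", "id": 405},
    {"name": "bbb", "id": 406},
    {"name": "ccc", "id": 407},
]
indexed = [with_priority(b) for b in brands]

# Emulates the sort clause: priority descending, then name ascending.
ranked = sorted(indexed, key=lambda b: (-b["priority"], b["name"]))
ranked_names = [b["name"] for b in ranked]
print(ranked_names)  # ['ccc', 'aaa', 'bbb']
```

The trade-off of this design is that changing which brand is prioritized means reindexing or updating the affected documents.
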
Rahul answered Nov 04 '22 17:11