More_like_this query with a filter

I have 1702 documents indexed in Elasticsearch. Each document has a Category field and a SequentialId field.

I first fetched the documents in category 1.1 whose SequentialId is between 1 and 850, as shown below.

POST testucb/docs/_search
{
  "size": 1702,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "Category": "1.1"
          }
        }
      ],
      "filter": [
        {
          "range": {
            "SequentialId": {
              "gte": 1,
              "lte": 850
            }
          }
        }
      ]
    }
  }
}

The above query returned 834 documents matching category 1.1. (I have a binary that parses the 834 _ids out of the resulting JSON output.) My goal now is to pass these 834 _ids to a more_like_this query as a training set, with the remaining documents (SequentialId 851 to 1702) as my test set.
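For illustration, the _id extraction step could look roughly like the Python sketch below. It assumes the standard search response layout (hits.hits[]._id) and the document-reference form of "like" entries (_index, _type, _id). The function name build_like_items and the sample response are hypothetical, not part of my actual setup.

```python
def build_like_items(search_response, index="testucb", doc_type="docs"):
    """Turn the hits of a search response into `like` entries for more_like_this.

    The index and type defaults mirror the names used in this question; they
    are only used when a hit does not carry its own _index/_type.
    """
    hits = search_response.get("hits", {}).get("hits", [])
    return [
        {
            "_index": hit.get("_index", index),
            "_type": hit.get("_type", doc_type),
            "_id": hit["_id"],
        }
        for hit in hits
    ]


# Hypothetical, trimmed-down response with two hits instead of 834.
sample_response = {
    "hits": {
        "total": 2,
        "hits": [
            {"_index": "testucb", "_type": "docs", "_id": "17", "_score": 1.0},
            {"_index": "testucb", "_type": "docs", "_id": "42", "_score": 0.9},
        ],
    }
}

like_items = build_like_items(sample_response)
print(like_items)
```

The resulting list is what would go into the "like" array of the more_like_this query.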

I tried the more_like_this query below, with a filter placed inside it.

POST /testucb/docs/_search
{
  "size": 1702,
  "fields": [
    "SequentialId",
    "Category",
    "PRIMARY_CONTENT_EN"
  ],
  "query": {
    "more_like_this": {
      "fields": [
        "PRIMARY_CONTENT_EN"
      ],
      "like": [
        <-----------834 _ids goes here ---->
      ],
      "filter": [
        {
          "range": {
            "SequentialId": {
              "gte": 851,
              "lte": 1702
            }
          }
        }
      ],
      "min_term_freq": 1,
      "min_doc_freq": 1,
      "max_query_terms": 15,
      "min_word_len": 3,
      "stop_words": [],
      "boost": 2,
      "include": false
    }
  }
}

I am getting a query parsing exception which says that MLT does not support filter. I am not sure how to restrict the query to the remaining documents (SequentialId 851 to 1702) as my test set.

I hope I am clear about what I want to accomplish. Can you please help me with this? I am new to Elasticsearch.

Sai asked Mar 30 '16 04:03


1 Answer

If you want to run a more_like_this query and filter beforehand, you should use a bool query with a filter clause (Elasticsearch 2.0 and later):

POST /testucb/docs/_search
{
  "size": 1702,
  "fields": [
    "SequentialId",
    "Category",
    "PRIMARY_CONTENT_EN"
  ],
  "query": {
    "bool": {
      "must": [
        {
          "more_like_this": {
            "fields": [
              "PRIMARY_CONTENT_EN"
            ],
            "like": [
              <-----------834 _ids goes here ---->
            ],
            "min_term_freq": 1,
            "min_doc_freq": 1,
            "max_query_terms": 15,
            "min_word_len": 3,
            "stop_words": [],
            "boost": 2,
            "include": false
          }
        }
      ],
      "filter": {
        "range": {
          "SequentialId": {
            "gte": 851,
            "lte": 1702
          }
        }
      }
    }
  }
}
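If you assemble the request body in code rather than by hand, the same structure can be sketched in Python as below. This is only an illustration under stated assumptions: the helper name build_mlt_search and the single example "like" entry are hypothetical, and the field names and parameter values are copied from the query above. The point it shows is that the range filter sits next to the more_like_this clause inside bool, not inside more_like_this itself.

```python
import json


def build_mlt_search(like_items, seq_from, seq_to, size=1702):
    """Build a search body: more_like_this in `must`, range in `filter`."""
    mlt_clause = {
        "more_like_this": {
            "fields": ["PRIMARY_CONTENT_EN"],
            "like": like_items,
            "min_term_freq": 1,
            "min_doc_freq": 1,
            "max_query_terms": 15,
            "min_word_len": 3,
            "stop_words": [],
            "boost": 2,
            "include": False,
        }
    }
    # The filter restricts the candidate documents without affecting scoring.
    range_filter = {
        "range": {"SequentialId": {"gte": seq_from, "lte": seq_to}}
    }
    return {
        "size": size,
        "fields": ["SequentialId", "Category", "PRIMARY_CONTENT_EN"],
        "query": {"bool": {"must": [mlt_clause], "filter": range_filter}},
    }


# Hypothetical single training document; the real list would hold 834 entries.
example_like = [{"_index": "testucb", "_type": "docs", "_id": "17"}]
body = build_mlt_search(example_like, 851, 1702)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST /testucb/docs/_search.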

If you use an older version of Elasticsearch, you should use the filtered query instead.
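As a rough sketch of that older form, the filtered query wraps a query clause and a filter clause side by side. The helper name wrap_in_filtered and the example "like" entry are hypothetical, and the more_like_this parameters accepted by pre-2.0 versions may differ from the ones shown here, so check the documentation for your version.

```python
import json


def wrap_in_filtered(query_clause, filter_clause):
    """Wrap a query and a filter in the pre-2.0 `filtered` query form."""
    return {"filtered": {"query": query_clause, "filter": filter_clause}}


# Assumption: a minimal more_like_this clause; adjust the parameter names
# to whatever your Elasticsearch version actually supports.
mlt_clause = {
    "more_like_this": {
        "fields": ["PRIMARY_CONTENT_EN"],
        "like": [{"_index": "testucb", "_type": "docs", "_id": "17"}],
    }
}
range_filter = {"range": {"SequentialId": {"gte": 851, "lte": 1702}}}

legacy_body = {"size": 1702, "query": wrap_in_filtered(mlt_clause, range_filter)}
print(json.dumps(legacy_body, indent=2))
```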

Michael Stockerl answered Oct 02 '22 14:10
