Elasticsearch: can I define synonyms with boost?

Say A, B, and C are synonyms. I want to define B as "closer" to A than C is, so that when I search for the keyword A, results containing A come first, results containing B come second, and results containing C come last.

Any help?

asked Jun 27 '13 03:06 by chrisyue


1 Answer

There is no search-time mechanism (as of yet) to distinguish a match on a synonym from a match on the term that was actually in the source field. This is because, when a field is indexed, its synonyms are placed into the inverted index alongside the original term, leaving all words equal.
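To make that concrete, here is a toy model in Python of what a synonym filter does to an inverted index. It is only a sketch: the `SYNONYMS` table, `analyze`, and `build_index` are hypothetical names made up for illustration, and the whitespace split is a simplification of the real tokenizer.

```python
from collections import defaultdict

# Hypothetical synonym table, for illustration only.
SYNONYMS = {"elated": ["gleeful", "joyful"]}

def analyze(text, expand_synonyms):
    """Lowercase and split on whitespace, optionally emitting synonyms as extra tokens."""
    tokens = []
    for word in text.lower().split():
        tokens.append(word)
        if expand_synonyms:
            tokens.extend(SYNONYMS.get(word, []))
    return tokens

def build_index(docs, expand_synonyms):
    """Map each term to the set of document ids whose token stream contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text, expand_synonyms):
            index[token].add(doc_id)
    return index

docs = {1: "elated"}
index = build_index(docs, expand_synonyms=True)

# Every term points at document 1; nothing records which one was in the source text.
print(sorted(index.items()))
```

Once the postings are written this way, a lookup for "gleeful" and a lookup for "elated" return the same document with no marker saying which term was original.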

This is not to say, however, that you cannot do some magic at index time to glean the information you want.

Create an index with two analyzers: one with a synonym filter, and one without.

PUT /synonym_test/
{
  "settings": {
    "analysis": {
      "analyzer": {
        "no_synonyms": {
          "tokenizer": "lowercase"
        },
        "synonyms": {
          "tokenizer": "lowercase",
          "filter": ["synonym"]
        }
      },
      "filter": {
        "synonym": {
          "type": "synonym",
          "format": "wordnet",
          "synonyms_path": "prolog/wn_s.pl"
        }
      }
    }
  }
}

Use a multi-field mapping so that the field of interest is indexed twice:

PUT /synonym_test/mytype/_mapping
{
  "properties": {
    "mood": {
      "type": "multi_field",
      "fields": {
        "syn": { "type": "string", "analyzer": "synonyms" },
        "no_syn": { "type": "string", "analyzer": "no_synonyms" }
      }
    }
  }
}

Index a test document:

POST /synonym_test/mytype/1
{
  "mood": "elated"
}
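Why indexing the value twice helps can be seen in a small Python sketch. It is a toy model, not Elasticsearch itself: `SYNONYMS` is a hypothetical one-entry table standing in for the WordNet file, and `tokens` and `matching_fields` are made-up helper names.

```python
# Hypothetical synonym table standing in for the WordNet file.
SYNONYMS = {"elated": {"gleeful"}}

def tokens(value, with_synonyms):
    """Return the set of terms indexed for a single-word field value."""
    terms = {value.lower()}
    if with_synonyms:
        terms |= SYNONYMS.get(value.lower(), set())
    return terms

def matching_fields(stored_value, query_term):
    """List the sub-fields of `mood` whose indexed terms contain the query term."""
    fields = {
        "mood.syn": tokens(stored_value, with_synonyms=True),
        "mood.no_syn": tokens(stored_value, with_synonyms=False),
    }
    return [name for name, terms in fields.items() if query_term.lower() in terms]

print(matching_fields("elated", "elated"))   # ['mood.syn', 'mood.no_syn']
print(matching_fields("elated", "gleeful"))  # ['mood.syn']
```

The original term is found in both sub-fields while a synonym is found in only one, and that difference is what a query over both sub-fields can turn into a score difference.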

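At search time, query both sub-fields so that hits on the field with no synonyms add to the score. A document containing the search term itself matches both clauses; a document matched only through a synonym matches mood.syn alone. Note that the query as written puts its boost of 3 on mood.syn rather than on mood.no_syn.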

GET /synonym_test/mytype/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "mood.syn": { "query": "gleeful", "boost": 3 } } },
        { "match": { "mood.no_syn": "gleeful" } }
      ]
    }
  }
}

Results in: "_score": 0.2696457

Searching for the original term returns a better score:

GET /synonym_test/mytype/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "mood.syn": { "query": "elated", "boost": 3 } } },
        { "match": { "mood.no_syn": "elated" } }
      ]
    }
  }
}

Results in: "_score": 0.6558018
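The two queries differ only in the search term, so the request body can be generated. The Python sketch below builds the same bool/should structure but places the boost on mood.no_syn, as a variant to try; `synonym_aware_query` and its `syn_boost` and `exact_boost` parameters are hypothetical names, and the scores quoted above were not produced by this variant.

```python
import json

def synonym_aware_query(term, syn_boost=1.0, exact_boost=3.0):
    """Build a bool/should query over the two sub-fields of `mood`.

    exact_boost weights hits on the field analyzed without synonyms;
    syn_boost weights hits on the synonym-expanded field.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"mood.syn": {"query": term, "boost": syn_boost}}},
                    {"match": {"mood.no_syn": {"query": term, "boost": exact_boost}}},
                ]
            }
        }
    }

# Print the JSON body for one search term.
print(json.dumps(synonym_aware_query("gleeful"), indent=2))
```

The printed JSON can be sent as the body of GET /synonym_test/mytype/_search with whatever HTTP client is at hand.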

answered Nov 15 '22 05:11 by PhaedrusTheGreek