I am trying to get keyword-tokenized multi-word synonyms working with the _analyze API. The API returns the expected results for single-word synonyms, but not for multi-word ones. Here are my settings and analysis chain:
curl -XPOST "http://localhost:9200/test" -d'
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "my_syn_filt": {
            "type": "synonym",
            "synonyms": [
              "foo bar, fooo bar",
              "bazzz, baz"
            ]
          }
        },
        "analyzer": {
          "my_synonyms": {
            "filter": [
              "lowercase",
              "my_syn_filt"
            ],
            "tokenizer": "keyword"
          }
        }
      }
    }
  }
}'
Now test using the _analyze API:
curl 'localhost:9200/test/_analyze?analyzer=my_synonyms&text=baz'
The call returns what I expect (the same result is returned for 'bazzz' as well):
{
  "tokens": [
    {
      "position": 1,
      "type": "SYNONYM",
      "end_offset": 3,
      "start_offset": 0,
      "token": "bazzz"
    },
    {
      "position": 1,
      "type": "SYNONYM",
      "end_offset": 3,
      "start_offset": 0,
      "token": "baz"
    }
  ]
}
Now when I try the same call with the multi-word synonym text the API only returns one token of type 'word', no synonyms:
curl 'localhost:9200/test/_analyze?analyzer=my_synonyms&text=foo+bar'
(returns)
{
  "tokens": [
    {
      "position": 1,
      "type": "word",
      "end_offset": 7,
      "start_offset": 0,
      "token": "foo bar"
    }
  ]
}
Why isn't the analyze API returning both "foo bar" AND "fooo bar" tokens with type SYNONYM?
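The synonym filter parses its own rules with a tokenizer, which defaults to whitespace. The rule "foo bar, fooo bar" is therefore stored as two-token sequences, which can never match the single token "foo bar" emitted by the keyword tokenizer. To fix this, the "tokenizer": "keyword" setting ALSO needs to be added to the my_syn_filt filter declaration, as follows: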
curl -XPOST "http://localhost:9200/test" -d'
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "my_syn_filt": {
            "tokenizer": "keyword",
            "type": "synonym",
            "synonyms": [
              "foo bar, fooo bar",
              "bazzz, baz"
            ]
          }
        },
        "analyzer": {
          "my_synonyms": {
            "filter": [
              "lowercase",
              "my_syn_filt"
            ],
            "tokenizer": "keyword"
          }
        }
      }
    }
  }
}'
With the above settings, the _analyze API returns the desired SYNONYM tokens for the text "foo bar":
{
  "tokens": [
    {
      "position": 1,
      "type": "SYNONYM",
      "end_offset": 7,
      "start_offset": 0,
      "token": "foo bar"
    },
    {
      "position": 1,
      "type": "SYNONYM",
      "end_offset": 7,
      "start_offset": 0,
      "token": "fooo bar"
    }
  ]
}
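To illustrate why the tokenizer on the filter matters, here is a small Python model of the behavior. It is only a sketch, not Elasticsearch code: it assumes the synonym filter tokenizes its rules on whitespace unless told otherwise, and the function names (whitespace_tokenize, keyword_tokenize, parse_rules, expand) are hypothetical helpers made up for this illustration.

```python
RULES = ["foo bar, fooo bar", "bazzz, baz"]


def whitespace_tokenize(text):
    # Assumed default for synonym rules: split on whitespace.
    return text.split()


def keyword_tokenize(text):
    # The keyword tokenizer emits the whole input as one token.
    return [text]


def parse_rules(rules, rule_tokenizer):
    # Map each phrase (as a tuple of tokens) to every phrase in its rule.
    table = {}
    for rule in rules:
        phrases = [tuple(rule_tokenizer(p.strip())) for p in rule.split(",")]
        for phrase in phrases:
            table[phrase] = phrases
    return table


def expand(text, rules, rule_tokenizer):
    # Analyzer side: keyword tokenizer followed by lowercase.
    tokens = [t.lower() for t in keyword_tokenize(text)]
    table = parse_rules(rules, rule_tokenizer)
    out = []
    for tok in tokens:
        # Simplification: the analyzer emits exactly one token, so a
        # single-token lookup is enough for this model.
        key = (tok,)
        if key in table:
            out.extend(" ".join(p) for p in table[key])
        else:
            out.append(tok)
    return out


print(expand("foo bar", RULES, whitespace_tokenize))
print(expand("foo bar", RULES, keyword_tokenize))
```

In this model, rules parsed with whitespace store "foo bar" as the two-token key ("foo", "bar"), so the single analyzer token "foo bar" finds no match and passes through unchanged. Parsing the rules with the keyword tokenizer stores the key ("foo bar",), which does match. Single-word rules such as "bazzz, baz" produce the same key either way, which is why they worked from the start.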