I'm trying to use a synonym filter to search for a phrase.
peter=> spider man, spiderman, Mary Jane, .....
I use the default configuration. When I put these synonyms into synonym.txt and restart Solr, it seems to work only partially: it starts to search for "spider", "man", "spiderman", "Mary" and "Jane", but what I want to search for are the meaningful combinations, like "spider man", "Mary Jane" and "spiderman".
Yes, sadly this is a well-known problem, caused by the Solr query parser breaking the query up on whitespace before analyzing it. So instead of seeing "spider" before "man" in the token stream, the analyzer sees each word on its own: just "spider" with nothing before or after it, and just "man" with nothing before or after it.
This is because most Solr query forms treat a space as basically an "OR". The query becomes "spider OR man", instead of Solr looking at the full text, analyzing it to generate synonyms, and then generating a query from that.
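To make the difference concrete, here is a rough Python sketch of the two orderings. It is only a toy model and not Solr code: the `SYNONYMS` table, the function names, and the greedy longest-match lookup are all assumptions made up for illustration.

```python
# Toy model of the behaviour described above; this is not Solr code.
# Hypothetical rule set: a tuple of tokens maps to its equivalent terms.
SYNONYMS = {
    ("spider", "man"): ["spider man", "spiderman"],
    ("mary", "jane"): ["mary jane"],
}
MAX_RULE_LEN = max(len(key) for key in SYNONYMS)


def analyze(tokens):
    """Greedy longest-match synonym lookup over one token stream."""
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_RULE_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in SYNONYMS:
                out.append(SYNONYMS[key])
                i += n
                break
        else:
            # No rule starts at this position: keep the token as-is.
            out.append([tokens[i]])
            i += 1
    return out


def split_then_analyze(query):
    """Split on whitespace first, then analyze each chunk on its own."""
    clauses = []
    for chunk in query.lower().split():
        clauses.extend(analyze([chunk]))
    return clauses


def analyze_whole(query):
    """Analyze the full text as a single token stream."""
    return analyze(query.lower().split())


print(split_then_analyze("spider man"))  # [['spider'], ['man']]
print(analyze_whole("spider man"))       # [['spider man', 'spiderman']]
```

In the first call each chunk is analyzed alone, so the two-word rule never gets a chance to match. In the second call the analyzer sees "spider" followed by "man" and the rule fires.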
For more background, there's this blog post
There's a large number of solutions to this problem, including the following: