In Solr you can perform an ordered proximity search using syntax
"word1 word2"~10
By ordered, I mean word1 will always come before word2 in the document. I would like to know if there is an easy way to perform an unordered proximity search, i.e., word1 and word2 occur within 10 words of each other and it doesn't matter which comes first.
One way to do this would be:
"word1 word2"~10 OR "word2 word1"~10
The above will work but I'm looking for something simpler, if possible.
Slop is the number of position moves allowed when lining the query terms up with the document. So "a b" is going to behave differently from "b a", because each ordering needs a different number of moves to match the same text.
a foo b
has positions (a,1), (foo,2), (b,3). To match (a,1), (b,2) will require one change: (b,2) => (b,3).
In general, if "a b"~n matches something, then "b a"~(n+2) will match it too.
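The n versus n+2 relationship can be checked with a small model. This is only a sketch, not Lucene's actual scorer: the function name min_slop is made up for illustration, and it assumes every query term occurs exactly once in the document. It shifts each term's document position back by its offset in the query phrase and takes the spread of the shifted positions as the slop needed.

```python
def min_slop(query_terms, doc_tokens):
    """Smallest slop at which the phrase would match, under a simplified model.

    Assumption: each query term occurs exactly once in doc_tokens.
    """
    # Shift every document position back by the term's offset in the phrase.
    shifted = [doc_tokens.index(term) - offset
               for offset, term in enumerate(query_terms)]
    # An exact phrase match gives identical shifted positions (spread 0).
    return max(shifted) - min(shifted)


doc = ["a", "foo", "b"]
print(min_slop(["a", "b"], doc))  # in-order query
print(min_slop(["b", "a"], doc))  # reversed query
```

For the text a foo b this gives 1 for "a b" and 3 for "b a", which agrees with the n and n+2 rule above.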
EDIT: I guess I never gave an answer. I see two options:
I think #2 is probably better, unless your slop is very large to begin with.
Are you sure it doesn't already work like that? There is nothing in the documentation saying that it's 'ordered':
A proximity search can be done with a sloppy phrase query. The closer together the two terms appear in the document, the higher the score will be. A sloppy phrase query specifies a maximum "slop", or the number of positions tokens need to be moved to get a match.
This example for the standard request handler will find all documents where "batman" occurs within 100 words of "movie":
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_search_for_one_term_near_another_term_.28say.2C_.22batman.22_and_.22movie.22.29