According to http://nest.azurewebsites.net/concepts/writing-queries.html, the && and || operators can be used to combine two queries when using the NEST library to communicate with Elasticsearch.
I have the following query set up:
var ssnQuery = Query<NameOnRecordDTO>.Match(
    q => q.OnField(f => f.SocialSecurityNumber)
          .QueryString(nameOnRecord.SocialSecurityNumber)
          .Fuzziness(0)
);
which is then combined with a Bool query as shown below:
var result = client.Search<NameOnRecordDTO>(
    body => body.Query(
        query => query.Bool(
            bq => bq.Should(
                q => q.Match(
                    p => p.OnField(f => f.Name.First)
                          .QueryString(nameOnRecord.Name.First).Fuzziness(fuzziness)
                ),
                q => q.Match(
                    p => p.OnField(f => f.Name.Last)
                          .QueryString(nameOnRecord.Name.Last).Fuzziness(fuzziness)
                )
            ).MinimumNumberShouldMatch(2)
        ) || ssnQuery
    )
);
What I think this query means is that if the SocialSecurityNumber matches, or both the Name.First and Name.Last fields match, then the record should be included in the results.
When I execute this query with the following data for the nameOnRecord object used in the calls to QueryString:
{
  "socialSecurityNumber": "123456789",
  "name": {
    "first": "ryan"
  }
}
the results are the person with SSN 123456789, along with anyone with first name ryan.
If I remove the || ssnQuery from the query above, I get everyone whose first name is 'ryan'.
With the || ssnQuery in place and the following query:
{
  "socialSecurityNumber": "123456789",
  "name": {
    "first": "ryan",
    "last": "smith"
  }
}
I appear to get the person with SSN 123456789 along with people whose first name is 'ryan' or last name is 'smith'.
So it does not appear that adding || ssnQuery is having the effect that I expected, and I don't know why.
Here is the definition of the index for the object in question:
"nameonrecord" : {
  "properties": {
    "name": {
      "properties": {
        "name.first": {
          "type": "string"
        },
        "name.last": {
          "type": "string"
        }
      }
    },
    "address" : {
      "properties": {
        "address.address1": {
          "type": "string",
          "index_analyzer": "address",
          "search_analyzer": "address"
        },
        "address.address2": {
          "type": "string",
          "analyzer": "address"
        },
        "address.city" : {
          "type": "string",
          "analyzer": "standard"
        },
        "address.state" : {
          "type": "string",
          "analyzer": "standard"
        },
        "address.zip" : {
          "type" : "string",
          "analyzer": "standard"
        }
      }
    },
    "otherName": {
      "type": "string"
    },
    "socialSecurityNumber" : {
      "type": "string"
    },
    "contactInfo" : {
      "properties": {
        "contactInfo.phone": {
          "type": "string"
        },
        "contactInfo.email": {
          "type": "string"
        }
      }
    }
  }
}
I don't think the definition of the address analyzer is important, since the address fields are not being used in the query, but I can include it if someone wants to see it.
This was in fact a bug in NEST.
First, some background on how NEST helps translate boolean queries:
NEST allows you to use operator overloading to create verbose bool queries/filters easily, i.e.:
term && term
will result in:
bool
    must
        term
        term
A naive implementation of this would rewrite
term && term && term
to
bool
    must
        term
        bool
            must
                term
                term
As you can imagine, this becomes unwieldy quite fast as a query grows more complex. NEST can spot these nested bool queries and join them together to become
bool
    must
        term
        term
        term
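The joining step described above can be sketched roughly as follows. This is a simplified Python model, not NEST's actual code: the dictionary shapes and the helper names (`term`, `and_naive`, `and_flattening`, `is_must_only`) are assumptions made purely for illustration.

```python
# Illustrative model only: queries are plain dicts, not NEST descriptors.

def term(name):
    """A leaf query (hypothetical stand-in for a term query)."""
    return {"term": name}

def is_must_only(query):
    """True when the query is a bool whose only clause type is must."""
    return set(query.get("bool", {}).keys()) == {"must"}

def and_naive(left, right):
    """Naive &&: always wrap both operands in a new bool/must."""
    return {"bool": {"must": [left, right]}}

def and_flattening(left, right):
    """&& that merges must-only bool operands instead of nesting them."""
    clauses = []
    for query in (left, right):
        if is_must_only(query):
            clauses.extend(query["bool"]["must"])
        else:
            clauses.append(query)
    return {"bool": {"must": clauses}}

a, b, c = term("a"), term("b"), term("c")

nested = and_naive(and_naive(a, b), c)          # bool inside bool
flat = and_flattening(and_flattening(a, b), c)  # one bool, three terms

print(nested)
print(flat)
```

The naive version grows one nesting level per operator, while the flattening version keeps a single bool with all three terms in its must list.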
Likewise term && term && term && !term simply becomes:
bool
    must
        term
        term
        term
    must_not
        term
Now, if in the previous example you pass in a bool query directly, like so:
bool(must=term, term, term) && !term
it would still generate the same query. NEST will also do the same with should clauses when it sees that the bool descriptors in play consist ONLY of should clauses. This is because the bool query does not quite follow the same boolean logic you expect from a programming language.
To summarize the latter:
term || term || term
becomes
bool
    should
        term
        term
        term
but
term1 && (term2 || term3 || term4)
will NOT become
bool
    must
        term1
    should
        term2
        term3
        term4
This is because as soon as a bool query has a must clause, the should clauses start acting as a boosting factor rather than a requirement. So with the previous query you could get back results that ONLY contain term1, which is clearly not what you want in the strict boolean sense of the input.
NEST therefore rewrites this query to
bool
    must
        term1
        bool
            should
                term2
                term3
                term4
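To make the difference concrete, here is a small Python model of how a bool query decides whether a document matches. It is a sketch under stated assumptions, not Elasticsearch or NEST code: documents are modelled as sets of terms, scoring is ignored, and the `matches` helper is hypothetical. It does follow the Elasticsearch rule that should clauses are optional once a must clause is present, and that at least one should clause is required otherwise.

```python
def matches(query, doc):
    """Return True if doc (a set of terms) satisfies the modelled query."""
    if "term" in query:
        return query["term"] in doc
    b = query["bool"]
    must = b.get("must", [])
    should = b.get("should", [])
    must_not = b.get("must_not", [])
    if not all(matches(q, doc) for q in must):
        return False
    if any(matches(q, doc) for q in must_not):
        return False
    # With a must clause present, should clauses only affect scoring.
    default_minimum = 0 if must else (1 if should else 0)
    minimum = b.get("minimum_should_match", default_minimum)
    hits = sum(matches(q, doc) for q in should)
    return hits >= minimum

t1, t2, t3, t4 = ({"term": f"term{i}"} for i in range(1, 5))

# must and should side by side in one bool: should is optional.
side_by_side = {"bool": {"must": [t1], "should": [t2, t3, t4]}}
# The rewrite: the should group becomes a required nested bool.
wrapped = {"bool": {"must": [t1, {"bool": {"should": [t2, t3, t4]}}]}}

only_term1 = {"term1"}
term1_and_term3 = {"term1", "term3"}

print(matches(side_by_side, only_term1))   # should clauses ignored for matching
print(matches(wrapped, only_term1))        # nested bool needs one should hit
print(matches(wrapped, term1_and_term3))
```

In this model, a document containing only term1 matches the side-by-side form but not the wrapped form, which is why the nested bool is needed to preserve the meaning of term1 && (term2 || term3 || term4).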
Now, where the bug came into play is that in your situation you have this:
bool(should=term1, term2, minimum_should_match=2) || term3
NEST identified that both sides of the OR operation contain only should clauses and joined them together, which gives a different meaning to the minimum_should_match parameter of the first bool query.
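The effect of that merge can be illustrated with a small Python model. This is only a sketch: documents are sets of terms, the `matches` helper is hypothetical, and exactly what the merged query did with minimum_should_match is an assumption. Both possibilities are shown, and the "dropped" variant is the one consistent with the results reported in the question.

```python
def matches(query, doc):
    """Return True if doc (a set of terms) satisfies the modelled query."""
    if "term" in query:
        return query["term"] in doc
    b = query["bool"]
    must = b.get("must", [])
    should = b.get("should", [])
    if not all(matches(q, doc) for q in must):
        return False
    if any(matches(q, doc) for q in b.get("must_not", [])):
        return False
    default_minimum = 0 if must else (1 if should else 0)
    minimum = b.get("minimum_should_match", default_minimum)
    return sum(matches(q, doc) for q in should) >= minimum

first = {"term": "first:ryan"}
last = {"term": "last:smith"}
ssn = {"term": "ssn:123456789"}

names = {"bool": {"should": [first, last], "minimum_should_match": 2}}

# Intended meaning: (first AND last) OR ssn, kept as a nested bool.
intended = {"bool": {"should": [names, ssn]}}
# Assumed buggy merges: all should clauses joined into one bool.
merged_dropped = {"bool": {"should": [first, last, ssn]}}
merged_kept = {"bool": {"should": [first, last, ssn],
                        "minimum_should_match": 2}}

only_first = {"first:ryan"}
only_ssn = {"ssn:123456789"}

print(matches(intended, only_first))        # first name alone is not enough
print(matches(merged_dropped, only_first))  # first name alone now matches
print(matches(intended, only_ssn))          # SSN alone is enough
print(matches(merged_kept, only_ssn))       # SSN alone no longer matches
```

Either way the merged query no longer means "both names, or the SSN": dropping the parameter lets a single name match, and keeping it on the merged bool stops the SSN from matching on its own.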
I just pushed a fix for this; it will be included in the next release, 0.11.8.0.
Thanks for catching this one!