Given this document:
<items>
<item><type>T1</type><value>V1</value></item>
<item><type>T2</type><value>V2</value></item>
</items>
unsurprisingly, I find that this query will pull back the document in a cts:uris() call:
cts:and-query((
  cts:element-query(xs:QName('item'),
    cts:element-value-query(xs:QName('type'),'T1')
  ),
  cts:element-query(xs:QName('item'),
    cts:element-value-query(xs:QName('value'),'V2')
  )
))
but somewhat surprisingly (to me at least), I also find that this one will too:
cts:element-query(xs:QName('item'),
  cts:and-query((
    cts:element-value-query(xs:QName('type'),'T1'),
    cts:element-value-query(xs:QName('value'),'V2')
  ))
)
This doesn't seem right, as there is no single item with type=T1 and value=V2. To me this seems like a false positive.
Have I misunderstood how cts:element-query works?
(I have to say that the documentation isn't particularly clear in this area).
Or is this a case where MarkLogic strives to give me the result I expect, and had I had more or better indexes in place, I would have been less likely to get a false positive match?
In addition to the answer by @wst, you only need to enable element value positions to get accurate results from an unfiltered search. Here is some code to show this:
xdmp:document-insert("/items.xml", <items>
  <item><type>T1</type><value>V1</value></item>
  <item><type>T2</type><value>V2</value></item>
</items>);
cts:search(collection(),
  cts:element-query(xs:QName('item'),
    cts:and-query((
      cts:element-value-query(xs:QName('type'),'T1'),
      cts:element-value-query(xs:QName('value'),'V2')
    ))
  ), 'unfiltered'
)
Without element value positions enabled, this returns the test document. After enabling the positions, the query returns nothing.
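For completeness, here is a minimal sketch of turning that setting on from XQuery rather than through the Admin UI. It assumes the query runs against the database you want to change (hence xdmp:database() with no argument) and that the user has admin privileges. The database needs to reindex before the unfiltered result changes.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumption: the current database is the one holding /items.xml :)
let $config := admin:get-configuration()
let $config := admin:database-set-element-value-positions(
                 $config, xdmp:database(), fn:true())
return admin:save-configuration($config)
```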
As said by @wst, cts:search() runs filtered by default, whereas cts:uris() (and, for instance, xdmp:estimate()) only run unfiltered.
HTH!
Yes, I think this is a slight misunderstanding of how queries work. In cts:search, the filtered option is enabled by default. In that case ML first resolves the query using only the indexes, and then, once candidate documents have been selected, it loads them into memory, inspects them, and filters out false positives. This is more time-consuming, but more accurate.
cts:uris is a lexicon function, so queries passed to it will only resolve via indexes, and there is no option to filter false positives.
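A rough way to see the difference is to run the same query through each path and compare counts. This is only a sketch: it assumes the sample /items.xml document from the question is loaded and that the URI lexicon is enabled (cts:uris requires it).

```xquery
xquery version "1.0-ml";

let $query :=
  cts:element-query(xs:QName("item"),
    cts:and-query((
      cts:element-value-query(xs:QName("type"), "T1"),
      cts:element-value-query(xs:QName("value"), "V2")
    ))
  )
return (
  (: filtered (the default): candidates are loaded and checked :)
  fn:count(cts:search(fn:collection(), $query, "filtered")),
  (: unfiltered: index resolution only :)
  fn:count(cts:search(fn:collection(), $query, "unfiltered")),
  (: lexicon call: index resolution only, no filtering available :)
  fn:count(cts:uris((), (), $query))
)
```

Going by the behaviour described in this thread, the first count should exclude the sample document, while the other two would include it unless the relevant positions index is enabled.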
The simple way to handle this query via indexes would be to change your schema such that documents are based on <item> instead of <items>. Then each item would have a separate index entry, and results would not be commingled before filtering.
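As an illustration of that schema change, the sketch below splits the sample document into one document per <item>. The /items/N.xml URI scheme is purely hypothetical; pick whatever naming suits your data.

```xquery
xquery version "1.0-ml";

(: Hypothetical URI scheme: /items/1.xml, /items/2.xml, ... :)
for $item at $i in fn:doc("/items.xml")/items/item
return
  xdmp:document-insert(fn:concat("/items/", $i, ".xml"), $item)
```

With one <item> per document, a <type> and a <value> that match in the same document necessarily belong to the same item.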
Another way, which doesn't involve updating documents, is to wrap the queries you expect to match within the same element in a cts:near-query. That would prevent a <type> in one <item> from matching with a <value> in a different <item>. I suggest reading the documentation, because you may need to enable one or more position-based indexes for cts:near-query to be accurate.
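A sketch of what that could look like for the query in the question. The distance of 1 is an assumption that fits this sample, where <type> and <value> are adjacent single-word elements; real data would likely need a different distance, and the result still depends on which position indexes are enabled.

```xquery
xquery version "1.0-ml";

cts:uris((), (),
  cts:element-query(xs:QName("item"),
    (: Assumed distance of 1 word between the two matches :)
    cts:near-query((
      cts:element-value-query(xs:QName("type"), "T1"),
      cts:element-value-query(xs:QName("value"), "V2")
    ), 1)
  )
)
```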