I am having a hard time finding details about what the output of xdmp:plan means.
Having a simple query like this:
xdmp:plan(cts:search(doc(), cts:element-value-query(xs:QName("description"), "some text")))
results in quite a long execution plan:
<qry:query-plan xmlns:qry="http://marklogic.com/cts/query">
<qry:expr-trace>...</qry:expr-trace>
...
<qry:partial-plan>
<qry:term-query weight="1">
<qry:key>16037778974159125508</qry:key>
<qry:annotation>element(description,value("some","text"))</qry:annotation>
</qry:term-query>
</qry:partial-plan>
...
<qry:ordering></qry:ordering>
<qry:final-plan>
<qry:and-query>
<qry:term-query weight="1">
<qry:key>16037778974159125508</qry:key>
<qry:annotation>element(description,value("some","text"))</qry:annotation>
</qry:term-query>
</qry:and-query>
</qry:final-plan>
<qry:info-trace>Selected 0 fragments to filter</qry:info-trace>
<qry:result estimate="0"></qry:result>
</qry:query-plan>
The only part of the documentation mentioning xdmp:plan is the function's own reference page. Other than that I cannot find anything else. I'd like some details about what e.g. qry:key or qry:annotation really mean.
Is there any documentation I am missing that describes the possible output of xdmp:plan? As this is a really valuable tool for understanding query performance, I expected it to be rather well documented.
Edit: This MarkLogic blog post I found gives some examples of how a query plan can be interpreted.
Still, I feel like a blog post should not be the only reasonable documentation for this tool.
Some questions still on my mind:
What is the difference between a partial-plan and a final-plan? Is a final-plan a merge of all partial-plans? For what and when is a partial-plan used? Partial-plans seem to contribute constraints. Are these constraints used at the index resolution stage to find candidate fragment ids? What role does a final-plan play there? Is a final-plan used to filter out false positives after index resolution?
Sometimes I can find this in the query plan:
<qry:elem-word-trace text="computer" elem-name="title" elem-uri="">
<qry:key>10975994818398622042</qry:key>
</qry:elem-word-trace>
What does qry:elem-word-trace mean?
What does <qry:ordering></qry:ordering> mean? (Edit: I added a simple description of ordering to my answer.)
The plan for /doc[id = 1] outputs the following twice. Is there a reason for that? Why does step 2 predicate 1 contribute the same partial-plan twice?
<qry:info-trace>Step 2 predicate 1 contributed 1 constraint: id = 1</qry:info-trace>
<qry:partial-plan xmlns:qry="...">...</qry:partial-plan>
<qry:info-trace>Step 2 predicate 1 contributed 1 constraint: id = 1</qry:info-trace>
<qry:partial-plan xmlns:qry="...">...</qry:partial-plan>
After some more searching and reading I decided to summarize my findings.
Note: If you are not using fragmentation, every use of "fragment" below can be read as "document".
A partial-plan just shows the incremental pieces of the plan as they come in and seems to be mostly informational.
The final-plan, on the other hand, is the request as it is sent to the index, and is therefore usually the interesting part.
The documentation of query-trace gives some insight into what the info-trace messages mean:
A filtered query produces an info-trace describing how many candidate fragment references were returned from the index resolution stage of query processing:
xdmp:plan(cts:search(doc(), cts:element-word-query(xs:QName("title"), "computer")))
=> ...
<qry:info-trace>Selected 2 fragments to filter</qry:info-trace>
An unfiltered query logs the same message but without the "to filter", indicating that the second step, filtering, is not executed:
xdmp:plan(cts:search(doc(), cts:element-word-query(xs:QName("title"), "computer"), ("unfiltered")))
=> ...
<qry:info-trace>Selected 2 fragments</qry:info-trace>
<qry:result estimate="2"></qry:result>
The estimate attribute of qry:result shows how many fragments match the query using the index information alone. It is an estimate made before the filtering step, so it might include false positives. I think the value of estimate and the number in the info-trace message described above are always the same.
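To check that last assumption on your own plans, the two numbers can be pulled out of a saved plan and compared. This is only a minimal Python sketch, assuming the output of xdmp:plan was serialized to a string; the helper name selected_and_estimate is made up for illustration, and the sample is the unfiltered plan from above, trimmed to the two relevant elements.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by all elements in xdmp:plan output.
QRY = {"qry": "http://marklogic.com/cts/query"}

# Trimmed sample taken from the unfiltered plan shown above.
PLAN = """
<qry:query-plan xmlns:qry="http://marklogic.com/cts/query">
  <qry:info-trace>Selected 2 fragments</qry:info-trace>
  <qry:result estimate="2"></qry:result>
</qry:query-plan>
"""


def selected_and_estimate(plan_xml):
    """Return (selected, estimate) from a serialized query plan.

    Either value is None when the plan does not contain it.
    """
    root = ET.fromstring(plan_xml)

    # Matches both "Selected N fragments" and "Selected N fragments to filter".
    selected = None
    for trace in root.iterfind(".//qry:info-trace", QRY):
        match = re.match(r"Selected (\d+) fragments", trace.text or "")
        if match:
            selected = int(match.group(1))

    estimate = None
    result = root.find(".//qry:result", QRY)
    if result is not None and result.get("estimate") is not None:
        estimate = int(result.get("estimate"))

    return selected, estimate


print(selected_and_estimate(PLAN))  # (2, 2)
```

Running this over a handful of real plans is a cheap way to see whether the two values ever diverge in your database.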
An element-word-query with only the word searches option enabled (fast element word searches disabled) returns this final-plan:
xdmp:plan(cts:search(doc(), cts:element-word-query(xs:QName("title"), "computer")))
=> ...
<qry:final-plan>
<qry:and-query>
<qry:term-query weight="1">
<qry:key>13967911917401594192</qry:key>
<qry:annotation>word("computer")</qry:annotation>
</qry:term-query>
<qry:term-query weight="0">
<qry:key>745773915438417736</qry:key>
<qry:annotation>element(title)</qry:annotation>
</qry:term-query>
</qry:and-query>
</qry:final-plan>
Two separate term queries, one for word("computer") and one for element(title), mean that index resolution also matches fragments that have a title element and contain the word "computer" anywhere, including outside of title. So an unfiltered search could return false positives.
An element-word-query with both word searches and fast element word searches enabled returns this final-plan:
<qry:final-plan>
<qry:and-query>
<qry:term-query weight="1">
<qry:key>10975994818398622042</qry:key>
<qry:annotation>element(title,word("computer"))</qry:annotation>
</qry:term-query>
</qry:and-query>
</qry:final-plan>
Here the annotation indicates a combined lookup for the word "computer" inside the title element. This query could be run unfiltered and still return no false positives in my case.
More detailed information can be found in this blog post.
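When comparing plans across index settings like this, it helps to reduce each final-plan to its list of terms. The following is a rough Python sketch, not anything MarkLogic ships: final_plan_terms is a hypothetical helper, the input is assumed to be a plan saved as a string, and the sample is the "word searches only" final-plan from above with the namespace declaration added so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Namespace used by all elements in xdmp:plan output.
QRY = {"qry": "http://marklogic.com/cts/query"}

# The final-plan from the "word searches only" example above. The xmlns
# declaration is added here because the fragment is parsed standalone.
PLAN = """
<qry:final-plan xmlns:qry="http://marklogic.com/cts/query">
  <qry:and-query>
    <qry:term-query weight="1">
      <qry:key>13967911917401594192</qry:key>
      <qry:annotation>word("computer")</qry:annotation>
    </qry:term-query>
    <qry:term-query weight="0">
      <qry:key>745773915438417736</qry:key>
      <qry:annotation>element(title)</qry:annotation>
    </qry:term-query>
  </qry:and-query>
</qry:final-plan>
"""


def final_plan_terms(plan_xml):
    """Return a list of (weight, key, annotation) tuples from the final-plan."""
    root = ET.fromstring(plan_xml)

    # Accept either a bare final-plan or a full query-plan that contains one.
    if root.tag.endswith("}final-plan"):
        final = root
    else:
        final = root.find(".//qry:final-plan", QRY)
    if final is None:
        return []

    terms = []
    for term in final.iterfind(".//qry:term-query", QRY):
        terms.append((
            term.get("weight"),
            term.findtext("qry:key", namespaces=QRY),
            term.findtext("qry:annotation", namespaces=QRY),
        ))
    return terms


for weight, key, annotation in final_plan_terms(PLAN):
    print(weight, key, annotation)
```

Two terms (word plus element) versus a single element(title,word(...)) term is then visible at a glance, and an annotation of None would make a term without annotation stand out.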
The <qry:ordering> tag indicates how the resulting candidate fragment references are ordered. This can be controlled by passing one of the cts:order constructors to the cts:search function. Example:
xdmp:plan(
cts:search(
doc(),
cts:element-word-query(xs:QName("title"), "computer"),
(cts:unordered())
))
=>....
<qry:ordering>
<qry:unordered></qry:unordered>
</qry:ordering>
I always wondered how to see whether an index is used or not (being used to query execution plans that show something like a full index scan). Ultimately you can tell quite easily:
Search for <qry:info-trace> messages that contain searchable. Those are good: they mean this part of your query can be resolved using an index. If a message contains the word unsearchable, this might be a bad sign.
The messages for xdmp:plan(//image/id[. = "1"]/..) could look like this:
<qry:info-trace>Analyzing path: fn:collection()/descendant::image/id[. = "1"]/..</qry:info-trace>
<qry:info-trace>Step 1 is searchable: fn:collection()</qry:info-trace>
<qry:info-trace>Step 2 is searchable: descendant::image</qry:info-trace>
<qry:info-trace>Step 3 is searchable: id[. = "1"]</qry:info-trace>
<qry:info-trace>Step 4 axis is unsearchable: parent</qry:info-trace>
<qry:info-trace>Step 4 is unsearchable: ..</qry:info-trace>
This means all parts except step 4, the /.., can be resolved by the index. This might not be a bad sign, depending on your query. In this case, though, the query could be modified.
This slightly modified query can use the index for all "steps": xdmp:plan(//image[id = "1"])
<qry:info-trace>Analyzing path: fn:collection()/descendant::image[id = "1"]</qry:info-trace>
<qry:info-trace>Step 1 is searchable: fn:collection()</qry:info-trace>
<qry:info-trace>Step 2 is searchable: descendant::image[id = "1"]</qry:info-trace>
<qry:info-trace>Path is fully searchable.</qry:info-trace>
Further details can be found here.
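For long plans, scanning the info-trace messages by eye gets tedious. Below is a small Python sketch of the check described above, under the assumption that the plan was saved as a string; unsearchable_messages is a name I made up, and the sample reuses the messages from the //image/id[. = "1"]/.. example.

```python
import xml.etree.ElementTree as ET

# Namespace used by all elements in xdmp:plan output.
QRY = {"qry": "http://marklogic.com/cts/query"}

# info-trace messages from the //image/id[. = "1"]/.. example above.
PLAN = """
<qry:query-plan xmlns:qry="http://marklogic.com/cts/query">
  <qry:info-trace>Analyzing path: fn:collection()/descendant::image/id[. = "1"]/..</qry:info-trace>
  <qry:info-trace>Step 1 is searchable: fn:collection()</qry:info-trace>
  <qry:info-trace>Step 2 is searchable: descendant::image</qry:info-trace>
  <qry:info-trace>Step 3 is searchable: id[. = "1"]</qry:info-trace>
  <qry:info-trace>Step 4 axis is unsearchable: parent</qry:info-trace>
  <qry:info-trace>Step 4 is unsearchable: ..</qry:info-trace>
</qry:query-plan>
"""


def unsearchable_messages(plan_xml):
    """Return the info-trace messages that report an unsearchable step.

    "searchable" is a substring of "unsearchable", so the check looks for
    the longer word only.
    """
    root = ET.fromstring(plan_xml)
    return [
        trace.text
        for trace in root.iterfind(".//qry:info-trace", QRY)
        if trace.text and "unsearchable" in trace.text
    ]


for message in unsearchable_messages(PLAN):
    print(message)
```

An empty list does not prove the query is fast, but a non-empty one points directly at the steps worth rewriting.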
If someone finds more information on how to interpret and work with xdmp:plan output, I'd be happy to know about it.
Update 17.11.2018:
Found this really interesting video where Mary Holstege talks about MarkLogic Search and Indexes. It covers a whole lot of my questions and I can really recommend it.
I would also add, if you see a term in the final plan with no annotation, that is a bug, and you should report it.