I'm using the query below to get distinct values from XML files stored in a collection in MarkLogic. The collection contains more than 40k files.
The query takes a long time to return results. Is there a better way to optimize it, or another way to get these values without XPath?
XQuery:
fn:distinct-values(fn:collection(collectionName)//caseml/case[@jur eq "in"]/@year)
Input XML Example:
<?xml version="1.0" encoding="UTF-8"?>
<caseml>
  <case jur="in" series="mlj" volume="1" year="2016" startpage="129">
    <p num="y" pnum="22">
      <text>
        In view of the aforesaid discussion, we find the writ petition completely devoid
        of any merit and accordingly, we dismiss the same, leaving the parties to bear their
        own costs.
      </text>
    </p>
  </case>
</caseml>
The above XQuery works, but I need to get the results faster.
For fast atomic value retrieval across a large set of documents, you want to configure a range index, which instructs MarkLogic to extract the values at index time and keep them in a memory-resident data structure so they can be accessed without touching the disk. Since you want the values at a specific path, you'll want to configure a path range index. After reindexing, you can use cts:values to retrieve the values. You can optionally pass a cts:query to the call to restrict the results to documents matching some criteria.
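As a rough sketch of what that could look like, assuming a path range index has already been configured on /caseml/case/@year with scalar type int, and that the database has finished reindexing. The collection name "my-cases" is a placeholder, and the choice of int for the index type is an assumption (a string index would work too, with a matching type option):

```xquery
xquery version "1.0-ml";

(: Assumption: a path range index exists on /caseml/case/@year
   with scalar type int. "my-cases" is a placeholder collection name. :)
cts:values(
  cts:path-reference("/caseml/case/@year", "type=int"),
  (),  (: no start value: begin at the first value in the index :)
  (),  (: no options: default ordering :)
  cts:and-query((
    (: only look at documents in the collection :)
    cts:collection-query("my-cases"),
    (: only documents containing a case with jur="in" :)
    cts:element-attribute-value-query(
      xs:QName("case"), xs:QName("jur"), "in")
  ))
)
```

The values come back already distinct, so fn:distinct-values is not needed. Note that the cts:query selects matching documents rather than individual case elements, so if a single document could hold cases with different jur values, the years of all its cases would be returned. With one case per document, as in the sample, that should not matter.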