Fetching all the document URIs in MarkLogic using the Java Client API

Tags: java, marklogic

I am trying to fetch all the documents from a database without knowing their exact URIs. I found this query:

DocumentPage documents = docMgr.read();
while (documents.hasNext()) {
    DocumentRecord document = documents.next();
    System.out.println(document.getUri());
}

But I do not have specific URIs to pass in; I want all the documents.

Ankita Bhowmik asked Mar 10 '26 00:03


1 Answer

The first step is to enable the URI lexicon on the database.

You could then eval some XQuery that calls cts:uris() (or server-side JavaScript that calls cts.uris()):

    ServerEvaluationCall call = client.newServerEval()
        .xquery("cts:uris()");
    for ( EvalResult result : call.eval() ) {
        String uri = result.getString();
        System.out.println(uri);
    }

Two drawbacks are: (1) you'd need a user with the privileges required to eval server-side code, and (2) there is no pagination, so every URI comes back in a single result sequence.

If you have a small number of documents, you don't need pagination. For a large number of documents, though, pagination is recommended so that you never hold the full URI list in one response. Here's some code using the search API with pagination:

    // do the next eight lines just once
    String options =
        "<options xmlns='http://marklogic.com/appservices/search'>" +
        "  <values name='uris'>" +
        "    <uri/>" +
        "  </values>" +
        "</options>";
    QueryOptionsManager optionsMgr = client.newServerConfigManager().newQueryOptionsManager();
    optionsMgr.writeOptions("uriOptions", new StringHandle(options));

    // run the following each time you need to list all uris
    QueryManager queryMgr = client.newQueryManager();
    long pageLength = 10000;
    queryMgr.setPageLength(pageLength);
    ValuesDefinition query = queryMgr.newValuesDefinition("uris", "uriOptions");
    // the following "and" query just matches all documents
    query.setQueryDefinition(new StructuredQueryBuilder().and());
    int start = 1;
    boolean hasMore = true;
    Transaction transaction = client.openTransaction();
    try {
        while ( hasMore ) {
            CountedDistinctValue[] uriValues =
                queryMgr.values(query, new ValuesHandle(), start, transaction).getValues();
            for (CountedDistinctValue uriValue : uriValues) {
                String uri = uriValue.get("string", String.class);
                //System.out.println(uri);
            }
            start += uriValues.length;
            // this is the last page if uriValues is smaller than pageLength
            hasMore = uriValues.length == pageLength;
        }
    } finally {
        transaction.commit();
    }

The transaction is only necessary if you need a guaranteed "snapshot" list isolated from adds/deletes happening concurrently with this process. Since it adds some overhead, feel free to remove it if you don't need such exactness.
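
To make the paging logic above easier to follow in isolation, here is a minimal sketch of the same loop with the server call swapped out. The class `UriPagingSketch` and the helper `fetchPage` are hypothetical names invented for this illustration; `fetchPage` is only an in-memory stand-in for `queryMgr.values(...)` and is not part of the MarkLogic API. It assumes, as the code above does, that `start` is 1-based.

```java
import java.util.ArrayList;
import java.util.List;

class UriPagingSketch {

    // Hypothetical stand-in for queryMgr.values(...): returns up to pageLength
    // URIs beginning at the 1-based position `start`.
    static List<String> fetchPage(List<String> allUris, long start, long pageLength) {
        int from = (int) Math.min(start - 1, allUris.size());
        int to = (int) Math.min(from + pageLength, allUris.size());
        return new ArrayList<>(allUris.subList(from, to));
    }

    // Same loop shape as the answer: advance `start` by the number of values
    // received, and stop once a page comes back shorter than pageLength.
    static List<String> listAll(List<String> allUris, long pageLength) {
        if (pageLength <= 0) {
            // a zero-length page would always "equal" pageLength and never stop
            throw new IllegalArgumentException("pageLength must be positive");
        }
        List<String> collected = new ArrayList<>();
        long start = 1;
        boolean hasMore = true;
        while (hasMore) {
            List<String> page = fetchPage(allUris, start, pageLength);
            collected.addAll(page);
            start += page.size();
            hasMore = page.size() == pageLength;
        }
        return collected;
    }

    public static void main(String[] args) {
        List<String> uris = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            uris.add("/doc/" + i + ".xml");
        }
        // pages of 10, 10 and 5: the short third page ends the loop
        System.out.println(listAll(uris, 10).size()); // prints 25
    }
}
```

One consequence of this stop condition: when the total number of URIs is an exact multiple of the page length, the loop makes one extra request that returns an empty page before it stops. That is harmless, but worth knowing if you are counting round trips.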

Sam Mefford answered Mar 12 '26 15:03