
CosmosDb count distinct elements

Is there a direct function to count distinct elements in a CosmosDb query?

This is the default count:

SELECT value count(c.id) FROM c

And distinct works without count:

SELECT distinct c.id FROM c

But this returns a Bad Request - Syntax Error:

SELECT value count(distinct c.id) FROM c

How would count and distinct work together?

Ovi asked Dec 28 '25 15:12


1 Answer

[Update 19 Nov 2020]

Here is another query that solves the distinct issue and also works for count. Basically, you need to encapsulate the distinct in a subquery and then count its results. We have also tested it with paging, for cases where you want the unique records and not just the count, and it works there as well.

select value count(1) from c join (select distinct value c from p in c.products)

You can also use a WHERE clause inside or outside the brackets, depending on what your condition is based on.
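To make the "distinct first, then count" idea concrete, here is a minimal sketch in plain JavaScript over an in-memory array. It is not a Cosmos DB API and does not run the query above; `countDistinct` and the sample documents are made up for illustration, and it only shows what a distinct count is expected to compute.

```javascript
// Hypothetical helper (not a Cosmos DB API): counts distinct values of
// keyFn(item) in an in-memory array.
function countDistinct(items, keyFn) {
  const seen = new Set();
  for (const item of items) {
    // Serialize the key so that structurally identical keys
    // (same property order) compare as equal
    seen.add(JSON.stringify(keyFn(item)));
  }
  return seen.size;
}

// Made-up sample documents for illustration only
const docs = [
  { id: 'a', products: ['x', 'y'] },
  { id: 'b', products: ['y'] },
  { id: 'a', products: ['z'] },
];

// Step 1 (distinct) happens inside the Set, step 2 (count) is its size
const distinctIds = countDistinct(docs, (doc) => doc.id);
const distinctProducts = countDistinct(
  docs.flatMap((doc) => doc.products),
  (product) => product
);
```

The same two-step shape is what the query expresses: the inner part removes duplicates, and the outer part counts what is left.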

This is also mentioned slightly differently in another answer here.

Check the select clause documentation for CosmosDB.

@ssmsexe brought this to my attention and I wanted to update the answer here.

[Original Answer]

Support for distinct was added on 19 Oct 2018.

The following query works just fine:

SELECT distinct value c FROM c join p in c.products

However, it still doesn't work for count.

The workaround for counting distinct values is to create a stored procedure that performs the distinct count. It runs the query, follows the continuation token until there are no more results, and returns the total count.

If you pass a distinct query like the one above to the stored procedure below, you will get a distinct count:

// Stored procedure: runs the given query, follows continuation tokens,
// and returns the total number of results in the response body.
function count(queryCommand) {
  var response = getContext().getResponse();
  var collection = getContext().getCollection();
  var count = 0;

  query(queryCommand);

  function query(queryCommand, continuation) {
    var requestOptions = { continuation: continuation };
    var isAccepted = collection.queryDocuments(
      collection.getSelfLink(),
      queryCommand,
      requestOptions,
      function (err, feed, responseOptions) {
        if (err) {
          throw err;
        }

        // Add the size of this page of results to the running total
        if (feed) {
          count += feed.length;
        }

        if (responseOptions.continuation) {
          // More results remain: query the next page
          query(queryCommand, responseOptions.continuation);
        } else {
          // No more pages: return the count in the response
          response.setBody(count);
        }
      });

    // queryDocuments returns false when the request could not be queued
    if (!isAccepted) throw new Error('The query was not accepted by the server.');
  }
}

The issue with that workaround is that it can exceed the RU limit on your collection and fail. If that happens, you can implement similar paging logic in your own server-side application code, which is not ideal.
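As a rough sketch of what that application-side fallback could look like, the loop below counts results page by page. `countAcrossPages`, `fetchPage`, and `makeFakeFetcher` are hypothetical names, not SDK functions: `fetchPage(continuation)` is assumed to run the distinct query for one page and resolve to `{ items, continuation }`. It also assumes each distinct result is returned on exactly one page.

```javascript
// Hypothetical helper: counts results across pages in application code.
// `fetchPage(continuation)` is an assumed callback (not an SDK function) that
// resolves to { items: [...], continuation: string | undefined }.
async function countAcrossPages(fetchPage) {
  let count = 0;
  let continuation; // undefined requests the first page
  do {
    const page = await fetchPage(continuation);
    count += (page.items || []).length;
    continuation = page.continuation;
  } while (continuation);
  return count;
}

// Stand-in for a real paged query, used only to demonstrate the loop.
// The continuation token here is simply the index of the next page.
function makeFakeFetcher(pages) {
  return async (continuation) => {
    const index = continuation ? Number(continuation) : 0;
    const next = index + 1 < pages.length ? String(index + 1) : undefined;
    return { items: pages[index], continuation: next };
  };
}
```

In a real application, `fetchPage` would wrap whatever paged query call your client library provides. Because each page is a separate request, this avoids the single-execution limit of a stored procedure, at the cost of more round trips.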

Aboo answered Dec 31 '25 19:12