I have regularly performed LINQ predicates when querying CosmosDB specific documents. However, today I filled up my CosmosDB with 100 k+ documents. The performance was very slow. As the SQL Query in the Azure Portal was noticeably faster, I tried using SqlQuerySpec. Voilá! It worked so much faster.
Can anyone tell me the what is happening under the hood, when using Linq predicates with CosmosDB and why it slows my queries down?
The below code is used in my method for getting the document. NB: The id is the partition key in this case.
var collectionUri = UriFactory.CreateDocumentCollectionUri(CDBdatabase, CDBcollection);
var sqlStatement = new SqlQuerySpec
{
QueryText = "SELECT * FROM c where c.id = @id",
Parameters = new SqlParameterCollection()
{
new SqlParameter("@id", consumerId),
},
};
IDocumentQuery<T> query = documentClient.CreateDocumentQuery<T>(
collectionUri,
sqlStatement,
.AsDocumentQuery();
List<ConsumerDetails> results = new List<ConsumerDetails>();
while (query.HasMoreResults)
{
results.AddRange(await query.ExecuteNextAsync<ConsumerDetails>());
}
return results.FirstOrDefault();
Versus, the slower code:
return documentClient.CreateDocumentQuery<ConsumerDetails>(
collectionUri,
.Where(f => f.Id == consumerId).AsEnumerable().FirstOrDefault();
The answer lies on the way you query.
The two code piece that you posted, are not the same.
For them to be the same the second one would have to look like this:
var collectionUri = UriFactory.CreateDocumentCollectionUri(CDBdatabase, CDBcollection);
var query = documentClient.CreateDocumentQuery<ConsumerDetails>(
collectionUri)
.Where(f => f.Id == consumerId)
.AsDocumentQuery();
List<ConsumerDetails> results = new List<ConsumerDetails>();
while (query.HasMoreResults)
{
results.AddRange(await query.ExecuteNextAsync<ConsumerDetails>());
}
return results.FirstOrDefault();
In both cases it is SQL that you will execute against CosmosDB. However in the LINQ case, a LINQ translator will kick in to convert your expression to a SQL query.
Also in the SQL example, because you are pointing directly to the lowercase id
(which is the CosmosDB id
) which is also the partition key, CosmosDB will recognise that and limit the query from a cross partition one to a partition specific one, making it faster and cheaper. The .Where(f => f.Id == consumerId)
(with uppercase Id
), once it goes through the LINQ provider, will be translated to SELECT * FROM c where c.Id = consumerId
unless if there is a JsonAttribute("id")
decorating the Id
property. This means that you need to provide the PartitionKey
value in the FeedOptions
for the query.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With