I got a 16GB collection with 2 partitions. When I query a document by it's Id, it is very slow. But querying by an indexed field is fast. Both are cross-partition query and if I pass partition key with the query it is fast but partition key is not always available for my query. Got similar results using .NET SDK and Document Explorer Query in Azure Portal.
The collection has custom indexing policy but as far as I know you don't need to index Id
or it may not be even possible.
Here are my queries and their corresponding request charges.
SELECT * FROM c where c.id = 'unique-id-123'
-- Request Charge: 344940.79 RUs, Document Count: 1
SELECT * FROM c WHERE c.otherId = 'NOT-so-uniqueId-123'
-- Request Charge: 5.08 RUs, Document Count: 3
As you know, Id is unique so the query returns 1 document while the second query is filtered by otherId
which is not so unique and returns 3 documents. Also notice the insanely high RUs consumption with the first query.
So why the 2nd query is faster than by Id?
Update:
Here are the gathered metrics for the above queries.
Query By Id:
Read 1 records in 1497 ms, 339173.109 RU, Size: 6873022 KB
QueryPreparationTime(ms): CompileTime = 2, LogicalBuildTime = 0,
PhysicalPlanBuildTime = 0, OptimizationTime = 0
QueryEngineTime(ms): DocumentLoadTime = 1126, IndexLookupTime = 0,
RuntimeExecutionTimes = 356, WriteOutputTime = 0
Query by indexed field:
Read 4 records in 2 ms, 7.56 RU, Size: 9 KB
QueryPreparationTime(ms): CompileTime = 0, LogicalBuildTime = 0,
PhysicalPlanBuildTime = 0, OptimizationTime = 0
QueryEngineTime(ms): DocumentLoadTime = 0, IndexLookupTime = 1,
RuntimeExecutionTimes = 0, WriteOutputTime = 0
These proves that query by Id is doing table scan as most of the time spent was from DocumentLoadTime
and no value for IndexLookupTime
.
But I thought Id should be the primary key and indexed by default as per this answer by @andrew-liu.
Microsoft support responded and they've resolved the issue. They've added IndexVersion
2 for the collection. Unfortunately, it is not yet available from the portal and newly created accounts/collection are still not using the new version. You'll have to contact Microsoft Support to made changes to your accounts.
Here are the new results from a collection with index version 2 and there's a massive improvement.
SELECT * FROM c where c.id = 'uniqueValue'
-- Index Version 1: Request Charge: 344,940.79 RUs
-- Index Version 2: Request Charge: 3.31 RUs
SELECT * FROM c WHERE c.indexedField = 'value' AND c.id = 'uniqueValue'
-- Index Version 1: Request Charge: 150,666.22 RUs
-- Index Version 2: Request Charge: 5.65 RUs
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With