Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why am I seeing different index behaviour between 2 seemingly identical CosmosDb Collections

I'm trying to debug a very strange discrepency between 2 seperate cosmos db collection that on face value are configured the same.

We recently modified some code that executed the following query.

OLD QUERY

SELECT * FROM c 
WHERE c.ProductId = "CODE" 
AND c.PartitionKey = "Manufacturer-GUID"

NEW QUERY

SELECT * FROM c
WHERE (c.ProductId = "CODE" OR ARRAY_CONTAINS(c.ProductIdentifiers, "CODE")) 
AND c.PartitionKey = "Manufacturer-GUID"

The introduction of that Array_Contains call in the production environment has tanked the performance of this query from ~3 RU/s ==> ~6000 RU/s. But only in the production environment.

And the cause seems to be that in Production, it's not hitting an index. See output below for the two environments.

DEV CONFIGURATION

Collection Scale: 2000 RU/s

Index Policy: (Notice the Exclude path for ETags)

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {
                    "kind": "Range",
                    "dataType": "Number",
                    "precision": -1
                },
                {
                    "kind": "Range",
                    "dataType": "String",
                    "precision": -1
                },
                {
                    "kind": "Spatial",
                    "dataType": "Point"
                }
            ]
        }
    ],
    "excludedPaths": [
        {
            "path": "/\"_etag\"/?"
        }
    ]
}

PROD CONFIGURATION

Collection Scale: 10,000 RU/s

Index Policy: (Notice the absense of an Exclude path for ETags compared to DEV)

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {
                    "kind": "Range",
                    "dataType": "Number",
                    "precision": -1
                },
                {
                    "kind": "Range",
                    "dataType": "String",
                    "precision": -1
                },
                {
                    "kind": "Spatial",
                    "dataType": "Point"
                }
            ]
        }
    ],
    "excludedPaths": []
}

When comparing the Output results from both environments, DEV is showing an index hit, while PROD is showing an index miss, despite no noticeable difference between the index policies.

Results in DEV

Request Charge:           3.490 RUs
Showing Results:          1 - 1
Retrieved document count: 1
Retrieved document size:  3118 bytes
Output document count:    1
Output document size:     3167 bytes
Index hit document count: 1

Results in PROD

Request Charge:           6544.870 RUs
Showing Results:          1 - 1
Retrieved document count: 124199
Retrieved document size:  226072871 bytes
Output document count:    1
Output document size:     3167 bytes
Index hit document count: 0

The only thing I've been able to find online is a reference in the documents to some change that occurred within Cosmos Collection stating that a "New Index Layout" is being used for newer collections, but there's no other mention of Index Layouts as a concept that I can find anywhere in the docs.

https://docs.microsoft.com/en-us/azure/cosmos-db/index-types#index-kind

Anyone got any idea where I can go from here in terms of debugging/addressing this issue.

like image 585
Eoin Campbell Avatar asked Feb 28 '19 11:02

Eoin Campbell


People also ask

Which two types of index are available in Cosmos?

Indexing mode Azure Cosmos DB supports two indexing modes: Consistent: The index is updated synchronously as you create, update or delete items. This means that the consistency of your read queries will be the consistency configured for the account. None: Indexing is disabled on the container.

What is CosmosDB good for?

Azure Cosmos DB is commonly used within web and mobile applications, and is well suited for modeling social interactions, integrating with third-party services, and for building rich personalized experiences. The Cosmos DB SDKs can be used build rich iOS and Android applications using the popular Xamarin framework.

Is CosmosDB any good?

Cosmos DB is a very good modern platform for cloud native and mobile applications. I work with different teams which are using Cosmos DB. I found that the service is excellent in some areas like scalability, DR, availability.


1 Answers

Your Dev container is newer and using our v2 Index which has significant improvements including on Array_Contains(). To learn more on how to upgrade your PROD container please email us at askcosmosdb at microsoft dot com.

Thanks.

like image 145
Mark Brown Avatar answered Nov 28 '22 08:11

Mark Brown