If we have 1000 entries (each with a unique key) entering Cosmos DB per minute, is it safe to use /id as the partition key?
In particular, there is the concept of logical partitions (https://docs.microsoft.com/en-us/azure/cosmos-db/partition-data). The graphic there scares me a little, because it shows logical partitions as actual entities (e.g. "city": "London"). With an 8 hour TTL and 1000 entries per minute, I would have up to 480,000 live items at any time, and I don't necessarily want 480,000 logical partitions that Cosmos DB needs to manage.
What I imagine happens is that the value of the partition key is simply hashed and taken modulo the number of physical partitions. The "Logical Partition Management" section of https://docs.microsoft.com/en-us/azure/cosmos-db/partitioning-overview#choose-partitionkey seems to indicate that this is true. Furthermore, the "Choosing a Partition Key" section suggests (but does not actually state) that /id would be a fantastic partition key: there is no need to worry about the 10GB limit or the throughput limit, there are no hot spots, and the range of values is huge. Since the application doesn't need to filter on anything except the id, cross-partition queries won't be an issue for this use case.
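To make that mental model concrete, here is a minimal Python sketch of what I have in mind. It only illustrates the hash-and-bucket idea and is not Cosmos DB's actual implementation. The hash function (SHA-256), the physical partition count of 10, the `item-N` key format, and the function names are all assumptions made up for this example.

```python
import hashlib
from collections import Counter

ENTRIES_PER_MINUTE = 1000
TTL_HOURS = 8
PHYSICAL_PARTITIONS = 10  # assumed value, purely for illustration


def live_item_count(entries_per_minute: int, ttl_hours: int) -> int:
    """Upper bound on items alive at once: arrival rate times TTL."""
    return entries_per_minute * 60 * ttl_hours


def physical_partition_for(partition_key_value: str, partition_count: int) -> int:
    """Hash the partition key value and bucket it by modulo.

    Hypothetical stand-in for whatever the service really does; the point
    is that routing needs no per-key state, only a hash computation.
    """
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def distribution(keys, partition_count: int) -> Counter:
    """Count how many keys land on each physical partition."""
    return Counter(physical_partition_for(k, partition_count) for k in keys)


live = live_item_count(ENTRIES_PER_MINUTE, TTL_HOURS)

# One distinct /id value per item, so one logical partition per item.
keys = [f"item-{i}" for i in range(live)]
counts = distribution(keys, PHYSICAL_PARTITIONS)

print(f"live items / logical partitions: {live}")
for partition, count in sorted(counts.items()):
    print(f"physical partition {partition}: {count} items")
```

If routing really works along these lines, each logical partition is just a distinct hash input rather than an object that has to be tracked individually, and the keys spread roughly evenly over the physical partitions.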
In summary, do I need to worry about the memory/CPU/etc. overhead of hundreds of thousands of partition key values (logical partitions)? The docs indicate that more distinct partition key values are better, but they don't say whether it's possible to have too many.
Every resource within an Azure Cosmos DB database account needs to have a unique identifier.
Cosmos DB partition key best practices: the item ID is often one of the best choices for the partition key, because it is unique and has a wide range of possible values.
Azure Cosmos DB uses partitioning to scale individual containers in a database to meet the performance needs of your application. In partitioning, the items in a container are divided into distinct subsets called logical partitions.
I am from the Cosmos DB engineering team.
You don't have to worry about the number of logical partition keys that are created on a Cosmos DB collection/container. As long as the partition key is an appropriate choice for your writes (subject to a per-logical partition key cap of 10GB) and queries, you should be good.