I have many (order of 100s) pieces of data that I want to associate with a document in CosmosDB. Each piece of data is small (order of 100s of bytes).
My first solution was to store the data as an array inside the document. This works okay, but in order to append a new item to the array I need to read the document from CosmosDB, add the element, then replace the document back into CosmosDB.
Instead of doing this I would like to store each piece of data as its own document in the same partition. What are the drawbacks of having many tiny documents vs the one aggregated document?
The best way to optimize the RU cost of write operations is to rightsize your items and the number of properties that get indexed. Storing very large items in Azure Cosmos DB results in high RU charges and can be considered as an anti-pattern.
What are the drawbacks of having many tiny documents vs the one aggregated document?
I suggest storing each piece of data as its own document instead of using one aggregated document.
Reason 1: As you mentioned in your question, to add an element to the array you need to read the document from Cosmos DB, modify it, then replace it, because Cosmos DB did not support partial updates at the time of writing. (Please refer to this feedback item and follow it if you need the feature: https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/6693091-be-able-to-do-partial-updates-on-document) That is a lot of tedious work for every append.
Reason 2: If you store the pieces of data as separate documents, you can query them flat. (select * from c)
With a single document that holds an array, you need a join to reach the nested elements. (For example, select a from c join a in c.items, where items is the array property.)
Reason 3: If you store the pieces of data as separate documents, you can distribute them across different partitions. Even if you don't need that now, why not keep the option open for the future?
Reason 4: As for cost, it depends on RUs and storage, and every request to Cosmos DB consumes RUs. If you store the pieces separately, you only need to access the specific document you want, which I think is more economical.
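To make the two data shapes concrete, here is a minimal in-memory sketch in Python. It does not call Cosmos DB. The property names `items`, `parentId`, `type` and `data`, and the `<parent id>-<index>` id scheme, are assumptions chosen for illustration and are not taken from the question.

```python
from typing import Any, Dict, List


def split_aggregate(aggregate: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn one aggregated document into one small document per array element."""
    parent_id = aggregate["id"]
    return [
        {
            "id": f"{parent_id}-{index}",  # hypothetical id scheme
            "parentId": parent_id,         # hypothetical partition key property
            "type": "item",
            "data": element,
        }
        for index, element in enumerate(aggregate["items"])
    ]


def unnest_items(documents: List[Dict[str, Any]]) -> List[Any]:
    """In-memory analogue of a JOIN over the nested array: one row per element."""
    return [element for doc in documents for element in doc.get("items", [])]


# One aggregated document holding every piece of data in an array.
aggregate = {"id": "doc1", "items": [{"value": 1}, {"value": 2}, {"value": 3}]}

# The same data as many tiny documents that share one partition key value.
item_documents = split_aggregate(aggregate)

# Flat access: each stored document already is one piece of data.
flat_values = [doc["data"] for doc in item_documents]

# Nested access: the elements must be unnested from the array first.
joined_values = unnest_items([aggregate])
```

Both shapes hold the same information. The difference is whether the unnesting happens once at write time (separate documents) or in every query (the join).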
Depends on your use case.
For frequent add operations, the aggregated document requires you to read the document and then write it back (2 operations), which costs more than creating a new document (1 operation).
However, if the documents are related to each other (like foreign keys in traditional SQL), fetching the data may require multiple queries if you store them as separate documents (higher cost), whereas with the aggregated document you get the complete data in a single query (lower cost).
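The operation counts for frequent appends can be sketched with an in-memory stand-in. `CountingContainer` below is a hypothetical helper, not the Cosmos DB SDK, and it counts requests rather than RUs. Actual RU charges also depend on document size and indexing, so treat this only as an illustration of the 2-versus-1 argument.

```python
import copy


class CountingContainer:
    """Hypothetical in-memory stand-in for a container that counts requests."""

    def __init__(self):
        self.documents = {}
        self.operations = 0

    def create(self, document):
        self.operations += 1
        self.documents[document["id"]] = copy.deepcopy(document)

    def read(self, document_id):
        self.operations += 1
        return copy.deepcopy(self.documents[document_id])

    def replace(self, document):
        self.operations += 1
        self.documents[document["id"]] = copy.deepcopy(document)


def append_to_aggregate(container, document_id, element):
    """Read-modify-replace: two requests per appended element."""
    document = container.read(document_id)
    document["items"].append(element)
    container.replace(document)


def append_as_document(container, parent_id, item_id, element):
    """One new small document: a single request per appended element."""
    container.create({"id": item_id, "parentId": parent_id, "data": element})


# Aggregated approach: seed the parent document, then append 100 elements.
aggregated = CountingContainer()
aggregated.create({"id": "doc1", "items": []})
aggregated.operations = 0  # count only the appends
for i in range(100):
    append_to_aggregate(aggregated, "doc1", {"value": i})
aggregated_ops = aggregated.operations

# Separate-document approach: 100 creates.
separate = CountingContainer()
for i in range(100):
    append_as_document(separate, "doc1", f"doc1-{i}", {"value": i})
separate_ops = separate.operations

print(aggregated_ops, separate_ops)
```

The aggregated document also grows with every append, so each later read and replace moves more data, while each create in the separate-document approach stays the same size.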
I'd recommend going through this and this post, which will give you better insight into which approach to choose.