Many tiny documents in CosmosDB

Tags:

azure-cosmosdb

I have many (order of 100s) pieces of data that I want to associate with a document in CosmosDB. Each piece of data is small (order of 100s of bytes).

My first solution was to store the data as an array inside the document. This works okay, but in order to append a new item to the array I need to read the document from CosmosDB, add the element, then replace the document back into CosmosDB.

Instead of doing this I would like to store each piece of data as its own document in the same partition. What are the drawbacks of having many tiny documents vs the one aggregated document?


asked May 08 '19 00:05

GoodSky

2 Answers

What are the drawbacks of having many tiny documents vs the one aggregated document?

I suggest storing each piece of data as its own document instead of using one aggregated document.

Reason 1: As you mentioned in your question, to add an element to the array you have to read the document from Cosmos DB, modify it, and then replace the whole document, because Cosmos DB did not support partial updates at the time of writing. (See this feedback item and follow it if you need the feature: https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/6693091-be-able-to-do-partial-updates-on-document) That is a lot of tedious work for every append.

Reason 2: If you store the pieces as separate documents, you can query them flat (SELECT * FROM c).

With a single document that holds an array, you need a JOIN to reach the nested elements (for example, SELECT a FROM c JOIN a IN c.items, where items is the array property).

Reason 3: If you store the pieces as separate documents, you can distribute them across different partitions. Even if you don't need that now, it keeps the option open for the future.

Reason 4: As for cost, it depends on RUs and storage, and every request to Cosmos DB consumes RUs. If you store the pieces separately, you only need to access the specific document you want, which I think is more economical.
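To make the difference in query shape from Reason 2 concrete, here is a minimal Python sketch. It assumes a container object that exposes `query_items` the way the `azure-cosmos` SDK's container client does. The property names `parentId` and `pieces`, and the choice of `/parentId` as the partition key, are assumptions made for illustration and do not come from the question.

```python
def query_flat(container, parent_id):
    """Flat layout: every piece is its own document, so a plain filter is enough."""
    return container.query_items(
        query="SELECT * FROM c WHERE c.parentId = @parentId",
        parameters=[{"name": "@parentId", "value": parent_id}],
        partition_key=parent_id,  # assumes the container is partitioned on /parentId
    )


def query_nested(container, doc_id, partition_key):
    """Aggregated layout: JOIN unrolls the array so each element is one result."""
    return container.query_items(
        query="SELECT VALUE p FROM c JOIN p IN c.pieces WHERE c.id = @id",
        parameters=[{"name": "@id", "value": doc_id}],
        partition_key=partition_key,
    )
```

Both functions return whatever iterable `query_items` returns, so the caller can loop over the results. Passing a partition key keeps each query within a single partition, which matches the plan in the question to keep all pieces in the same partition.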


answered Jan 03 '23 16:01

Jay Gong

Depends on your use case.

For frequent add operations, the aggregated approach means reading the document and then writing it back (2 operations), which costs more than creating a new document (1 operation).
However, if the documents are related to each other (like foreign keys in traditional SQL), fetching the data may require multiple queries when the pieces are stored as separate documents (higher cost), whereas a single aggregated document returns the complete data in one query (lower cost).
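The operation counts for the write path can be checked with a small sketch. `InMemoryContainer` below is a hypothetical stand-in written for this example, not part of any SDK. It only mimics the `create_item`, `read_item` and `replace_item` calls that the `azure-cosmos` container client offers and counts how many requests each approach issues. The `parentId` and `pieces` property names are likewise assumptions. A request count is not the same as an RU charge, so treat the numbers as a rough comparison only.

```python
import copy
import uuid


class InMemoryContainer:
    """Hypothetical stand-in for a Cosmos DB container that counts requests."""

    def __init__(self):
        self.docs = {}     # (partition key value, id) -> document body
        self.requests = 0  # each method call stands for one round trip

    def create_item(self, body):
        self.requests += 1
        self.docs[(body["parentId"], body["id"])] = copy.deepcopy(body)
        return body

    def read_item(self, item, partition_key):
        self.requests += 1
        return copy.deepcopy(self.docs[(partition_key, item)])

    def replace_item(self, item, body):
        self.requests += 1
        self.docs[(body["parentId"], item)] = copy.deepcopy(body)
        return body


def append_to_array(container, doc_id, partition_key, piece):
    """Aggregated layout: read the document, append, write the whole thing back."""
    doc = container.read_item(item=doc_id, partition_key=partition_key)
    doc.setdefault("pieces", []).append(piece)
    return container.replace_item(item=doc_id, body=doc)


def append_as_document(container, parent_id, piece):
    """Flat layout: each piece becomes a small document in the parent's partition."""
    body = {**piece, "id": str(uuid.uuid4()), "parentId": parent_id}
    return container.create_item(body=body)


aggregated = InMemoryContainer()
aggregated.create_item({"id": "doc-1", "parentId": "doc-1", "pieces": []})
aggregated.requests = 0  # ignore the initial setup write
for n in range(3):
    append_to_array(aggregated, "doc-1", "doc-1", {"value": n})

flat = InMemoryContainer()
for n in range(3):
    append_as_document(flat, "doc-1", {"value": n})

print(aggregated.requests, flat.requests)  # prints "6 3"
```

Three appends cost six requests with the aggregated document and three with separate documents, which is the 2-versus-1 ratio described above.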

I'd recommend reading this and this post, which will give you better insight into which approach to choose.

answered Jan 03 '23 17:01

Deepak Agarwal
