I have a subcollection for each doc in the users collection of my app. This subcollection stores docs that are related to the user; however, they could just as well be saved to a master collection, each doc with an associated userId.
I chose this structure as it seemed the most obvious at the time, but I can imagine it will make things harder down the road if I need to do database maintenance. For example, if I wanted to clean up those docs, I would have to query each user and then each user's docs, whereas if I had a master collection I could just query all docs.
That led me to question what the point of subcollections is at all, if you can just associate those docs with an ID. Is it solely there so that you can expand if your doc becomes close to the 1MB limit?
A subcollection is a collection associated with a specific document. Note: You can query across subcollections with the same collection ID by using Collection Group Queries. You can create a subcollection called messages for every room document in your rooms collection.
A DocumentReference refers to a document location in a Firestore database and can be used to write, read, or listen to the location. The document at the referenced location may or may not exist. A DocumentReference can also be used to create a CollectionReference to a subcollection.
You can listen to a document with the onSnapshot() method. An initial call using the callback you provide creates a document snapshot immediately with the current contents of the single document. Then, each time the contents change, another call updates the document snapshot.
In Firestore, you typically have a single collection and multiple documents in that collection. To use Firestore as a key/value store for Workflows, one idea is to use the workflow name as the collection name and use a single document to store all the key/value pairs.
Edit: October, 29th 2021:
To be clear about the following sentence that exists in the docs:
If you don't query based on the field with sequential values.
A timestamp simply cannot be considered consecutive. However, it can still be considered sequential. The same rules apply to alphabetical values (Customer1, Customer2, Customer3, ...), or to pretty much anything that can be treated as a predictably generated value.
Such sequential data in the Firestore indexes is most likely to be written in physical proximity on the storage media, hence that limitation.
That being said, please note that Firestore uses a mechanism to map the documents to their corresponding locations. This means that if the values are not randomly distributed, the write operations will not be distributed correctly over the locations. That's the reason why that limitation exists.
Also note that there is a physical limit on how much data you can write to such a location in a specific amount of time. Predictable keys/values will most likely end up in the same location, which is actually bad, so there are more chances of reaching that limit.
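To make the idea concrete, here is a toy model, not Firestore's actual storage layout. It assumes the index is split into contiguous key ranges, each served by one location. The names `BOUNDARIES`, `location_for` and `writes_per_location` are made up for illustration.

```python
import bisect
import uuid
from collections import Counter

# Hypothetical split points: keys are sorted lexicographically and each
# contiguous range is served by one location (purely illustrative).
BOUNDARIES = ["4", "8", "c"]  # four ranges over the key space


def location_for(key: str) -> int:
    """Return the index of the range that a key falls into."""
    return bisect.bisect_right(BOUNDARIES, key)


def writes_per_location(keys):
    """Count how many writes each location receives."""
    return Counter(location_for(k) for k in keys)


# Sequential keys (Customer0, Customer1, ...) share a prefix, so they all
# sort into the same range.
sequential = [f"Customer{i}" for i in range(1000)]
# Randomly generated keys are spread across the key space.
random_keys = [uuid.uuid4().hex for _ in range(1000)]

print(writes_per_location(sequential))   # one location takes every write
print(writes_per_location(random_keys))  # writes are spread over several locations
```

In this model the predictable keys all hit a single location, while the random ones are shared between locations, which is the intuition behind the limitation.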
Edit: July, 16th 2021:
Since this answer sounds a little old, I will try to add a few more advantages of using subcollections that I found over time:
That's it for the moment; if I find other benefits, I'll update the answer.
Let's take an example for that. Let's assume we have a database schema for a quiz app that looks like this:
Firestore-root
    |
    --- questions (collections)
        |
        --- questionId (document)
                |
                --- questionId: "LongQuestionIdOne"
                |
                --- title: "Question Title"
                |
                --- tags (collections)
                     |
                     --- tagIdOne (document)
                     |     |
                     |     --- tagId: "yR8iLzdBdylFkSzg1k4K"
                     |     |
                     |     --- tagName: "History"
                     |     |
                     |     --- //Other tag properties
                     |
                     --- tagIdTwo (document)
                           |
                           --- tagId: "tUjKPoq2dylFkSzg9cFg"
                           |
                           --- tagName: "Geography"
                           |
                           --- //Other tag properties
Here, tags is a subcollection within the questionId document. Let's now create the tags collection as a top-level collection like this:
Firestore-root
    |
    --- questions (collections)
    |     |
    |     --- questionId (document)
    |            |
    |            --- questionId: "LongQuestionIdOne"
    |            |
    |            --- title: "Question Title"
    |
    --- tags (collections)
          |
          --- tagIdOne (document)
          |     |
          |     --- tagId: "yR8iLzdBdylFkSzg1k4K"
          |     |
          |     --- tagName: "History"
          |     |
          |     --- questionId: "LongQuestionIdOne"
          |     |
          |     --- //Other tag properties
          |
          --- tagIdTwo (document)
                |
                --- tagId: "tUjKPoq2dylFkSzg9cFg"
                |
                --- tagName: "Geography"
                |
                --- questionId: "LongQuestionIdTwo"
                |
                --- //Other tag properties
The differences between these two approaches are:
To get the tags of a particular question, the first schema makes it very easy because only a CollectionReference is needed (questions -> questionId -> tags). To achieve the same thing using the second schema, a Query is needed instead of a CollectionReference, which means that you need to query the entire tags collection to get only the tags that correspond to a single question.
Storing the tags in a top-level collection, as in the second schema, is called database flattening and is quite a common practice when it comes to Firebase, but use this technique only if it is needed. So in your case, if you only need to display the tags of a single question, use the first schema. If you want to display all the tags of all questions, the second schema is recommended.
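As a minimal sketch of that difference, assuming the Python `google-cloud-firestore` client and the collection and field names from the two schemas above: `db` is expected to be a `firestore.Client()` that you create yourself, and the two function names are made up for illustration.

```python
def tags_from_subcollection(db, question_id):
    """First schema: follow the path questions -> questionId -> tags."""
    tags_ref = (
        db.collection("questions")
        .document(question_id)
        .collection("tags")  # a CollectionReference, no filter needed
    )
    return [snapshot.to_dict() for snapshot in tags_ref.stream()]


def tags_from_top_level(db, question_id):
    """Second schema: filter the top-level tags collection by questionId."""
    query = db.collection("tags").where("questionId", "==", question_id)
    return [snapshot.to_dict() for snapshot in query.stream()]
```

With the first schema the path itself selects the tags; with the second, the questionId field stored on each tag does the selecting.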
Is it solely there so that you can expand if your doc becomes close to the 1MB limit?
If you have a subcollection of objects within a document, please note that the size of the subcollection does not count toward that 1 MiB limit. Only the data that is stored in the properties of the document is counted.
Edit Oct 01 2019:
According to @ShahoodulHassan's comment:
So there is no way you can get all the tags of all the questions using the first schema?
Actually, now there is: we can get all tags of all questions by using a Firestore collection group query. One thing to note is that all the subcollections must have the same name, for instance tags.
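A minimal sketch of such a collection group query, assuming the Python `google-cloud-firestore` client, where `db` is a `firestore.Client()` that you create yourself and `all_tags` is a hypothetical helper name:

```python
def all_tags(db):
    """Return every document stored in any collection or subcollection named 'tags'."""
    # collection_group matches on the collection ID, not on the full path,
    # which is why every subcollection has to be called "tags".
    return [snapshot.to_dict() for snapshot in db.collection_group("tags").stream()]
```

Note that this also picks up a top-level collection named tags, if one exists, because matching is done purely on the collection ID.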
The single biggest advantage of sub-collections that I've found is that they have their own rate limit for writes because each sub-collection has its own index (assuming you don't have a collection group index). This probably isn't a concern for small applications but for medium/large scale apps it could be very important.
Imagine a chat application where each chat has a series of messages. You'll want to index messages by timestamp to show them in chronological order. The Firestore write limit for sequential values is 500/second, which is definitely within reach of a medium-sized app (especially if you consider the possibility of a rogue user scripting messages, which is not currently easy to prevent with Security Rules).
// root collection
/messages {
  chatId: string
  timeSent: timestamp // the entire app would be limited to 500/second
}

// sub-collection
/chat/{chatId}/messages {
  timeSent: timestamp // each chat could safely write up to 500/second
}
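The arithmetic behind the two comments can be sketched as follows. This is a back-of-the-envelope model that uses only the 500/second figure quoted above and assumes there is no collection group index on timeSent. The function names and the sample load are made up.

```python
SEQUENTIAL_WRITE_LIMIT = 500  # writes/second on one index of sequential values


def within_limit_root(rates_per_chat):
    """Root collection: all chats share a single timeSent index."""
    return sum(rates_per_chat.values()) <= SEQUENTIAL_WRITE_LIMIT


def within_limit_subcollections(rates_per_chat):
    """Sub-collections: each chat's messages have their own timeSent index."""
    return all(rate <= SEQUENTIAL_WRITE_LIMIT for rate in rates_per_chat.values())


# Hypothetical load: 200 chats, each writing 5 messages per second.
rates = {f"chat{i}": 5 for i in range(200)}

print(within_limit_root(rates))            # False: 1000 writes/second hit one index
print(within_limit_subcollections(rates))  # True: each index sees 5 writes/second
```

The same total load exceeds the limit in the root-collection layout but stays well under it when each chat has its own sub-collection.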