I was using Firebase Realtime Database for my test social network app, in which you can follow people and receive the posts of the people you follow. A traditional social network. I structured my database something like this:
Users
--USER_ID_1
----name
----email
--USER_ID_2
----name
----email
Posts
--POST_ID_1
----image
----userid
----date
--POST_ID_2
----image
----userid
----date
Timeline
--User_ID_1
----POST_ID_2
------date
----POST_ID_1
------date
I also have another node, "Content", which just contained the ids of all of a user's posts. So if "A" followed "B", then all the post ids of B were added to A's Timeline. And if B posted something, it was also added to the Timeline of each of B's followers.
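Now, this was my solution for the Realtime Database, but it clearly has some scalability issues.
These were some of the problems:
If someone has 10,000 followers, then a new post was added to the Timeline of each of those 10,000 followers.
If someone has a large number of posts, then every new follower received all of those posts in their Timeline.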
Now I am thinking of moving this whole thing to Firestore, as it's claimed to be scalable. So how should I structure my database so that the problems I faced in the Realtime Database can be eliminated in Firestore?
I saw your question a bit late, but I will also try to provide the best database structure I can think of. I hope you'll find this answer useful.
I'm thinking of a schema that has three top-level collections: one for users, one for the users that a user is following, and one for posts:
Firestore-root
   |
   --- users (collection)
   |     |
   |     --- uid (documents)
   |          |
   |          --- name: "User Name"
   |          |
   |          --- email: "[email protected]"
   |
   --- following (collection)
   |     |
   |     --- uid (document)
   |          |
   |          --- userFollowing (collection)
   |                |
   |                --- uid (documents)
   |                |
   |                --- uid (documents)
   |
   --- posts (collection)
         |
         --- uid (documents)
              |
              --- userPosts (collection)
                    |
                    --- postId (documents)
                    |     |
                    |     --- title: "Post Title"
                    |     |
                    |     --- date: September 03, 2018 at 6:16:58 PM UTC+3
                    |
                    --- postId (documents)
                          |
                          --- title: "Post Title"
                          |
                          --- date: September 03, 2018 at 6:16:58 PM UTC+3
If someone has 10,000 followers, then a new post was added to the Timeline of each of those 10,000 followers.
That will be no problem at all, because this is what collections are meant for in Firestore. According to the official documentation on modeling a Cloud Firestore database:
Cloud Firestore is optimized for storing large collections of small documents.
This is the reason I have added userFollowing as a collection and not as a simple object/map that can hold other objects. Remember, the maximum size of a document, according to the official documentation regarding limits and quotas, is 1 MiB (1,048,576 bytes). In the case of a collection, there is no limit on the number of documents beneath it. In fact, this is the kind of structure Firestore is optimized for.
So having those 10,000 followers stored in this manner will work perfectly fine. Furthermore, you can query the database in such a way that there is no need to copy anything anywhere.
As you can see, the database is pretty much denormalized, which allows you to query it very simply. Let's take some examples, but first let's create a connection to the database and get the uid of the user using the following lines of code:
FirebaseFirestore rootRef = FirebaseFirestore.getInstance();
String uid = FirebaseAuth.getInstance().getCurrentUser().getUid();
If you want to query the database to get all the users a user is following, you can use a get() call on the following reference:
CollectionReference userFollowingRef = rootRef.collection("following/" + uid + "/userFollowing");
In this way, you can get all the user objects a user is following. Having their uids, you can simply get all their posts.
Let's say you want to get on your timeline the latest three posts of every user. The key to solving this problem when using very large data sets is to load the data in smaller chunks. I have explained in my answer from this post a recommended way in which you can paginate queries by combining query cursors with the limit() method. I also recommend you take a look at this video for a better understanding. So to get the latest three posts of every user, you should consider using this solution. First you need to get the first 15 user objects that you are following and then, based on their uid, get their latest three posts. To get the latest three posts of a single user, please use the following query:
Query query = rootRef.collection("posts/" + uid + "/userPosts").orderBy("date", Query.Direction.DESCENDING).limit(3);
As you scroll down, load another 15 user objects and get their latest three posts, and so on. Besides the date, you can also add other properties to your post object, like the number of likes, comments, shares and so on.
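The chunked loading described above can be sketched without touching Firestore at all. The following is only an in-memory illustration: TimelinePager, Post, loadPage and the Map standing in for the posts collection are hypothetical names, not part of the Firebase SDK. It shows the paging logic (15 followed users per page, three latest posts each), nothing more.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the "following" and "posts" collections (no Firestore calls).
class TimelinePager {
    static final class Post {
        final String authorUid;
        final String title;
        final long date; // e.g. epoch milliseconds

        Post(String authorUid, String title, long date) {
            this.authorUid = authorUid;
            this.title = title;
            this.date = date;
        }
    }

    // Returns the latest postsPerUser posts for each followed user on the given page.
    static List<Post> loadPage(List<String> followedUids,
                               Map<String, List<Post>> postsByUser,
                               int pageIndex, int usersPerPage, int postsPerUser) {
        int from = Math.min(pageIndex * usersPerPage, followedUids.size());
        int to = Math.min(from + usersPerPage, followedUids.size());
        List<Post> page = new ArrayList<>();
        for (String uid : followedUids.subList(from, to)) {
            postsByUser.getOrDefault(uid, List.of()).stream()
                    // mirrors orderBy("date", DESCENDING)
                    .sorted(Comparator.comparingLong((Post p) -> p.date).reversed())
                    // mirrors limit(3) when postsPerUser is 3
                    .limit(postsPerUser)
                    .forEach(page::add);
        }
        return page;
    }
}
```

Calling loadPage(uids, posts, 0, 15, 3) would correspond to the first screen, and increasing pageIndex as the user scrolls corresponds to loading the next 15 user objects.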
If someone has a large number of posts, then every new follower received all of those posts in their Timeline.
No way. There is no need to do something like this. I have already explained above why.
Edit May 20, 2019:
Another way to optimize the operation in which the user should see all the recent posts of everyone they follow is to store the posts that the user should see in a document for that user.
So if we take an example, let's say Facebook, you'll need to have a document containing the Facebook feed for each user. However, if there is more data than a single document can hold (1 MiB), you need to put that data in a collection, as explained above.
There are two situations:
Users in your app have a small number of followers.
Users in your app have a large number of followers. If we store all the followers in a single array in a single document in Firestore, it will hit the Firestore limit of 1 MiB per document.
In the first situation, each user keeps a single document that stores the list of followers in a single array. By using arrayUnion() and arrayRemove() it is possible to manage the followers list efficiently. And when you post something to your timeline, you must add the list of followers to the post document.
Then use the query given below to fetch posts:
postCollectionRef.whereArrayContains("followers", userUid).orderBy("date");
In the second situation, you just need to split the user's following document based on the size or count of the followers array. Once the array reaches a fixed size, the next follower's id must be added to the next document. The first document must keep a field "hasNext", which stores a boolean value. When adding a new post, you must duplicate the post document so that each copy contains one of the followers lists that was split earlier. We can then use the same query given above to fetch the documents.
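A minimal sketch of the splitting step, using plain Java collections rather than Firestore documents. FollowerChunker, FollowerDoc and split are hypothetical names, and setting hasNext on every chunk that has a successor (rather than only on the first) is my reading of the description, not something the answer spells out.

```java
import java.util.ArrayList;
import java.util.List;

class FollowerChunker {
    // Stand-in for one followers document: a bounded slice of ids plus the hasNext flag.
    static final class FollowerDoc {
        final List<String> followers;
        final boolean hasNext;

        FollowerDoc(List<String> followers, boolean hasNext) {
            this.followers = followers;
            this.hasNext = hasNext;
        }
    }

    // Splits the full follower list into documents holding at most maxPerDoc ids each.
    static List<FollowerDoc> split(List<String> followerIds, int maxPerDoc) {
        if (maxPerDoc <= 0) {
            throw new IllegalArgumentException("maxPerDoc must be positive");
        }
        List<FollowerDoc> docs = new ArrayList<>();
        for (int start = 0; start < followerIds.size(); start += maxPerDoc) {
            int end = Math.min(start + maxPerDoc, followerIds.size());
            boolean hasNext = end < followerIds.size(); // true when another chunk follows
            docs.add(new FollowerDoc(new ArrayList<>(followerIds.subList(start, end)), hasNext));
        }
        return docs;
    }
}
```

Under this reading, a new post would be written once per FollowerDoc, each copy carrying that chunk's followers array, so the whereArrayContains query above still matches exactly one copy per follower.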
The other answers are going to get very costly if you have any decent amount of activity on your network (e.g. People following 1,000 people, or people making 1,000 posts).
My solution is to add a field to every user document called 'recentPosts', this field will be an array.
Now, whenever a post is made, have a Cloud Function that triggers on onWrite() and updates the poster's recentPosts array on their user document to add info about that post.
So, you might add the following map to the front of the recentPosts array:
{
"postId": xxxxxxxxxxx,
"createdAt": tttttt
}
Limit the recentPosts array to 1,000 objects, deleting the oldest entry when going over limit.
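The array maintenance itself is simple list logic. Here is a rough sketch in plain Java, assuming a hypothetical RecentPosts class; in practice this would run inside the Cloud Function and write back to the user document, which is not shown here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class RecentPosts {
    // Mirrors the map stored in the array: a post id and its creation time.
    static final class Entry {
        final String postId;
        final long createdAt;

        Entry(String postId, long createdAt) {
            this.postId = postId;
            this.createdAt = createdAt;
        }
    }

    private final int limit;
    private final Deque<Entry> entries = new ArrayDeque<>();

    RecentPosts(int limit) {
        this.limit = limit;
    }

    // Adds the newest post at the front and drops the oldest entries beyond the limit.
    void add(String postId, long createdAt) {
        entries.addFirst(new Entry(postId, createdAt));
        while (entries.size() > limit) {
            entries.removeLast();
        }
    }

    int size() {
        return entries.size();
    }

    Entry newest() {
        return entries.peekFirst();
    }

    Entry oldest() {
        return entries.peekLast();
    }
}
```

With new RecentPosts(1000), the 1,001st call to add() silently discards the oldest entry, which matches the limit described above.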
Now, suppose you are following 1,000 users and want to populate your feed... Grab all 1,000 user documents. This will count as 1k reads.
Once you have the 1,000 documents, each document will have an array of recentPosts. Merge all of those arrays on the client into one master array and sort by createdAt.
Now you have potentially up to 1 million post docIDs, all sorted chronologically, for only 1,000 reads. As your user scrolls their feed, simply query those documents by their docID as needed, presumably 10 at a time or so.
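The client-side merge could look roughly like this. FeedMerger, PostRef, merge and pageIds are hypothetical names for illustration; the sketch only covers merging and paging the ids, not fetching the post documents.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class FeedMerger {
    // One entry of a user's recentPosts array.
    static final class PostRef {
        final String postId;
        final long createdAt;

        PostRef(String postId, long createdAt) {
            this.postId = postId;
            this.createdAt = createdAt;
        }
    }

    // Flattens every followed user's recentPosts array and sorts newest first.
    static List<PostRef> merge(List<List<PostRef>> recentPostsPerUser) {
        List<PostRef> all = new ArrayList<>();
        for (List<PostRef> posts : recentPostsPerUser) {
            all.addAll(posts);
        }
        all.sort(Comparator.comparingLong((PostRef p) -> p.createdAt).reversed());
        return all;
    }

    // Returns the post ids for one screen of the feed, e.g. 10 at a time.
    static List<String> pageIds(List<PostRef> merged, int pageIndex, int pageSize) {
        int from = Math.min(pageIndex * pageSize, merged.size());
        int to = Math.min(from + pageSize, merged.size());
        List<String> ids = new ArrayList<>();
        for (PostRef p : merged.subList(from, to)) {
            ids.add(p.postId);
        }
        return ids;
    }
}
```

The ids returned by pageIds(merged, n, 10) are the ones you would then look up as the user scrolls.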
You can now load a feed of X posts from Y followed users for X + Y reads.
So 2,000 posts from 100 followed users would only be 2,100 reads.
So 1,000 posts from 1,000 followed users would only be 2,000 reads.
etc...
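The arithmetic behind those numbers, written out as a tiny helper (FeedReadCost and reads are hypothetical names): one read per followed user's document, plus one read per post actually displayed.

```java
class FeedReadCost {
    // usersFollowed user-document reads to collect recentPosts,
    // plus postsLoaded post-document reads to render the feed.
    static long reads(long postsLoaded, long usersFollowed) {
        return postsLoaded + usersFollowed;
    }

    public static void main(String[] args) {
        System.out.println(reads(2_000, 100));   // 2100
        System.out.println(reads(1_000, 1_000)); // 2000
    }
}
```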
Edit 1) Further optimization. When loading the user documents, you can batch them 10 at a time by using the in query. Normally this would make no difference, because it's still 10 reads even though it's batched. But you can also filter by a field like recentPostsLastUpdatedAt and check that it's greater than your cached value for that user doc; then any user docs that haven't updated their recentPosts array will not get read. This can theoretically save you 10x on base reads.
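To make the idea concrete, here is an in-memory simulation of the two steps: grouping the followed uids into batches of 10, and deciding which user documents in a batch are newer than the cached copy. UserDocRefresher and both method names are hypothetical, and the two maps stand in for the server-side recentPostsLastUpdatedAt values and the client cache; no actual query is issued.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class UserDocRefresher {
    // Splits the followed uids into groups of at most batchSize, one group per "in" query.
    static List<List<String>> batches(List<String> uids, int batchSize) {
        List<List<String>> result = new ArrayList<>();
        for (int start = 0; start < uids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, uids.size());
            result.add(new ArrayList<>(uids.subList(start, end)));
        }
        return result;
    }

    // Keeps only the users whose recentPostsLastUpdatedAt is newer than the cached value.
    // Users missing from the cache are always treated as changed.
    static List<String> changedSince(List<String> batch,
                                     Map<String, Long> serverUpdatedAt,
                                     Map<String, Long> cachedUpdatedAt) {
        List<String> changed = new ArrayList<>();
        for (String uid : batch) {
            long server = serverUpdatedAt.getOrDefault(uid, 0L);
            long cached = cachedUpdatedAt.getOrDefault(uid, Long.MIN_VALUE);
            if (server > cached) {
                changed.add(uid);
            }
        }
        return changed;
    }
}
```

Only the uids returned by changedSince would cost a document read; the rest can be served from the cached recentPosts arrays.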
Edit 2) You can also attach listeners to each userDocument to get new posts as their recentPosts change, without querying every single followed user each time you need to refresh your feed. (Although 1,000+ snapshot listeners could be bad practice; I don't know how they work under the hood.) (Edit 3: Firebase limits a project to only 1k listeners, so Edit 2 wasn't a scalable optimization.)
I've been struggling a bit with the suggested solutions here, mostly due to a technical gap, so I figured out another solution that works for me.
For every user I have a document with all the accounts that they follow, and also a list of all the accounts that follow that user.
When the app starts, I get a hold of the list of accounts that follow this current user, and when a user makes a post, part of the post object is the array of all the users that follow them.
When user B wants to get all the posts of the people they are following, I just add a simple whereArrayContains("followers", currentUser.uid) to the query.
I like this approach because it still allows me to order the results by any other parameters I want.
Based on:
This approach should work for users that have up to approx 37,000 followers.
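The source of that figure is not shown here, but one plausible back-of-the-envelope check is to divide the 1 MiB document limit by the size of one uid. The 28 bytes per uid used below is an assumption on my part, and the estimate ignores every other field in the post document as well as any per-element storage overhead, so the real ceiling would be lower. FollowerArrayEstimate is a hypothetical name.

```java
class FollowerArrayEstimate {
    static final long MAX_DOCUMENT_BYTES = 1_048_576L; // 1 MiB document limit

    // Rough upper bound on how many uids of the given size fit in one document.
    static long maxFollowers(int bytesPerUid) {
        return MAX_DOCUMENT_BYTES / bytesPerUid;
    }

    public static void main(String[] args) {
        // 28 bytes per uid is an assumption, not a figure from the answer.
        System.out.println(maxFollowers(28)); // prints 37449
    }
}
```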