
Arrays vs. Maps vs. Subcollections for a set of Objects on Cloud Firestore

I am developing a mobile app in React-Native. Data is stored on Cloud Firestore. I have a question regarding the best way of structuring the data on Cloud Firestore.

The contact list on each user's mobile phone is copied to a document in the db, where it is stored as an "array" on Cloud Firestore.

For example, for user#1, the document collection("users").doc("1") contains the following:

name: "Suzi", 
contacts: [
    {userID:"2", name:"John", appMember: false}, 
    {userID:"3", name:"Cathy", appMember: false} 
]

When one of the contacts (John or Cathy) registers as a user for the app, a Cloud Function needs to check if he/she is a contact for another app user. If he/she is, his/her contact record for that "user" needs to be updated to show that he/she is now an app member.

For example, if John registers as a user for the app, the Cloud Function would update the document above to look like the following.

name: "Suzi", 
contacts: [
    {userID:"2", name:"John", appMember: true},   // <--- appMember changed from false to true
    {userID:"3", name:"Cathy", appMember: false} 
]

I understand that it is NOT possible to access/update John's object {"userID":"2", "name":"John", "appMember": false} directly. The only way of updating this array is (1) getting the whole array, (2) looping through it to find the relevant object, (3) updating that object, and (4) saving the updated array back to the db.
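The four steps above can be sketched as follows. This is only an illustration, not code from the app: `markAsMember` and `flagMemberInArray` are hypothetical names, and `db` is assumed to be a Firestore instance from the firebase-admin SDK. Wrapping the cycle in a transaction is my assumption as well; it keeps a concurrent contact refresh from being overwritten.

```javascript
// Pure helper: returns a new contacts array with the matching entry flagged.
// The input array is not mutated.
function markAsMember(contacts, memberId) {
  return contacts.map((c) =>
    c.userID === memberId ? { ...c, appMember: true } : c
  );
}

// Read-modify-write of the whole array (assumption: `db` is a firebase-admin
// Firestore instance; `ownerId` is the user whose contact list is patched).
async function flagMemberInArray(db, ownerId, memberId) {
  const ref = db.collection("users").doc(ownerId);
  await db.runTransaction(async (tx) => {
    const snap = await tx.get(ref);               // (1) get the whole document
    const contacts = snap.get("contacts") || [];  // the full contacts array
    // (2)-(3) find and update the object, (4) write the whole array back
    tx.update(ref, { contacts: markAsMember(contacts, memberId) });
  });
}
```

Note that the whole array travels in both directions even though only one flag changes.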

Can you think of a better data structure for this data?

I thought of storing the contacts in (1) a "map" of contact objects (instead of an array of contact objects), or (2) a "subcollection" of contacts. I think these 2 data structures are more efficient for updating the object of a specific contact. Consequently, these 2 options would make the Cloud Functions more efficient.
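For comparison, here is a rough sketch of how a single contact might be updated under each of these two options. The function names are hypothetical, `db` is assumed to be a firebase-admin Firestore instance, and the IDs are assumed not to contain "." or "/" (which would break the field path or the document path).

```javascript
// Option 1: contacts stored as a map keyed by contact ID, e.g.
//   contacts: { "2": { name: "John", appMember: false }, ... }
// A dot-separated field path targets one nested field of the map.
function memberUpdateForMap(contactId) {
  return { [`contacts.${contactId}.appMember`]: true };
}

async function flagMemberInMap(db, ownerId, contactId) {
  await db.collection("users").doc(ownerId).update(memberUpdateForMap(contactId));
}

// Option 2: one document per contact, at users/{ownerId}/contacts/{contactId}.
function contactDocPath(ownerId, contactId) {
  return `users/${ownerId}/contacts/${contactId}`;
}

async function flagMemberInSubcollection(db, ownerId, contactId) {
  await db.doc(contactDocPath(ownerId, contactId)).update({ appMember: true });
}
```

In both cases the write touches one field without reading the rest of the list first. The map still lives inside the user document, so it remains subject to the document size limit; the subcollection is not.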

On the other hand, I think "arrays" are (1) much easier to add more contacts to, and (2) much easier to display on the UI (using a <FlatList>). These tasks directly affect the user experience, so I think an array sounds like the most appropriate option. Although the Cloud Function would be less efficient, users would barely feel the impact, if at all. Moreover, I don't think the cost of executing Cloud Functions has anything to do with the amount of data "processed." As I understand it, cost is mainly based on the number of documents retrieved. Is that right?

Edit (adding more info about the expected queries and volume of data):

1. Expected Volume:

  • Number of users: 100,000 (but should be scalable to 1 million)

  • Number of contacts for each User: I have no idea! I am guessing most people would have fewer than 1,000 contacts (this is the size of the array of contacts). To avoid hitting the 1MB limit for the document size, I suppose I can limit the number of contacts to 10,000. I think this would be enough for 99.99% of the potential users.

2. Expected Queries:

2.a Contacts Update - User Query

  • Frequency: Few times a day for each of the 100,000 users

  • Platform: User Queries - NOT Cloud Functions

  • Query: The list of contacts should be updated. This involves: (1) getting the list of contacts from the device, (2) comparing this list with the list on the db, and (3) changing the list on the db if differences are found. As mentioned above, the list probably holds around 1,000 contacts. NOTE: it would be much more efficient if I could get only the "changes" to the device's contacts, rather than processing the whole list each time the user asks for a refresh. However, I could NOT find a way of getting changes (at least using React-Native).
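The comparison step in 2.a could look roughly like the sketch below. `diffContacts` is a hypothetical helper, and it assumes device contacts and stored contacts can be matched on the same `userID` value, and that only `name` originates from the device (the `appMember` flag is maintained by the Cloud Function).

```javascript
// Compare the device's contact list with the stored one, keyed by userID.
function diffContacts(deviceContacts, storedContacts) {
  const stored = new Map(storedContacts.map((c) => [c.userID, c]));
  const device = new Map(deviceContacts.map((c) => [c.userID, c]));

  // On the device but not yet in the db.
  const added = deviceContacts.filter((c) => !stored.has(c.userID));
  // In the db but no longer on the device.
  const removed = storedContacts.filter((c) => !device.has(c.userID));
  // Present in both, but the name differs.
  const renamed = deviceContacts.filter(
    (c) => stored.has(c.userID) && stored.get(c.userID).name !== c.name
  );
  return { added, removed, renamed };
}
```

With a subcollection, each entry in `added`, `removed`, and `renamed` maps to one small document write; with an array or map, any non-empty diff means rewriting the user document.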

2.b User Registration - User Query

  • Frequency: 100 - 1,000 times a day

  • Platform: User Queries - NOT Cloud Functions

  • Query: Whenever a user registers on the app, the contacts list stored on the device is copied to the db. As demonstrated in the example above, only 3 to 5 attributes (e.g. UserID, Name, appMember flag) are actually copied to the db. I should cater for around 100 to 1,000 users registering every day.

2.c User Registration - Cloud Function

  • Frequency: 100 - 1,000 times a day

  • Platform: Cloud Functions - NOT User Queries

  • Query: As demonstrated at the beginning of this question, whenever a user registers on the app, a Cloud Function needs to check if he/she is a "contact" for another app user. If he/she is, his/her contact record for that "user" needs to be updated to show that he/she is now an app member. (Please, refer to the code at the beginning of this question)

Your effort and time to think about it and provide your feedback is highly appreciated...

Bilal Abdeen asked Oct 02 '19 08:10


1 Answer

From the sound of your queries, you're going to have a much easier time if each contact is stored as its own document in a subcollection under each user, rather than in an array in the user document. Documents in a subcollection are easier to query and modify.
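As a rough sketch of what the registration Cloud Function (query 2.c) might do with that layout, assuming each contact document sits in a subcollection named "contacts" and carries a `userID` field: a collection group query can find the new member in every user's contacts at once. `flagNewMember` and `chunk` are hypothetical names, `db` is assumed to be a firebase-admin Firestore instance, and the chunk size of 500 is a conservative assumption about batch limits. A collection group query on a single field may also require enabling a collection-group-scoped index in the Firestore console.

```javascript
// Split an array into slices of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Flag `newUserId` as an app member in every contacts subcollection
// where that user appears. Returns the number of matching contact documents.
async function flagNewMember(db, newUserId) {
  // Searches all subcollections named "contacts", across all users.
  const matches = await db
    .collectionGroup("contacts")
    .where("userID", "==", newUserId)
    .get();

  // Commit the updates in batches (chunk size is an assumption, see above).
  for (const docs of chunk(matches.docs, 500)) {
    const batch = db.batch();
    for (const doc of docs) {
      batch.update(doc.ref, { appMember: true });
    }
    await batch.commit();
  }
  return matches.size;
}
```

The array layout has no equivalent query, since `array-contains` needs the exact object, including the current `name` and `appMember` values.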

Doug Stevenson answered Sep 29 '22 18:09