
Efficiency of searching using whereArrayContains

I am curious about the efficiency of searching for documents in a collection using the code below. As the number of documents in the collection grows and the number of items in the array grows, will this search become very inefficient? Is there a better way of doing this, or is there a schema change I can make to the database to better optimize this? Is there somewhere I can find the time complexity of these functions, in the Firestore documentation maybe?

Query query = db.collection("groups").whereArrayContains("members", userid);


ALTERNATIVE SOLUTION

I originally wanted to try storing the group ids under the user so as to only grab the groups for the current user, but I ran into issues and never found a solution for setting a FirestoreRecyclerOptions using multiple ids to query by.

Example:

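for (String groupid : list) {
    // document(groupid) returns a DocumentReference, not a Query,
    // so filter on the document ID to get a Query for a single group.
    Query query = db.collection("test-groups").whereEqualTo(FieldPath.documentId(), groupid);

    // Each iteration builds separate options for just one group.
    FirestoreRecyclerOptions<GroupResponse> response = new FirestoreRecyclerOptions.Builder<GroupResponse>()
            .setQuery(query, GroupResponse.class)
            .build();
}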

Is there a way to add multiple queries to the FirestoreRecyclerOptions?

PleaseNoBugs asked Sep 23 '18 19:09



1 Answer

As the number of documents in the collection grows and the number of items in the array grows will this search become very inefficient?

The problem isn't that the search will become very inefficient; the problem is that documents have limits on how much data you can put into them. According to the official documentation regarding usage and limits:

Maximum size for a document: 1 MiB (1,048,576 bytes)

As you can see, you are limited to 1 MiB of data in total in a single document. When it comes to storing text, that is quite a lot, but as your array gets bigger, be careful about this limitation.
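To get a feel for that limit, here is a rough back-of-the-envelope sketch in plain Java. It assumes each string in the array costs its UTF-8 length plus 1 byte, which is my understanding of how Firestore counts string storage. The 28-character ID length and the 1 KiB allowance for the rest of the document are assumptions for illustration, and `MembersArrayEstimate` and `maxIds` are hypothetical names, not part of any SDK.

```java
class MembersArrayEstimate {
    // The 1 MiB document limit quoted above.
    static final long MAX_DOCUMENT_BYTES = 1_048_576L;

    // Rough upper bound on how many ID strings fit in one array field.
    // Assumes each string costs (UTF-8 length + 1) bytes and that
    // overheadBytes covers the document name, field names and other fields.
    static long maxIds(int idLengthBytes, long overheadBytes) {
        long bytesPerId = idLengthBytes + 1L;
        return (MAX_DOCUMENT_BYTES - overheadBytes) / bytesPerId;
    }

    public static void main(String[] args) {
        // 28-character ASCII IDs and 1 KiB of other data are assumptions.
        System.out.println(maxIds(28, 1024));
    }
}
```

Under these assumptions a single document has room for roughly 36,000 member IDs, so the size limit only becomes a concern for very large groups.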

If you are storing a large amount of data in arrays and those arrays are updated by lots of users, there is another limitation that you need to take care of: you are limited to 1 write per second on every document. So if a lot of users are all trying to write/update data in the same document at once, you might start to see some of these writes fail. Be careful about this limitation too.

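As you probably noticed, queries in Cloud Firestore are very fast, and this is because Firestore automatically creates an index for every field in your documents. Query performance depends on the size of the result set rather than on the number of documents in the collection.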

If you think that you'll be querying for a parent document based on whether it contains a specific member, then use maps and not arrays.
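As a minimal sketch of what such a map layout could look like, assuming a hypothetical "members" map field keyed by user id with true as the value (`GroupMembersMap`, `toMembersMap` and `memberPath` are illustrative names only):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupMembersMap {
    // Builds the value for a hypothetical "members" map field: { userId: true, ... }
    static Map<String, Object> toMembersMap(List<String> userIds) {
        Map<String, Object> members = new LinkedHashMap<>();
        for (String id : userIds) {
            members.put(id, true);
        }
        return members;
    }

    // Dot-separated path to one member's entry, e.g. "members.uid123".
    // Assumes the user id contains no '.' characters.
    static String memberPath(String userId) {
        return "members." + userId;
    }

    public static void main(String[] args) {
        Map<String, Object> group = new LinkedHashMap<>();
        group.put("name", "example group"); // hypothetical field
        group.put("members", toMembersMap(Arrays.asList("uid123", "uid456")));
        System.out.println(group);
        System.out.println(memberPath("uid123"));
    }
}
```

With the Firestore SDK, the lookup would then be something along the lines of db.collection("groups").whereEqualTo(memberPath(userid), true), where a map like group is what gets written as the document data.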

There are many posts out there that say that arrays don't work well in Cloud Firestore. When you have data that can be altered by multiple clients, it's very easy to get confused because you cannot know what is happening and on which field. If I'm using a map and users want to edit several different fields, or even the exact same field, we generally know what is happening. With arrays, things are different. Think about what might happen if one user wants to edit the value at index 0 while another user wants to delete the value at index 0: you'll end up with very different results and, why not, array-out-of-bounds exceptions.

So Firestore actions on arrays are a little bit different: you cannot perform actions like insert, update or delete at a specific index. But if you don't care about the exact order in which you store elements in an array, then you should use arrays. Firestore recently added some features to add or remove specific elements, but only if you don't care about their exact position. See the official documentation.
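In the Android SDK, the add/remove features are, as far as I know, FieldValue.arrayUnion and FieldValue.arrayRemove, used in a call such as groupRef.update("members", FieldValue.arrayUnion(userid)), where groupRef is a hypothetical DocumentReference. The sketch below is only a plain-Java model of how I understand their semantics (add a value if it is absent, remove every occurrence of a value), not the SDK itself; `ArrayOpsModel`, `union` and `remove` are made-up names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ArrayOpsModel {
    // Model of arrayUnion: append each value that is not already present.
    static List<String> union(List<String> current, String... values) {
        List<String> result = new ArrayList<>(current);
        for (String v : values) {
            if (!result.contains(v)) {
                result.add(v);
            }
        }
        return result;
    }

    // Model of arrayRemove: drop every occurrence of each given value.
    static List<String> remove(List<String> current, String... values) {
        List<String> result = new ArrayList<>(current);
        result.removeAll(Arrays.asList(values));
        return result;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("alice", "bob");
        System.out.println(union(members, "bob", "carol")); // [alice, bob, carol]
        System.out.println(remove(members, "alice"));       // [bob]
    }
}
```

Because both operations work by value rather than by index, two clients adding or removing different members do not depend on each other's positions in the array.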

In conclusion, put data in the same document only if you need to display it together. Also, don't make documents so big that you'll need to download more data than you actually need. Put data in a collection when you want to search for individual fields of that data or when you want your data to have room to grow. Leave your data as a map field if you want to search your parent object based on that data. And if you have items that you generally use as flags, go ahead with arrays.

Also, don't worry about slow queries in Firestore.

Alex Mamo answered Sep 30 '22 18:09
