Stack:
Ionic
Nodejs/Express
Cloud Firestore
I am tasked with writing an app that takes dates in "day" format, each with a balance for that day, and displays that data in a chart using Chart.js. There are interval buttons that let you switch between "day", "week", and "month", which are supposed to group the dates into the respective intervals.
This currently works fine using 1 collection. "days" and "weeks" both work, but once we get to "month" with large amounts of data Firestore kills itself in my backend. The amount of data it tries to poll is too large. I currently run aggregation for "weeks" and "months" in the backend using the "days".
The only aggregation documentation I could find in the docs was: https://firebase.google.com/docs/firestore/solutions/aggregation which doesn't give me a result, it stores it in a collection which doesn't help me. The app can change the balance on a single date which causes a ripple effect in the balances after the fact - so I have to generate the values on interval change.
Does something like this exist or am I stuck with creating 3 separate collections, days/weeks/months and polling the desired collection?
Update: since October 2022 Firestore supports counting documents with an aggregation query, which looks like this in JavaScript:
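import { collection, getCountFromServer } from "firebase/firestore";
// db is assumed to be an already initialized Firestore instance
const coll = collection(db, "cities");
const snapshot = await getCountFromServer(coll);
console.log('count: ', snapshot.data().count);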
You can also use a query to limit what documents are counted, like this:
import { collection, query, where, getCountFromServer } from "firebase/firestore";
const coll = collection(db, "cities");
const query_ = query(coll, where('state', '==', 'CA'));
const snapshot = await getCountFromServer(query_);
console.log('count: ', snapshot.data().count);
When you use the count() operation you are charged 1 document read for each batch of up to 1,000 documents that you count, with a minimum of 1 document read for each count() operation. The count operation has a maximum execution time of 60 seconds, after which it times out. For a performance test, see How fast is counting documents in Cloud Firestore?
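To make the pricing rule concrete, here is a minimal sketch of the arithmetic, assuming billing works exactly as described above (1 read per batch of up to 1,000 documents counted, minimum 1). The function name estimateCountReads is made up for illustration and is not part of any Firebase SDK.

```javascript
// Hypothetical helper: estimates the document reads billed for one count().
// Assumes 1 read per batch of up to 1,000 documents counted, minimum 1 read.
function estimateCountReads(documentsCounted) {
  const BATCH_SIZE = 1000;
  return Math.max(1, Math.ceil(documentsCounted / BATCH_SIZE));
}

console.log(estimateCountReads(2000)); // 2
console.log(estimateCountReads(0));    // 1 (the minimum still applies)
```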
For performance and cost reasons you'll typically want to still use an alternative approach when counting large numbers of items across many users, so I'm leaving my previous answer below.
Update: since late 2023 it is also possible to calculate sums and averages across multiple Firestore documents at read-time. I recommend checking out the documentation on summarizing data with aggregation queries and my post: How should I handle aggregated values in Firestore
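For the balances in the question, a read-time sum and average for one month might look roughly like the sketch below. The collection name "days", the fields "date" (assumed to be stored as an ISO "YYYY-MM-DD" string) and "balance", and the function name summarizeMonth are all assumptions for illustration. The fs parameter is assumed to be the module object you get from import * as fs from "firebase/firestore"; it is passed in only so the snippet can be exercised without a live project.

```javascript
// Sketch only: "days", "date" and "balance" are hypothetical names.
// `fs` is assumed to be the "firebase/firestore" module object,
// `db` an initialized Firestore instance, `month` a "YYYY-MM" string.
async function summarizeMonth(fs, db, month) {
  const { collection, query, where, getAggregateFromServer, sum, average } = fs;

  // ISO date strings sort lexicographically, so a string range covers the month.
  const inMonth = query(
    collection(db, "days"),
    where("date", ">=", `${month}-01`),
    where("date", "<=", `${month}-31`)
  );

  // The server computes the aggregates; no day documents are downloaded.
  const snapshot = await getAggregateFromServer(inMonth, {
    totalBalance: sum("balance"),
    averageBalance: average("balance"),
  });
  return snapshot.data(); // { totalBalance, averageBalance }
}
```

Whether a sum or an average is the right value for a "month" bucket depends on how the balances are meant to be read in the chart.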
Previous answer 👇
From the docs you linked:
Cloud Firestore does not support native aggregation queries.
So that pretty much answers the question in your title: Firestore does not have a built-in capability to run aggregations on the database server.
The common solutions are to:
Run the aggregations on the client
It sounds like this is what you're doing now: you're downloading all data for the interval, and then aggregating it on the client. This approach can work well for small data sets, but if you're only displaying the aggregates in the client, you are likely downloading much more data than needed. That's why you should consider alternatives if your data set may be large, which it typically will be(come) when you use Firestore.
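As a rough illustration of what aggregating on the client involves, the sketch below groups already-downloaded day records into week or month buckets. The record shape { date: "YYYY-MM-DD", balance } and the function names are assumptions, and summing the balances per bucket is just one possible choice.

```javascript
// Hypothetical record shape: { date: "YYYY-MM-DD", balance: number }.
function bucketKey(isoDate, interval) {
  if (interval === "month") return isoDate.slice(0, 7); // "YYYY-MM"
  if (interval === "week") {
    const d = new Date(isoDate + "T00:00:00Z");
    const daysSinceMonday = (d.getUTCDay() + 6) % 7; // Monday -> 0, Sunday -> 6
    d.setUTCDate(d.getUTCDate() - daysSinceMonday);
    return d.toISOString().slice(0, 10); // date of that week's Monday
  }
  return isoDate; // "day": every date is its own bucket
}

// Sums balances per bucket, keeping buckets in first-seen order.
function groupBalances(days, interval) {
  const totals = new Map();
  for (const { date, balance } of days) {
    const key = bucketKey(date, interval);
    totals.set(key, (totals.get(key) ?? 0) + balance);
  }
  return [...totals].map(([key, total]) => ({ key, total }));
}
```

Every day document in the range still has to be downloaded before this runs, which is the cost this approach carries for large data sets.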
Update aggregates every time the data changes
In this scenario you store the aggregated value in the database and update it whenever you write a value that is aggregated over. The documentation shows an example of calculating a moving average this way, which requires no query at all, and thus scales to any sized data set.
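The core of that approach is that the new aggregate can be derived from the old aggregate plus the incoming value alone. A minimal sketch of the arithmetic, assuming the stored aggregate has the hypothetical shape { count, average }:

```javascript
// Hypothetical aggregate shape: { count, average }.
// Returns the updated aggregate without reading any of the earlier values.
function addValueToAverage(aggregate, value) {
  const count = aggregate.count + 1;
  const average = (aggregate.average * aggregate.count + value) / count;
  return { count, average };
}

console.log(addValueToAverage({ count: 2, average: 4 }, 1)); // { count: 3, average: 3 }
```

In a real app this calculation would typically run inside a transaction together with the write that adds the new value, so the stored aggregate cannot drift from the data.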
In this scenario you'll have to keep in mind that Firestore is limited to performing roughly one write per document per second. So if you're receiving more data than that, you may need to distribute your aggregate across multiple documents, as shown in the documentation on distributed counters.
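The idea behind a distributed counter can be sketched in memory: writes are spread over several shards, and the total is the sum of all shards. In the sketch below plain objects stand in for shard documents, and all names are made up for illustration; in Firestore each shard would be its own document, so each one gets its own write budget.

```javascript
// In-memory stand-in: each object plays the role of one shard document.
function createShards(numShards) {
  return Array.from({ length: numShards }, () => ({ total: 0 }));
}

// Adds `amount` to one randomly chosen shard, spreading the write load.
function addToRandomShard(shards, amount, random = Math.random) {
  const index = Math.floor(random() * shards.length);
  shards[index].total += amount;
  return index;
}

// Reading the aggregate means summing every shard.
function readTotal(shards) {
  return shards.reduce((sum, shard) => sum + shard.total, 0);
}
```

More shards allow more writes per second, at the cost of reading more documents whenever the total is needed.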
Use another database for your aggregation queries
Another alternative is to use Firestore for storing the data that the clients read, but using another database for the complex, dynamic queries.
A typical example of this is to export the data from Firestore to BigQuery, then perform the calculations in BigQuery, and write the results back to Firestore so that the clients can read them. Here you're using both products for what they're best at: Firestore for serving data at scale, and BigQuery for processing data at scale.
Firestore has recently released aggregation queries, which let you perform a count() operation on your Firestore collections.
Here's an example taken from their documentation:
const collectionRef = db.collection('cities');
const snapshot = await collectionRef.count().get();
console.log(snapshot.data().count);
// expected output: the number of documents in the cities collection
There are several limits to count() in Firestore. I'll mention two:
count() must resolve in less than 60 seconds; otherwise it throws an error.
count() is billed by the number of documents it reads: if count() has read 2,000 documents, it will cost you 2 document reads.