I have a Cloud Function that is triggered by a Firestore database write. It performs an async operation (fetching data from some 3rd party APIs) that may or may not take a long time. When it finishes, it writes the result to a 'search result' field.
There's a possible race condition where the result from a newer trigger gets overwritten by an older operation that finishes later. How should I solve this problem in the context of Firebase Cloud Functions and Firestore?
In general there are two approaches here: make your operations idempotent, or use a compare-and-set strategy.
Making your operations idempotent is often the most scalable and architecturally simple approach. When you perform the same idempotent operation on the same input, it has the same result. This means that it doesn't matter if the operation is performed multiple times, as the result will be the same.
A good example of this is in the Firestore documentation on arrays and sets. Imagine that you're tagging blog posts with categories. A naïve model for this would be:
{
  title: "My great post",
  categories: [
    "technology",
    "opinion",
    "cats"
  ]
}
But now imagine that two users are tagging the same post as being about cats at almost the same time. You might end up with
{
  title: "My great post",
  categories: [
    "technology",
    "opinion",
    "cats",
    "cats"
  ]
}
That is clearly not what you wanted, but since the data structure allows it, it can happen. The ideal solution here is to use a data structure that makes this impossible: a data structure where adding cats is an idempotent operation. In mathematical terms this would be a set, and in Firestore you'd model that as:
{
  title: "My great post",
  categories: {
    "technology": true,
    "opinion": true,
    "cats": true
  }
}
With this structure, it doesn't matter how often you set cats to true: the result will always be the same.
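To make the difference concrete, here is a minimal in-memory sketch of the two models. It does not call Firestore; the type and function names (PostWithArray, PostWithMap, tagArray, tagMap) are made up for illustration.

```typescript
// Hypothetical shapes for the two models shown above.
interface PostWithArray {
  title: string;
  categories: string[];
}

interface PostWithMap {
  title: string;
  categories: Record<string, boolean>;
}

// Not idempotent: every call appends, so two taggers can produce "cats" twice.
function tagArray(post: PostWithArray, category: string): PostWithArray {
  return { ...post, categories: [...post.categories, category] };
}

// Idempotent: setting the same key to true again leaves the post unchanged.
function tagMap(post: PostWithMap, category: string): PostWithMap {
  return { ...post, categories: { ...post.categories, [category]: true } };
}

const arrayPost: PostWithArray = { title: "My great post", categories: ["technology", "opinion"] };
const mapPost: PostWithMap = { title: "My great post", categories: { technology: true, opinion: true } };

// Two users tag the same post with "cats".
const arrayTwice = tagArray(tagArray(arrayPost, "cats"), "cats");
const mapTwice = tagMap(tagMap(mapPost, "cats"), "cats");
```

In this sketch arrayTwice ends up with a duplicate entry, while mapTwice has exactly one cats key. Against a real database, the map version would typically be written as an update of the nested field categories.cats to true.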
Sometimes it isn't possible (or feasible) to make your operations idempotent. In that case, you can also consider using a compare-and-set strategy.
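For example, say that the 3rd party API changes the data in some way, and that you only want to write the result back to the database if the original data in the database is unmodified. In that case your function needs to remember the original data it read, call the API, and then, in a transaction, re-read the data and write the result only if the data still matches what it originally read.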
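As a rough sketch of what such a compare-and-set could look like for the question's search case: the code below runs entirely in memory and is not the Firestore API. SearchDoc, AtomicDoc, InMemoryDoc, searchAndStore and the query/searchResult field names are all assumptions made for illustration; in a real function the transact step would be a database transaction.

```typescript
// Hypothetical document shape: `query` is the input, `searchResult` the output.
interface SearchDoc {
  query: string;
  searchResult?: string;
}

// Minimal stand-in for a document with atomic read-modify-write.
// This interface is an assumption for the sketch, not a real SDK type.
interface AtomicDoc<T> {
  read(): Promise<T>;
  // Applies `update` to the current value; returning undefined skips the write.
  transact(update: (current: T) => T | undefined): Promise<boolean>;
}

// In-memory implementation so the sketch runs without a backend.
class InMemoryDoc<T> implements AtomicDoc<T> {
  private value: T;

  constructor(initial: T) {
    this.value = initial;
  }

  async read(): Promise<T> {
    return this.value;
  }

  async transact(update: (current: T) => T | undefined): Promise<boolean> {
    const next = update(this.value);
    if (next === undefined) return false;
    this.value = next;
    return true;
  }
}

async function searchAndStore(
  doc: AtomicDoc<SearchDoc>,
  callApi: (query: string) => Promise<string>,
): Promise<boolean> {
  // 1. Remember the input this invocation is working on.
  const { query } = await doc.read();
  // 2. Slow call; the document may change while it is pending.
  const result = await callApi(query);
  // 3. Write only if the input is still the one we searched for.
  return doc.transact((current) =>
    current.query === query ? { ...current, searchResult: result } : undefined,
  );
}
```

searchAndStore resolves to false when the query changed while the API call was in flight, so a slow, older invocation drops its result instead of overwriting the newer one.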
This type of compare-and-set operation is actually how Firebase's Realtime Database implements transactions, with the "3rd party API" being your application's transaction handler.
As you can probably see, this second approach is more complex than the approach with idempotent operations. So when possible, I'd always recommend making your operations idempotent.
Although not a direct answer to the OP's question regarding waiting for a 3rd party response, for those who came here from a Google search for a generalized Firestore race condition solution, Firestore supports both transactions and batched writes:
Cloud Firestore supports atomic operations for reading and writing data. In a set of atomic operations, either all of the operations succeed, or none of them are applied. There are two types of atomic operations in Cloud Firestore:
Transactions: a transaction is a set of read and write operations on one or more documents.
Batched Writes: a batched write is a set of write operations on one or more documents.
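To illustrate the all-or-nothing property described in that quote, here is a small in-memory sketch. It deliberately does not use the Firestore SDK: Doc, Docs, Write and commitBatch are hypothetical names, and the real API is covered in the documentation linked below.

```typescript
type Doc = Record<string, unknown>;
type Docs = Map<string, Doc>;
// A write creates or replaces whole documents in the draft; it may throw to signal failure.
type Write = (draft: Docs) => void;

// Applies all writes to a copy and returns the copy only if every write succeeded.
// If any write throws, the exception propagates and `docs` is left untouched.
function commitBatch(docs: Docs, writes: Write[]): Docs {
  const draft: Docs = new Map(docs);
  for (const write of writes) {
    write(draft);
  }
  return draft;
}

const before: Docs = new Map<string, Doc>([["post1", { title: "My great post" }]]);

// Both writes succeed, so both are visible in the result.
const after = commitBatch(before, [
  (d) => { d.set("post1", { title: "My great post", views: 1 }); },
  (d) => { d.set("post2", { title: "Another post" }); },
]);
```

If one of the writes throws, commitBatch never returns a new map, so none of the batch's changes become visible.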
For more details, see: https://firebase.google.com/docs/firestore/manage-data/transactions