I'm looking at Firestore, but the documentation is very light on offline persistence on Android. My question is about what happens when the device comes back online.
Let's imagine that I have this data layout online:
MyCollection
| ----------------------- MyDocument (empty)
Suppose two devices synchronize with this model but are offline, and both write to MyDocument: device A creates the field id with value A, and device B also creates the field id with value B.
Device A comes back online first and synchronizes with Firestore. Then device B comes back online, and the problem is that the field id already exists with another value.
How will Firestore handle this, given that synchronization happens automatically in the background? Does it have a chance to detect such conflicts and fix the problem itself?
Firestore does not do any conflict resolution in this case. This means that after the entire flow is done, the write from device B will be the one stored. On a database level this is correct, since it has no way to know whether you wanted something else.
If you want to prevent the write from device B, you will have to enforce this yourself. There are two things you can do:
Putting the write in a transaction is the simpler of the two. However, it will fail if the client is offline, since in that case the client can't check for concurrent updates.
Alternatively you can use security rules, which are evaluated on the Firebase servers when the client data reaches the server. If you can detect the fact that the write from device B is invalid/obsolete, you can reject it.
A fairly simple case of this is putting a timestamp in every write operation, and then rejecting writes where the new timestamp is before the one in the database. This ensures that if a change is made, the last one that was made is stored (instead of the last one to come back online).
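To make the timestamp idea concrete, here is a minimal in-memory sketch of the check in plain Java. It is not Firestore security rules syntax and does not use the Firestore SDK; the class TimestampGuardedDoc, the method tryWrite and the parameter clientWriteMillis are hypothetical names made up for illustration. It also assumes the device clocks can be trusted, which you would have to decide for your own app.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for the server-side check described above.
// Hypothetical model only: this is NOT Firestore security rules syntax.
class TimestampGuardedDoc {
    private final Map<String, Object> fields = new HashMap<>();
    private long lastWriteMillis = Long.MIN_VALUE; // nothing accepted yet

    // clientWriteMillis is when the change was made on the device,
    // not when the device came back online.
    boolean tryWrite(String field, Object value, long clientWriteMillis) {
        if (clientWriteMillis < lastWriteMillis) {
            return false; // older than what is stored: reject as obsolete
        }
        fields.put(field, value);
        lastWriteMillis = clientWriteMillis;
        return true;
    }

    Object get(String field) {
        return fields.get(field);
    }
}
```

With this check, if device A made its change at time 200 and device B made its change at time 100, then A syncing first is accepted and B syncing afterwards is rejected, so the later change is kept regardless of which device reconnects last.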
As you can see, both of these are non-trivial solutions to the problem. That's why my first recommendation would be to find a data model that prevents conflicts altogether. By avoiding conflicting writes, you avoid having to resolve conflicts at all. A good example of this is not letting the clients update the actual document, but having them store their changes as separate pieces of data, so that neither of them overwrites anything.
MyCollection
| ---- MyDocument (empty)
| -------- Change from device A
| -------- Change from device B
Now the entire trail of what happened is stored in the database, and each client can use this information to construct the actual document based on whatever business rules you have.
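As a rough sketch of how a client could construct the document from such a trail, the plain Java below replays a list of changes in memory. The names ChangeLogReplay, Change, deviceId and madeAtMillis are hypothetical, and the business rule used here (replay in the order the changes were made, so the most recent change to a field wins) is just one example; yours may differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ChangeLogReplay {
    // One appended change; the shape of this record is an assumption.
    static final class Change {
        final String deviceId;
        final long madeAtMillis;
        final String field;
        final Object value;

        Change(String deviceId, long madeAtMillis, String field, Object value) {
            this.deviceId = deviceId;
            this.madeAtMillis = madeAtMillis;
            this.field = field;
            this.value = value;
        }
    }

    // Example business rule: apply changes in the order they were made,
    // breaking ties by device id so every client gets the same result.
    static Map<String, Object> rebuild(List<Change> changes) {
        List<Change> ordered = new ArrayList<>(changes);
        ordered.sort(Comparator
                .comparingLong((Change c) -> c.madeAtMillis)
                .thenComparing(c -> c.deviceId));
        Map<String, Object> document = new HashMap<>();
        for (Change c : ordered) {
            document.put(c.field, c.value);
        }
        return document;
    }
}
```

Because the result depends only on the contents of the changes and not on the order in which they reached the database, every client that sees the same trail ends up with the same document.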
This is actually a quite common approach in NoSQL databases, especially in projects that require massively concurrent writes. By preventing conflicting writes and using an append-only data model, the scalability improves immensely. It is also the model used behind the "op log" files of many (relational) databases. By keeping a log of all operations, they can reconstruct the database from the start.
In many cases you'll then also store occasional snapshots of the document state, to make it faster to determine the current document. This can either be done by a random client, or by a trusted process such as on a server you control or Cloud Functions for Firebase.
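The snapshot idea could look roughly like this in plain Java. Again this is only an in-memory sketch: SnapshotReplay, Snapshot, Change and the seq ordering number are hypothetical, and it assumes each change carries some monotonically increasing sequence number, which you would have to provide yourself.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SnapshotReplay {
    // A change with a hypothetical sequence number that defines its order.
    static final class Change {
        final long seq;
        final String field;
        final Object value;

        Change(long seq, String field, Object value) {
            this.seq = seq;
            this.field = field;
            this.value = value;
        }
    }

    // Document state after all changes up to and including lastSeq.
    static final class Snapshot {
        final long lastSeq;
        final Map<String, Object> state;

        Snapshot(long lastSeq, Map<String, Object> state) {
            this.lastSeq = lastSeq;
            this.state = state;
        }
    }

    // Builds a newer snapshot by applying only the changes that the base
    // snapshot has not seen yet. Expects changes in ascending seq order.
    static Snapshot advance(Snapshot base, List<Change> changesInSeqOrder) {
        Map<String, Object> state = new HashMap<>(base.state);
        long lastSeq = base.lastSeq;
        for (Change c : changesInSeqOrder) {
            if (c.seq <= base.lastSeq) {
                continue; // already folded into the base snapshot
            }
            state.put(c.field, c.value);
            lastSeq = Math.max(lastSeq, c.seq);
        }
        return new Snapshot(lastSeq, state);
    }
}
```

A reader then only needs the latest snapshot plus the changes appended after it, instead of replaying the whole trail every time.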