How quickly can you atomically increment a value on the Firebase Realtime Database?

firebaser here

When I recently tweeted about the new increment() operator in the Firebase Realtime Database, a teammate asked how fast increment() is.

I've been wondering the same: how fast can you increment a value with increment(1)? And how does that compare to using a transaction to increment a value?

asked May 01 '20 by Frank van Puffelen



1 Answer

TL;DR

I tested these cases:

  1. Increment a value with a transaction call:

    ref.transaction(function(value) {
      return (value || 0) + 1;
    });
    
  2. Increment a value with the new increment operator:

    ref.set(admin.database.ServerValue.increment(1));
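
If you want to run these snippets standalone, a minimal setup could look something like the sketch below. This uses the v8-style client SDK (the author mentions using the client-side Node SDK); the databaseURL and the counters/test path are placeholders, not values from the original test.

// A minimal sketch of a standalone setup for either snippet, using the
// v8-style client SDK. The databaseURL and path are placeholders.
const firebase = require("firebase/app");
require("firebase/database");

firebase.initializeApp({
  databaseURL: "https://<your-project>.firebaseio.com",
});

const ref = firebase.database().ref("counters/test");

// Either of these performs one increment:
ref.transaction(function(value) {
  return (value || 0) + 1;
});
// ...or:
ref.set(firebase.database.ServerValue.increment(1));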
    

The fact that increment is faster won't be a surprise, but... by how much?

Results:

  • With transactions I was able to increment a value about 60-70 times per second.
  • With the increment operator, I was able to increment a value about 200-300 times per second.

How I performed the test and got these numbers

I ran the test on my 2016 MacBook Pro, wrapping the above in a simple Node.js script that uses the client-side Node SDK. The wrapping script for the operations was really basic as well:

// Perform one operation every 100 ms, i.e. 10 per second.
const timer = setInterval(function() {
    ... the increment or transaction from above ...
}, 100);

// Stop the test after one minute.
setTimeout(function() {
  clearInterval(timer);
  process.exit(0);
}, 60000);

So: increment the value 10 times per second, and stop doing that after 1 minute. I then spawned instances of this process with this script:

for instance in {1..10}
do
  node increment.js &
done

So this would run 10 parallel processes with the increment operator, each increasing the value 10 times per second, for a total of 100 increments per second. I then changed the number of instances until the "increments per second" reached a peak.

I then wrote a small script on jsbin to listen for the value, and to determine the number of increments per second with a simple low-pass, moving-average filter. I had some trouble here, so I'm not sure the calculations are completely correct. Given my test results they were close enough, but if anyone feels like writing a better observer: be my guest. :)
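
The jsbin code isn't included here, but an observer along those lines could look roughly like this (a sketch, assuming the same firebase setup and ref as in the earlier snippets; the WINDOW size is an arbitrary choice):

// Rough sketch of an observer: watch the counter and estimate the rate of
// increments per second with a simple moving average. Not the actual jsbin
// code.
const WINDOW = 20;
let lastValue = null, lastTime = null;
const rates = [];

ref.on("value", function(snapshot) {
  const now = Date.now();
  const value = snapshot.val();
  if (lastValue !== null && now > lastTime) {
    rates.push((value - lastValue) / ((now - lastTime) / 1000));
    if (rates.length > WINDOW) rates.shift();
    const avg = rates.reduce(function(s, r) { return s + r; }, 0) / rates.length;
    console.log("~" + avg.toFixed(1) + " increments/second");
  }
  lastValue = value;
  lastTime = now;
});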

Things to note about the tests:

  1. I kept increasing the number of processes until the "increments per second" seemed to max out, but I noticed that this coincided with my laptop fans going full speed. So it's likely that I didn't find the true maximum throughput of the server-side operation, but rather a limit of the combination of my test environment and the server. It is quite possible (and in fact likely) that you'll get different results when you try to reproduce this test, although the increment throughput should of course always be significantly higher than the transaction throughput. No matter what results you get: please share them. :)

  2. I've used the client-side Node.js SDK, as it was easiest to get working. Using different SDKs may give slightly different results, although I expect the primary SDKs (iOS, Android, and Web) to be quite close to what I got.

  3. Two different teammates immediately asked whether I'd run this on a single node, or if I was incrementing multiple values in parallel. Incrementing multiple values in parallel might show whether there's a system-wide throughput bottleneck or whether the limit is per node (which is what I expect); a sketch of such a variant follows this list.

  4. As said already: my test harness is nothing special, but my jsbin observer code is especially suspect. Kudos if anyone feels like coding up a better observer on the same data.
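
For point 3, a variant of the test script that spreads the load over multiple nodes could look like this (a sketch, assuming the same firebase setup as above; the number of counters and their paths are arbitrary). If throughput scales with the number of counters, the bottleneck is per node rather than system-wide:

// Sketch: increment N independent counters in parallel, to separate a
// per-node limit from a system-wide one. NODES and the paths are arbitrary.
const NODES = 10;
const refs = [];
for (let i = 0; i < NODES; i++) {
  refs.push(firebase.database().ref("counters/test_" + i));
}

const timer = setInterval(function() {
  // One increment per counter per tick: NODES * 10 increments per second.
  refs.forEach(function(r) {
    r.set(firebase.database.ServerValue.increment(1));
  });
}, 100);

setTimeout(function() {
  clearInterval(timer);
  process.exit(0);
}, 60000);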


How the transaction and increment operator work under the hood

To understand the performance difference between transaction and increment, it really helps to know how these operations work under the hood. For the Firebase Realtime Database, "under the hood" means the commands and responses that are sent between the clients and the server over the WebSocket connection.

Transactions in Firebase use a compare-and-set approach. Whenever we start a transaction like the one above, the client takes a guess at the current value of the node. If it has never seen the node before, that guess is null. It calls our transaction handler with that guess, and our code returns the new value. The client then sends the guess and the new value to the server, which performs a compare-and-set operation: if the guess is correct, it sets the new value; if the guess is wrong, the server rejects the operation and returns the actual current value to the client.
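
In pseudocode, that client-side loop looks roughly like this. Note that getCachedValue and sendCompareAndSet are hypothetical stand-ins for SDK internals, not real API calls:

// Simplified sketch of the client's compare-and-set loop (not actual SDK
// source). getCachedValue and sendCompareAndSet are hypothetical helpers.
async function runTransaction(ref, handler) {
  let guess = getCachedValue(ref); // null if the client never saw the node
  while (true) {
    const newValue = handler(guess);
    // Wire message: (guess, newValue). The server compares and sets.
    const response = await sendCompareAndSet(ref, guess, newValue);
    if (response.ack) return newValue;  // guess was right, value committed
    guess = response.currentValue;      // guess was wrong, retry with actual
  }
}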

In a perfect scenario, the initial guess is correct, and the value is immediately written to disk on the server (and after that, sent out to all listeners). In a flow chart that'd look like this:

            Client            Server

               +                   +
 transaction() |                   |
               |                   |
        null   |                   |
     +---<-----+                   |
     |         |                   |
     +--->-----+                   |
         1     |     (null, 1)     |
               +--------->---------+
               |                   |
               +---------<---------+
               |     (ack, 1)      |
               |                   |
               v                   v

But if the node already has a value on the server, it rejects the write, sends back the actual value, and the client tries again:

            Client            Server

               +                   +
 transaction() |                   |
               |                   |
        null   |                   |
     +---<-----+                   |
     |         |                   |
     +--->-----+                   |
         1     |                   |
               |     (null, 1)     |
               +--------->---------+
               |                   |
               +---------<---------+
               |     (nack, 2)     |
               |                   |
         2     |                   |
     +---<-----+                   |
     |         |                   |
     +--->-----+                   |
         3     |      (2, 3)       |
               +--------->---------+
               |                   |
               +---------<---------+
               |      (ack, 3)     |
               |                   |
               |                   |
               v                   v

This isn't too bad: one extra roundtrip. Even if Firebase had used pessimistic locking, it would have needed that roundtrip, so we haven't lost anything.

The problem starts if multiple clients are modifying the same value concurrently. This introduces so-called contention on the node, which looks like this:

            Client            Server                Client
               +                   +                   +
 transaction() |                   |                   |
               |                   |                   | transaction()
        null   |                   |                   |
     +---<-----+                   |                   |  null
     |         |                   |                   +--->----+
     +--->-----+                   |                   |        |
         1     |                   |                   +---<----+ 
               |     (null, 1)     |                   |   1
               +--------->---------+    (null, 1)      |
               |                   |---------<---------+
               +---------<---------+                   |
               |     (nack, 2)     |--------->---------+
               |                   |     (nack, 2)     |
         2     |                   |                   |
     +---<-----+                   |                   |   2
     |         |                   |                   |--->----+
     +--->-----+                   |                   |        |
         3     |      (2, 3)       |                   |---<----+ 
               +--------->---------+                   |   3
               |                   |                   |
               +---------<---------+                   |
               |      (ack, 3)     |       (2, 3)      |
               |                   |---------<---------+
               |                   |                   |
               |                   |--------->---------+
               |                   |    (nack, 3)      |
               |                   |                   |   3
               |                   |                   |--->----+
               |                   |                   |        |
               |                   |                   |---<----+ 
               |                   |                   |   4
               |                   |       (3, 4)      |
               |                   |---------<---------+
               |                   |                   |
               |                   |--------->---------+
               |                   |     (ack, 4)      |
               |                   |                   |
               v                   v                   v

(Note: the server actually handles these operations one at a time; the chart above just draws them as overlapping.)

The second client had to do another retry for its operation, because the server-side value had been modified between its first and second try. The more clients that are writing to this location, the more likely it is that you'll see retries. The Firebase client performs those retries automatically, but after a number of failed retries it gives up and raises an Error: maxretry exception to the application.
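
You can observe that outcome through the transaction's completion callback:

// The retries happen automatically inside the SDK; the completion callback
// fires once, with either the final result or the maxretry error.
ref.transaction(function(value) {
  return (value || 0) + 1;
}, function(error, committed, snapshot) {
  if (error) {
    console.error("Transaction gave up:", error.message); // e.g. maxretry
  } else if (committed) {
    console.log("Counter is now", snapshot.val());
  }
});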

This is the reason I could only increment a counter about 60-70 times per second: with more writes than that, there was too much contention on the node.

An increment operation is atomic by nature. You're telling the database: whatever the current value is, make it x higher. This means that the client never has to know the current value of the node, and so it also can't guess wrong. It simply tells the server what to do.
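
Since the client doesn't need to know the current value, increment also composes with plain set() and update() calls; for example, bumping several counters in a single write (the paths here are just for illustration):

// Sketch: bump two counters in one multi-path update. The paths are
// illustrative; ServerValue.increment works in set() and update() alike.
firebase.database().ref().update({
  "stats/total/count": firebase.database.ServerValue.increment(1),
  "stats/daily/2020-05-01/count": firebase.database.ServerValue.increment(1),
});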

Our flow chart with multiple clients looks like this when using increment:

            Client            Server                Client

               +                   +                   +
  increment(1) |                   |                   |
               |                   |                   | increment(1)
               |  (increment, 1)   |                   |
               +--------->---------+   (increment, 1)  |
               |                   |---------<---------+
               +---------<---------+                   |
               |      (ack, 2)     |--------->---------+
               |                   |     (ack, 3)      |
               |                   |                   |
               v                   v                   v

The length of these last two flow charts alone already goes a long way towards explaining why increment is so much faster in this scenario: the increment operation is made for this, so the wire protocol much more closely represents what we're trying to accomplish. And that simplicity leads to a 3x-4x performance difference in my simple test alone, and probably even more in production scenarios.

Of course transactions are still useful, as there are many more atomic operations than just increments/decrements.
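
For example, claiming a resource only if nobody else has claimed it yet still needs a transaction, because the new value depends on more than arithmetic (the path and fields here are just for illustration):

// Sketch: atomically claim a slot only if it is still free. Returning
// undefined aborts the transaction, leaving any existing claim in place.
firebase.database().ref("slots/slot1").transaction(function(current) {
  if (current !== null) return; // already claimed: abort
  return { owner: "uid_123", claimedAt: Date.now() };
});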

answered Sep 18 '22 by Frank van Puffelen