I have a Firebase Cloud Function that is triggered by an onCreate
at a path in my Realtime Database. Yesterday, after doing some basic testing, I began to get "quota exceeded" errors in the Cloud Functions log. What was troubling, though, was that the error kept coming: first every 2 seconds or so, then ramping up to every 8 seconds or so, for about 1.5 hours.
Here's a small segment of the trouble:
I took a look at the documentation covering Retrying Asynchronous Functions and it seems clear that in most cases a function will stop executing and the event will be discarded if an error occurs. The triggering event did not seem to be getting cleared in my case, perhaps because the error I was getting was coming from outside of my function. Here's the full error text:
Error: quota exceeded (DNS resolutions : per 100 seconds); to increase quotas, enable billing in your project at https://console.cloud.google.com/billing?project=myproject. Function cannot be executed.
Yes, I realize that a quota issue is easily solved by moving to the Blaze plan, which I will be doing soon, but before I do, I'd like to understand how to handle this case should it happen in the future. I had to re-deploy my functions to get the error to stop happening. After studying the docs, it sounds as though I also could have deleted the function to stop the error, but neither re-deploying nor deleting functions seems like a great path to take once my app reaches production.
My thought is that I must have been stuck in a retry loop internal to the Firebase Cloud Functions service. None of my logging was being hit (some of which would happen on function start, and some during a function run), plus the message says "Function cannot be executed", which to me means that the error happened before my code was ever executed. Another function triggered by the same event was logging an identical error.
So, to anyone in the know, my questions are as follows:

1. For cases where functions seem to be stuck in a loop, is redeployment or deletion the ONLY path to recovery? What other possible approaches might I have taken?
2. Is there a way I can handle an Error that is coming from outside my implementation? Can an Error like this be raised by the service, or do Errors ONLY originate in my code?
Finally, a few further notes to address follow-up questions some may have:
Yes, the repeating Error did prevent any new invocations of the function in question from taking place. Other functions could still be executed, except for the other one that is triggered by the same event.
Yes, I can recreate the behavior to an extent. If I set the DNS Resolutions quota per 100 seconds to a low value (5, for example), and do a few executions, I get the same error, repeating every 8 seconds or so. Oddly though, when I recreate in this manner, it recovers on its own after about 10-20 throws, seemingly around the time the 100 seconds has elapsed, which makes sense. In the case of the original incident, the errors repeated for more than an hour, at which point I decided to re-deploy, stopping them.
Yes, I do see the most recent "Function execution took ### ms, finished with status 'ok'" message in my logs before the Errors start rolling in. After the Errors are done repeating, I do see the last-in triggering data get processed by the function successfully. This is what caused me to wonder if it was the triggering event that was causing Cloud Functions to keep trying.
No, my function does not write to a location that would retrigger itself.
Yes, I realize I could easily resolve this by moving to the Blaze plan to get a much higher quota. I will be doing that. First though, I'd like to understand the mechanics of what has gone wrong so I can make any available improvements.
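To make the "does not retrigger itself" note above concrete, here is a minimal sketch of the check involved. The helper `wouldRetrigger` is hypothetical (not part of any Firebase SDK); it just illustrates that a write retriggers an onCreate handler only when the write path falls under the handler's trigger path.

```javascript
// Hypothetical helper (not a Firebase SDK function): returns true if a
// write at writePath lands under triggerPath, which would fire an
// onCreate trigger registered at triggerPath all over again.
function wouldRetrigger(triggerPath, writePath) {
    // Normalize away trailing slashes, then test prefix containment.
    const norm = p => p.replace(/\/+$/, "");
    return (norm(writePath) + "/").startsWith(norm(triggerPath) + "/");
}

// The function in question listens at /requests/{pushKey} but writes to
// /responses/{pushKey}, so it cannot loop on its own output:
console.log(wouldRetrigger("/requests", "/responses/abc123"));      // false
console.log(wouldRetrigger("/requests", "/requests/abc123/echo"));  // true
```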
After going back and forth with Firebase support (who were excellent, btw) for the past month, I finally have some answers to my questions. I thought I'd go ahead and answer my own question, hopefully to benefit others who may encounter a similar issue.
First, an important tidbit that I've learned along the way:
In the case of quota errors, the triggering event of an asynchronous Firebase Cloud Function is NOT removed from the event queue. This is different behaviour than other types of errors which DO remove the triggering event from the queue. According to the Firebase engineers, this approach to handling quota errors is by design, primarily to address the case where it is a per 100 second quota that is being hit. In these cases, it may often make sense to retry as a successful run may be < 100 seconds away.
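One practical consequence of this design: an event held in the queue during a quota error will eventually be delivered, so a handler should tolerate redelivery. A sketch of the idea, using a plain object as a stand-in for the /responses/ path (the names here are illustrative, not from the Firebase API):

```javascript
// Stand-in for the /responses/ location in the Realtime Database.
const responses = {};

// Writing with a set() keyed by the event's pushKey is naturally
// idempotent: if the same queued event is delivered more than once,
// the second write simply overwrites the first with identical data.
function handleRequest(pushKey, value) {
    responses[pushKey] = { name: value.name };
    return responses[pushKey];
}

handleRequest("abc123", { name: "Ford Prefect" });
handleRequest("abc123", { name: "Ford Prefect" }); // redelivery: no harm
console.log(Object.keys(responses).length); // 1
```

Appending with push() instead of set() would not have this property: a redelivered event would create a duplicate response.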
So, to answer my specific questions...
For cases where functions seem to be stuck in a loop, is redeployment or deletion the ONLY path to recovery? What other possible approaches might I have taken?
Redeployment or deletion is the only option. If a function is stuck in an Error loop, redeploying the function will clear the event queue, resolving the issue. I was able to test this resolution approach, and it worked just fine. The Firebase engineers were also able to confirm that a partial deploy is perfectly acceptable if only a subset of the functions are encountering quota errors.
This resolution approach strikes me as being a bit kludgy, so hopefully future improvements to the Cloud Functions platform will provide an option for developers to manually clear a function's event queue. Fingers crossed.
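For reference, the Firebase CLI supports both recovery paths directly, so neither requires redeploying the whole project (the function name here matches the MCVE below):

```shell
# Partial deploy: redeploy only the stuck function. Other deployed
# functions are left untouched, and the stuck function's pending
# event queue is cleared.
firebase deploy --only functions:helloWorld

# Or, as a last resort, delete the function outright (also clears
# its event queue).
firebase functions:delete helloWorld
```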
Is there a way I can handle an Error that is coming from outside my implementation? Can an Error like this be raised by the service or do Errors ONLY originate in my code?
Basically, NO. The quota errors originate on the service, not in the deployed code, so they can't be caught and handled in code. For cases like mine, the best readily available option is to make use of the platform's excellent error reporting features to identify issues early and take corrective action.
Some other notes...
For anyone wanting to see this behaviour themselves, here's a copy of the MCVE I provided to Firebase Support which can be used to replicate the behaviour.
index.js
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp(functions.config().firebase);
const database = admin.database();

exports.helloWorld = functions.database.ref("/requests/{pushKey}")
    .onCreate(event => {
        console.log("helloWorld: Triggered with pushKey: ", event.params.pushKey);

        // Say hello
        console.log("Hello, world! Don't panic!");

        let result = {};
        result["name"] = event.data.val().name;

        return database.ref(`/responses/${event.params.pushKey}/`).set(result)
            .then(() => {
                console.log(`Response written to: ${event.params.pushKey}/`);
            })
            .catch((writeError) => {
                console.error(`Response write failed at: ${event.params.pushKey}/`, writeError);
            });
    });
package.json
{
  "name": "functions",
  "description": "Cloud Functions for Firebase",
  "scripts": {
    "lint": "./node_modules/.bin/eslint .",
    "serve": "firebase serve --only functions",
    "shell": "firebase experimental:functions:shell",
    "start": "npm run shell",
    "deploy": "firebase deploy --only functions",
    "logs": "firebase functions:log"
  },
  "dependencies": {
    "firebase-admin": "~5.8.1",
    "firebase-functions": "^0.8.1"
  },
  "devDependencies": {
    "eslint": "^4.12.0",
    "eslint-plugin-promise": "^3.6.0"
  },
  "private": true
}
Input data
{
  "requests" : [ null, {
    "name" : "Ford Prefect"
  }, {
    "name" : "Arthur Dent"
  }, {
    "name" : "Trillian"
  }, {
    "name" : "Zaphod Beeblebrox"
  }, {
    "name" : "Marvin"
  }, {
    "name" : "Slartibartfast"
  }, {
    "name" : "Deep Thought"
  }, {
    "name" : "Zarniwoop"
  }, {
    "name" : "Fenchurch"
  }, {
    "name" : "Babel Fish"
  }, {
    "name" : "Tricia McMillan"
  } ]
}
To test, just use the GCP Console to turn down the quota limit for function invocations per 100 seconds for your Firebase project. Set it to a low value, like 4. Then import the test data and watch the Errors roll in. Several invocations will succeed (not always 4 of them), then a batch of repeating errors will come in until the 100 second quota rolls over, then you'll get more successful results. Interestingly, Google doesn't seem to use a FIFO queue for the events, so if you enter the records manually, don't expect them to necessarily be processed in the order they were entered.
All said, under a paid plan (which I've already upgraded to), the quotas are quite high so it's unlikely I'll crash into this issue again. However, it was still good to learn about how quota errors are handled in case I need to engineer items that may hit a quota in the future.
Thanks again to the great folks at Firebase Support for all the help.