
firebase admin failing to query large items

Using the Firebase Admin SDK to retrieve data from a collection in a Cloud Function fails for large result sets. The sample code I am using to query the collection from the Cloud Function is as follows:

admin.database().ref('transactions').orderByChild('mmyyyy').equalTo(month).once('value');

This call fails when I try to retrieve 10,600 items (I am still trying to figure out why). In the Google console there is this log entry, but nothing else that points me in the right direction:

textPayload:  "Function execution took 18547 ms, finished with status: 'response error'"   

After many failed attempts, I decided to try executing this call on the client using the Firebase SDK, as follows:

result = await firebase.database().ref(`transactions`).orderByChild('mmyyyy').equalTo(month).once('value');

This works perfectly on the client, without error, returning all 17,000 of my items (the size of this JSON is 26 MB).

Why is this the case? Is there some limitation that is not documented?

Note: I increased my Cloud Function memory to 1 GB and its timeout to 5 minutes; it didn't help.
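For reference, a minimal sketch of setting those limits in code with the firebase-functions runWith() option (supported in newer versions of the SDK; the values mirror the configuration above):

const functions = require('firebase-functions');

// Raise the per-invocation memory and timeout for this HTTPS function.
exports.findTransactions = functions
    .runWith({ memory: '1GB', timeoutSeconds: 300 })
    .https.onRequest((req, res) => {
        // ... handler body as in the full sample below
    });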

Here is the full sample code:

const admin = require('firebase-admin');
const functions = require('firebase-functions');
admin.initializeApp(functions.config().firebase);

const cors = require('cors')({
    "origin": "*",
    "methods": "POST,GET",
    "allowedHeaders": "Content-Type,uid,agence,month,course,raceType,raceNumber,status",
    "preflightContinue": false,
    "optionsSuccessStatus": 204
});

function _findTransactions(agence, month, course, raceType, raceNumber, status) {
    // Return the query promise directly; wrapping an existing promise
    // in a new Promise constructor is unnecessary.
    return admin.database()
        .ref('transactions')
        .orderByChild('mmyyyy')
        .equalTo(month)
        .once('value')
        .then((result) => {
            let transactions = result.val();

            //removed business logic

            return transactions;
        })
        .catch((err) => {
            console.log(err);
            throw err;
        });
}

exports.findTransactions = functions.https.onRequest((req, res) => {
    let uid;
    try {
        cors(req, res, () => {
            uid = req.headers.uid;
            let agence = req.headers.agence;
            let month = req.headers.month;
            let course = req.headers.course;
            // Node lowercases incoming header names, so these must be read
            // as 'racetype' and 'racenumber', not 'raceType'/'raceNumber'.
            let raceType = req.headers.racetype;
            let raceNumber = req.headers.racenumber;
            let status = req.headers.status;

            if (req.method !== 'GET') {
                return handleResponse(req, res, 403);
            }

            if (!uid) {
                return handleResponse(req, res, 401);
            }

            _validateUserId(uid, ['central_cashier', 'admin'])
                .then(() => {
                    _findTransactions(agence, month, course, raceType, raceNumber, status)
                        .then((result) => {
                            return handleResponse(req, res, 200, result);
                        }).catch((error) => {
                            return handleResponse(req, res, 500);
                        });
                }).catch((error) => {
                    return handleResponse(req, res, 401);
                });
        });
    } catch (error) {
        return handleError(res, uid, error);
    }
});
asked Oct 01 '17 by user8702936

2 Answers

Your payload is too large and is exceeding the Google Cloud Functions quota, as you suspected: HTTP functions have a documented 10 MB cap on response size, and your 26 MB JSON is well past it.

Two options come to mind:

  • Compress the payload. Gzip the response before sending it to the client; this is easy with the Node.js built-in zlib module (see the sketch after this list).
  • Set up a virtual machine. Virtual machines are not bound to these restrictions.
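
A minimal sketch of the first option, assuming an Express-style (req, res) handler like the one in the question (sendCompressed is a hypothetical helper name):

const zlib = require('zlib');

// Gzip the JSON payload before responding, if the client accepts gzip.
function sendCompressed(req, res, payload) {
    const acceptsGzip = /\bgzip\b/.test(req.headers['accept-encoding'] || '');
    if (!acceptsGzip) {
        return res.status(200).json(payload);
    }
    zlib.gzip(Buffer.from(JSON.stringify(payload)), (err, compressed) => {
        if (err) {
            return res.status(500).end();
        }
        res.set('Content-Type', 'application/json');
        res.set('Content-Encoding', 'gzip');
        res.status(200).send(compressed);
    });
}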
answered Oct 01 '22 by unitario


I did some testing and concluded that Google Cloud Functions (GCF) enforces some kind of timeout or "abort" action when a query results in a large number of results (i.e., many Datastore entities). See my comments attached to this question for some background.

tl;dr: I created my own Express.js web server and ran my GCF code on it.

This is how I tested it: I created an Ubuntu instance with HTTP/HTTPS and the Datastore API enabled. On the instance, I installed Node and Express and got a basic HTTPS server working (a self-signed certificate was fine, since this just tests an API backend service). Then I copy-pasted my GCF code (the function that was failing in GCF) into my minimal Express web server. I pointed my React app at the instance, which triggered a query resulting in over 32,000 Datastore entities. My GCF function runs its query with datastore.runQuery(), which is common.

It took about a minute, but eventually all 32,000 entities were served by Express and loaded in the React app (browser) with no errors.

A basic Express route calls my GCF function:

const express = require('express');
const app = express();

const myCloudFunction = (req, res) => {
  // Inspects req, queries Datastore, and returns the results.
};

// The route simply delegates to the cloud function handler.
app.post('/foo', (req, res) => {
  myCloudFunction(req, res);
});

app.listen(3000);

For this test, my React app just points to https://mydomain.example.com:3000/foo (because my Express server listens on port 3000).

So it seems that GCF is not good enough for my application unless I add pagination to the app (which is on the roadmap; a minimal sketch follows).
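
For completeness, a minimal sketch of what that pagination could look like with @google-cloud/datastore cursors (the 'Transaction' kind and the findTransactionsPage helper are hypothetical; the client passes nextCursor back to fetch the following page):

const { Datastore } = require('@google-cloud/datastore');
const datastore = new Datastore();

// Fetch one page of results plus a cursor for the next page.
async function findTransactionsPage(month, pageSize, pageCursor) {
    let query = datastore
        .createQuery('Transaction')
        .filter('mmyyyy', '=', month)
        .limit(pageSize);
    if (pageCursor) {
        query = query.start(pageCursor);
    }
    const [entities, info] = await datastore.runQuery(query);
    const done = info.moreResults === Datastore.NO_MORE_RESULTS;
    return { items: entities, nextCursor: done ? null : info.endCursor };
}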

answered Oct 01 '22 by Donn Lee