 

How do I download a large collection in Firestore with Python without getting a 503 error?

I'm trying to count the number of documents in a Firestore collection with Python. When I use db.collection('xxxx').stream(), I get the following error:

 503 The datastore operation timed out, or the data was temporarily unavailable.

about halfway through. It was working fine. Here is the code:

    docs = db.collection(u'theDatabase').stream()
    count = 0
    for doc in docs:
        count += 1
    print(count)

Every time, I get the 503 error at about 73,000 records. Does anyone know how to overcome the 20-second timeout?

Michael Vertefeuille asked May 06 '19 19:05



1 Answer

Although Juan's answer works for basic counting, if you need more of the data from Firebase than just the id (a common use case is migrating all of the data without going through GCP), the recursive algorithm will eat your memory.

So I took Juan's code and transformed it to a standard iterative algorithm. Hope this helps someone.

    limit = 1000  # Page size; reduce this if it uses too much of your RAM


    def stream_collection_loop(collection, count, cursor=None):
        while True:
            # Very important: build a fresh page on every pass, so at most `limit`
            # snapshots are held in memory (the recursive version keeps every page alive).
            query = collection.order_by('__name__').limit(limit)
            if cursor:
                query = query.start_after(cursor)
            docs = list(query.stream())

            for doc in docs:
                print(doc.id)
                print(count)
                # The `doc` here is already a `DocumentSnapshot`, so you can call `to_dict()` on it to get the whole document.
                process_data_and_log_errors_if_any(doc)
                count += 1

            # A page shorter than `limit` means the collection is exhausted.
            if len(docs) < limit:
                break
            cursor = docs[-1]  # Resume after the last document of this page

        return count


    stream_collection_loop(db_v3.collection('collection'), 0)
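If you would rather keep the paging separate from the processing, the same loop can be sketched as a generator. This is only a rough sketch, not a tested Firestore client: `iter_in_pages`, `firestore_fetcher` and `fake_fetcher` are names made up for illustration, and the Firestore adapter assumes the same `order_by('__name__')`, `limit`, `start_after` and `stream` calls used above. The in-memory fetcher is only there so the loop can be tried without a database.

```python
from typing import Callable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def iter_in_pages(
    fetch_page: Callable[[Optional[T], int], List[T]],
    page_size: int = 1000,
) -> Iterator[T]:
    """Yield every item, asking for at most `page_size` items per request."""
    cursor: Optional[T] = None
    while True:
        page = fetch_page(cursor, page_size)
        yield from page
        if len(page) < page_size:
            return  # short page: nothing left to fetch
        cursor = page[-1]  # resume after the last item seen


def firestore_fetcher(collection):
    """Hypothetical adapter: wrap a Firestore collection as a `fetch_page` callable."""
    def fetch_page(cursor, page_size):
        query = collection.order_by('__name__').limit(page_size)
        if cursor is not None:
            query = query.start_after(cursor)
        return list(query.stream())
    return fetch_page


def fake_fetcher(items):
    """In-memory stand-in for experiments; assumes `items` has no duplicates."""
    def fetch_page(cursor, page_size):
        start = 0 if cursor is None else items.index(cursor) + 1
        return items[start:start + page_size]
    return fetch_page
```

Counting would then look something like `sum(1 for _ in iter_in_pages(firestore_fetcher(db.collection(u'theDatabase'))))`, with only one page of snapshots in memory at a time, since each request is a separate short query rather than one long-running stream.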
Alec Gerona answered Sep 22 '22 13:09
