
How to store Dataframe data to Firebase Storage?

Given a pandas DataFrame containing some data, what is the best way to store this data in Firebase?

Should I convert the DataFrame to a local file (e.g. .csv, .txt) and then upload it to Firebase Storage, or is it also possible to store the DataFrame directly without conversion? Or is there a better practice?

Update 01/03 - So far I've come up with this solution, which requires writing a csv file locally, reading it back in, uploading it, and then deleting the local file. I doubt this is the most efficient method, so I would like to know if it can be done better and faster.

import os
import firebase_admin
from firebase_admin import storage

cred   = firebase_admin.credentials.Certificate(cert_json)
app    = firebase_admin.initialize_app(cred, config)
bucket = storage.bucket(app=app)

def upload_df(df, data_id):
    """
    Upload a DataFrame as a csv to Firebase Storage
    :return: storage_ref
    """

    # Storage location + extension
    storage_ref = data_id + ".csv"

    # Write the csv locally, under the same name as the storage ref
    df.to_csv(storage_ref)

    # Upload to Firebase Storage
    blob = bucket.blob(storage_ref)
    with open(storage_ref, 'rb') as local_file:
        blob.upload_from_file(local_file)

    # Delete the local copy
    os.remove(storage_ref)

    return storage_ref
asked Dec 21 '18 by JohnAndrews

2 Answers

With python-firebase and to_dict:

postdata = my_df.to_dict()

# Assumes a `firebase` connection object already exists and any
# auth/headers you need are already taken care of.
result = firebase.post('/my_endpoint', postdata, {'print': 'pretty'})
print(result)
# Snapshot info

You can get the data back using the snapshot info and endpoint, and rebuild the DataFrame with from_dict(). The same approach adapts to SQL and JSON, both of which pandas also supports.
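A minimal sketch of that round trip (no Firebase calls here; `postdata` stands in for the dict you would post and later fetch back from the snapshot):

```python
import pandas as pd

# Original frame
my_df = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})

# Serialize to a plain dict, as you would before posting to Firebase
postdata = my_df.to_dict()

# ...post to Firebase, later fetch the same dict back from the endpoint...

# Rebuild the DataFrame from the fetched dict
restored = pd.DataFrame.from_dict(postdata)

print(restored.equals(my_df))  # True
```

One caveat: a real Firebase round trip serializes to JSON, which stringifies the integer index keys, so you may need to convert them back to ints after fetching.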

Alternatively, and depending on where your script executes from, you might consider treating Firebase as a database and using the `db` module from firebase_admin.

As for whether this follows best practice, it's hard to say without knowing more about your use case.

answered Oct 04 '22 by Charles Landau

If you just want to reduce code length and skip the steps of creating and deleting local files, you can use upload_from_string:

import firebase_admin
from firebase_admin import storage

cred   = firebase_admin.credentials.Certificate(cert_json)
app    = firebase_admin.initialize_app(cred, config)
bucket = storage.bucket(app=app)

def upload_df(df, data_id):
    """
    Upload a Dataframe as a csv to Firebase Storage
    :return: storage_ref
    """
    storage_ref = data_id + '.csv'
    blob = bucket.blob(storage_ref)
    blob.upload_from_string(df.to_csv())

    return storage_ref

https://googleapis.github.io/google-cloud-python/latest/storage/blobs.html#google.cloud.storage.blob.Blob.upload_from_string
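The same in-memory approach works in the other direction: download the blob as text and parse it with pandas, no temp file needed. A sketch assuming the bucket setup above (the `download_df` name is mine; `index_col=0` recovers the index that `to_csv()` wrote, and on older google-cloud-storage versions the method is `download_as_string()` instead of `download_as_text()`):

```python
import io
import pandas as pd

def download_df(bucket, storage_ref):
    """Fetch a csv blob from Firebase Storage back into a DataFrame."""
    blob = bucket.blob(storage_ref)
    csv_text = blob.download_as_text()  # blob contents as a str
    return pd.read_csv(io.StringIO(csv_text), index_col=0)
```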

answered Oct 04 '22 by fcsr