Write a Pandas DataFrame to Google Cloud Storage or BigQuery

Hello and thanks for your time and consideration. I am developing a Jupyter Notebook in the Google Cloud Platform / Datalab. I have created a Pandas DataFrame and would like to write this DataFrame to both Google Cloud Storage (GCS) and BigQuery. I have a bucket in GCS and, via the following code, have created the following objects:

import gcp
import gcp.storage as storage

project = gcp.Context.default().project_id
bucket_name = 'steve-temp'
bucket_path = bucket_name
bucket = storage.Bucket(bucket_path)
bucket.exists()

I have tried various approaches based on the Google Datalab documentation but continue to fail. Thanks.

asked Mar 30 '16 by EcoWarrior

1 Answer

Uploading to Google Cloud Storage without writing a temporary file, using only the standard GCS module:

from google.cloud import storage
import os
import pandas as pd

# Only needed if you're running this code locally; on GCP the default
# service-account credentials are picked up automatically.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'/your_GCP_creds/credentials.json'

df = pd.DataFrame(data=[[1, 2, 3], [4, 5, 6]], columns=['a', 'b', 'c'])

client = storage.Client()
bucket = client.get_bucket('my-bucket-name')

# Serialize the DataFrame to CSV in memory and upload it directly as a blob.
bucket.blob('upload_test/test.csv').upload_from_string(df.to_csv(), 'text/csv')
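As a usage note: once uploaded, the file can be read back into pandas directly from the bucket with pd.read_csv('gs://my-bucket-name/upload_test/test.csv'), provided the gcsfs package is installed.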
answered Sep 20 '22 by Theo
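The question also asks about writing to BigQuery, which the answer above does not cover. A minimal sketch using the google-cloud-bigquery client's load_table_from_dataframe (this requires the pyarrow package; the dataset and table names below are hypothetical):

from google.cloud import bigquery
import pandas as pd

df = pd.DataFrame(data=[[1, 2, 3], [4, 5, 6]], columns=['a', 'b', 'c'])

client = bigquery.Client()

# 'my_dataset' must already exist in your project; 'my_table' is created
# (or appended to) by the load job.
job = client.load_table_from_dataframe(df, 'my_dataset.my_table')
job.result()  # Block until the load job completes.

Alternatively, with the pandas-gbq package installed, df.to_gbq('my_dataset.my_table', project_id='my-project') achieves the same in a single call.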