I am trying to understand how to write a multi-line CSV file to Google Cloud Storage; I'm just not following the documentation.
The closest I have found is: Unable to read csv file uploaded on google cloud storage bucket
Example:
from google.cloud import storage
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "<pathtomycredentials>"
a=[1,2,3]
b=['a','b','c']
storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")
blob=bucket.blob("Hummingbirds/trainingdata.csv")
for eachrow in range(3):
    blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]))
That gets you a single line on Google Cloud Storage:
3,c
Clearly each call rewrote the object from scratch, keeping only the last line.
Okay, how about adding a newline delimiter?
for eachrow in range(3):
    blob.upload_from_string(str(a[eachrow]) + "," + str(b[eachrow]) + "\n")
That adds the line break, but again each upload writes from the beginning, replacing what was there.
Can someone illustrate what the right approach is? I could combine all my lines into one string, or write a temp file, but that seems very ugly.
Perhaps something like with open(...) as file?
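As it turns out, combining the lines into one string is the standard approach: each upload_from_string call replaces the whole object, so the CSV has to be assembled first and uploaded exactly once. A minimal sketch reusing the names from the question, with the csv module handling quoting:

import csv
import io
from google.cloud import storage

a = [1, 2, 3]
b = ['a', 'b', 'c']

# Build the complete CSV payload in memory first; upload_from_string
# replaces the object's contents on every call, so upload only once.
output = io.StringIO()
writer = csv.writer(output)
for row in zip(a, b):
    writer.writerow(row)

storage_client = storage.Client()
bucket = storage_client.get_bucket("<mybucketname>")  # placeholder from the question
blob = bucket.blob("Hummingbirds/trainingdata.csv")
blob.upload_from_string(output.getvalue(), content_type="text/csv")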
Please refer to the answer below; hope it helps.
import pandas as pd

data = [['Alex', 'Feb', 10], ['Bob', 'jan', 12]]
df = pd.DataFrame(data, columns=['Name', 'Month', 'Age'])
print(df)
Output

   Name Month  Age
0  Alex   Feb   10
1   Bob   jan   12
Add a row:

row = ['Sally', 'Oct', 15]
df.loc[len(df)] = row
print(df)
Output

    Name Month  Age
0   Alex   Feb   10
1    Bob   jan   12
2  Sally   Oct   15
Write/copy to a GCS bucket using gsutil (the ! prefix runs the shell command from a notebook such as Colab):

df.to_csv('text.csv', index=False)
!gsutil cp 'text.csv' 'gs://BucketName/folderName/'
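Alternatively, if the gcsfs package is installed, pandas can write the CSV straight to the bucket, skipping both the local file and the gsutil step; pandas hands gs:// paths off to gcsfs. A sketch with the same hypothetical bucket and folder names:

# requires: pip install gcsfs
df.to_csv('gs://BucketName/folderName/text.csv', index=False)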
Python code (docs: https://googleapis.dev/python/storage/latest/index.html):

from google.cloud import storage

def upload_to_bucket(bucket_name, blob_path, local_path):
    bucket = storage.Client().bucket(bucket_name)
    blob = bucket.blob(blob_path)
    blob.upload_from_filename(local_path)
    return blob.public_url
# method call
bucket_name = 'bucket-name'  # just the bucket name, without the gs:// prefix
blob_path = 'folder-name/file-name-in-bucket.csv'  # destination object path inside the bucket, including the file name
local_path = 'local_machine_path_where_file_resides'  # path of the local file to upload
upload_to_bucket(bucket_name, blob_path, local_path)
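And to skip the temporary file entirely (the part the question calls ugly), the DataFrame can be serialized to a string in memory and uploaded in one call. A sketch assuming the df and bucket_name above; the destination object path is hypothetical:

from google.cloud import storage

# serialize the DataFrame in memory and upload once; no local file is written
bucket = storage.Client().bucket(bucket_name)
blob = bucket.blob('folderName/text.csv')  # hypothetical destination object
blob.upload_from_string(df.to_csv(index=False), content_type='text/csv')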