I have a zip file containing a relatively large dataset (1Gb) stored in a zip file in Google Cloud Storage instance.
I need to use Notebook hosted in Google Cloud Datalab to access that file and the data contained there. How do I go about this?
Thank you.
Can you try the following?
import pandas as pd
# Path to the object in Google Cloud Storage that you want to copy
sample_gcs_object = 'gs://path-to-gcs/Hello.txt.zip'
# Copy the file from Google Cloud Storage to Datalab
!gsutil cp $sample_gcs_object 'Hello.txt.zip'
# Unzip the file
!unzip 'Hello.txt.zip' 
# Read the file into a pandas DataFrame
pandas_dataframe = pd.read_csv('Hello.txt')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With