 

How to upload and save large data to Google Colaboratory from local drive?

I have downloaded a large image training dataset as a zip file from this Kaggle page:

https://www.kaggle.com/c/yelp-restaurant-photo-classification/data

How do I efficiently achieve the following?

  1. Create a project folder in Google Colaboratory
  2. Upload the zip file to the project folder
  3. Unzip the files

Thanks

EDIT: I tried the code below, but it crashes for my large zip file. Is there a better, more efficient way to do this, where I can just specify the location of the file on my local drive?

from google.colab import files
uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))
GeorgeOfTheRF asked Feb 19 '18 06:02





1 Answer

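!pip install kaggle
import json
import os
import zipfile
# Replace USERNAME and API_KEY with the values from your own Kaggle API token.
api_token = {"username": "USERNAME", "key": "API_KEY"}
# Create the config directory first; open() fails if it does not exist.
os.makedirs('/content/.kaggle', exist_ok=True)
with open('/content/.kaggle/kaggle.json', 'w') as file:
    json.dump(api_token, file)
# Restrict the token file to the current user.
!chmod 600 /content/.kaggle/kaggle.json
# Tell the Kaggle CLI to store downloads under /content.
!kaggle config set -n path -v /content
# Swap in the slug of the competition you need.
!kaggle competitions download -c jigsaw-toxic-comment-classification-challenge
# Extract every downloaded zip archive in place.
os.chdir('/content/competitions/jigsaw-toxic-comment-classification-challenge')
for file in os.listdir():
    if file.endswith('.zip'):
        with zipfile.ZipFile(file, 'r') as zip_ref:
            zip_ref.extractall()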

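There is a minor change to the "kaggle config set" line (line 9 of the code as originally posted), without which I was encountering an error. Source: https://gist.github.com/jayspeidell/d10b84b8d3da52df723beacc5b15cb27. I couldn't add this as a comment because I don't have enough reputation.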
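
For anyone who prefers to keep the file handling in plain Python, here is a minimal sketch of the two non-download steps above: writing the token file with owner-only permissions, and unzipping whatever archives ended up in a folder. The function names write_kaggle_token and extract_zips are hypothetical helpers made up for illustration, not part of the Kaggle package or the gist, and the sketch assumes a Linux runtime such as Colab.

```python
import json
import os
import zipfile


def write_kaggle_token(config_dir, username, key):
    """Write kaggle.json into config_dir and restrict it to the owner."""
    os.makedirs(config_dir, exist_ok=True)
    token_path = os.path.join(config_dir, "kaggle.json")
    with open(token_path, "w") as fh:
        json.dump({"username": username, "key": key}, fh)
    # Same effect as `chmod 600` in the shell version.
    os.chmod(token_path, 0o600)
    return token_path


def extract_zips(folder):
    """Extract every zip archive found directly in folder; return their names."""
    extracted = []
    # The listing is taken once, so files created by extraction are not revisited.
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        # Skip sub-directories and anything that is not a valid zip archive.
        if not os.path.isfile(path) or not zipfile.is_zipfile(path):
            continue
        with zipfile.ZipFile(path) as archive:
            archive.extractall(folder)
        extracted.append(name)
    return extracted
```

In a notebook this would be called as write_kaggle_token('/content/.kaggle', 'USERNAME', 'API_KEY') before the download, and extract_zips(...) on the competition folder afterwards. Checking zipfile.is_zipfile avoids an exception if the folder also contains CSVs or other non-zip files.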

Vikas answered Sep 27 '22 20:09
