
Download file from Kaggle to Google Colab

I want to download the sign language dataset from Kaggle to my Colab.

So far I have always used wget with a direct link to the zip file, for example:

!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/rps.zip \
    -O /tmp/rps.zip

However, when I right-click the download button on Kaggle and choose "Copy link", the URL I get is:

https://www.kaggle.com/datamunge/sign-language-mnist/download

When I open this link in my browser, I am prompted to download a file named 3258_5337_bundle_archive.zip.

So I tried:

!wget --no-check-certificate \
        https://www.kaggle.com/datamunge/sign-language-mnist/download3258_5337_bundle_archive.zip  \
        -O /tmp/kds.zip

and also tried:

 !wget --no-check-certificate \
            https://www.kaggle.com/datamunge/sign-language-mnist/download3258_5337_bundle_archive.zip  \
            -O /tmp/kds.zip

I get as output:

exa

So it does not work: either the file cannot be found, or the returned zip archive is only a few KB instead of the expected 101 MB. Unzipping it also fails.

How can I download this file into my Colab notebook, ideally directly with wget?

Stat Tistician asked Jul 01 '20 08:07


2 Answers

Kaggle recommends using their own API instead of wget or rsync.
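
A likely reason the wget attempt produces a file of only a few KB is that the download link needs a logged-in Kaggle session, so an unauthenticated request gets a small web page back instead of the archive. This is an assumption, not something confirmed in the question. A minimal sketch for checking what was actually saved, where describe_download is a hypothetical helper name:

```python
import zipfile
from pathlib import Path


def describe_download(path):
    """Report whether a downloaded file is a real zip archive or something else."""
    path = Path(path)
    size = path.stat().st_size

    # zipfile.is_zipfile checks the file's structure, not its extension.
    if zipfile.is_zipfile(path):
        return f"{path.name}: valid zip archive, {size} bytes"

    # Read only the first bytes; a login or error page starts with HTML markup.
    with path.open("rb") as fh:
        head = fh.read(200).lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return f"{path.name}: HTML page, not an archive ({size} bytes)"

    return f"{path.name}: not a zip archive ({size} bytes)"
```

In Colab, print(describe_download("/tmp/kds.zip")) would show which of the three cases applies to the file saved by wget.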

First, create an API token for Kaggle. On Kaggle's website, go to "My Account", scroll to the API section, and click "Create New API Token". This downloads a kaggle.json file to your machine.

Then run the following in Google Colab:

from google.colab import files
files.upload() # Browse for the kaggle.json file that you downloaded

# Create the ~/.kaggle directory (-p avoids an error if it already exists),
# copy kaggle.json into it, and restrict the file's permissions.
! mkdir -p ~/.kaggle
! cp kaggle.json ~/.kaggle/
! chmod 600 ~/.kaggle/kaggle.json

# You can check if everything's okay by running this command.
! kaggle datasets list

# Download and unzip sign-language-mnist dataset into '/usr/local'
! kaggle datasets download -d datamunge/sign-language-mnist --path '/usr/local' --unzip

Used info from here: https://www.kaggle.com/general/74235
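
The directory, copy, and chmod steps above can also be done in plain Python, which makes the cell safe to re-run. This is a minimal sketch, assuming the token file contains the "username" and "key" fields that Kaggle writes into kaggle.json; install_kaggle_token is a hypothetical helper name, not part of the Kaggle package.

```python
import json
import os
import shutil
import stat
from pathlib import Path


def install_kaggle_token(src, home=None):
    """Copy a kaggle.json token into <home>/.kaggle and restrict it to mode 600."""
    home = Path(home) if home is not None else Path.home()
    src = Path(src)

    # Fail early if the uploaded file is not a usable token.
    with src.open() as fh:
        token = json.load(fh)
    missing = {"username", "key"} - set(token)
    if missing:
        raise ValueError(f"token file is missing fields: {sorted(missing)}")

    target_dir = home / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    target = target_dir / "kaggle.json"
    shutil.copyfile(src, target)
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)  # equivalent to chmod 600
    return target
```

After files.upload() has placed kaggle.json in the working directory, calling install_kaggle_token("kaggle.json") would put the token where the kaggle command looks for it.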

rchurt answered Oct 21 '22 05:10


This is the simplest way I have found (for a competition, just change datasets to competitions):

import os

# Replace the placeholders with the username and key from your kaggle.json file.
os.environ['KAGGLE_USERNAME'] = "xxxx"
os.environ['KAGGLE_KEY'] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

!kaggle datasets download -d iarunava/happy-house-dataset
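
To avoid pasting the key into a notebook cell, one option is to read both values from the kaggle.json token file instead. This is a minimal sketch, assuming the file holds "username" and "key" fields; export_kaggle_credentials is a hypothetical helper name.

```python
import json
import os


def export_kaggle_credentials(token_path, env=None):
    """Set KAGGLE_USERNAME and KAGGLE_KEY from the fields of a kaggle.json file."""
    env = os.environ if env is None else env
    with open(token_path) as fh:
        token = json.load(fh)
    # The same two variables are set by hand in the snippet above.
    env["KAGGLE_USERNAME"] = token["username"]
    env["KAGGLE_KEY"] = token["key"]
    return env
```

Calling export_kaggle_credentials("kaggle.json") before the !kaggle command keeps the key out of the notebook's source.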

Seb.code answered Oct 21 '22 05:10