Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Google Colab: how to read data from my google drive?

People also ask

How do you import files from Google Drive to Colab?

Once you have completed verification, go to the CSV file in Google Drive, right-click on it and select “Get shareable link”. The link will be copied into your clipboard. Paste this link into a string variable in Colab.


Edit: As of February, 2020, there's now a first-class UI for automatically mounting Drive.

First, open the file browser on the left hand side. It will show a 'Mount Drive' button. Once clicked, you'll see a permissions prompt to mount Drive, and afterwards your Drive files will be present with no setup when you return to the notebook. The completed flow looks like so:

Drive auto mount example

The original answer follows, below. (This will also still work for shared notebooks.)

You can mount your Google Drive files by running the following code snippet:

from google.colab import drive
drive.mount('/content/drive')

Then, you can interact with your Drive files in the file browser side panel or using command-line utilities.

Here's an example notebook


Good news, PyDrive has first class support on CoLab! PyDrive is a wrapper for the Google Drive python client. Here is an example on how you would download ALL files from a folder, similar to using glob + *:

!pip install -U -q PyDrive
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# choose a local (colab) directory to store the data.
local_download_path = os.path.expanduser('~/data')
try:
  os.makedirs(local_download_path)
except: pass

# 2. Auto-iterate using the query syntax
#    https://developers.google.com/drive/v2/web/search-parameters
file_list = drive.ListFile(
    {'q': "'1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk' in parents"}).GetList()

for f in file_list:
  # 3. Create & download by id.
  print('title: %s, id: %s' % (f['title'], f['id']))
  fname = os.path.join(local_download_path, f['title'])
  print('downloading to {}'.format(fname))
  f_ = drive.CreateFile({'id': f['id']})
  f_.GetContentFile(fname)


with open(fname, 'r') as f:
  print(f.read())

Notice that the arguments to drive.ListFile is a dictionary that coincides with the parameters used by Google Drive HTTP API (you can customize the q parameter to be tuned to your use-case).

Know that in all cases, files/folders are encoded by id's (peep the 1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk) on Google Drive. This requires that you search Google Drive for the specific id corresponding to the folder you want to root your search in.

For example, navigate to the folder "/projects/my_project/my_data" that is located in your Google Drive.

Google Drive

See that it contains some files, in which we want to download to CoLab. To get the id of the folder in order to use it by PyDrive, look at the url and extract the id parameter. In this case, the url corresponding to the folder was:

https://drive.google.com/drive/folders/1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk

Where the id is the last piece of the url: 1SooKSw8M4ACbznKjnNrYvJ5wxuqJ-YCk.


Thanks for the great answers! Fastest way to get a few one-off files to Colab from Google drive: Load the Drive helper and mount

from google.colab import drive

This will prompt for authorization.

drive.mount('/content/drive')

Open the link in a new tab-> you will get a code - copy that back into the prompt you now have access to google drive check:

!ls "/content/drive/My Drive"

then copy file(s) as needed:

!cp "/content/drive/My Drive/xy.py" "xy.py"

confirm that files were copied:

!ls

What I have done is first:

from google.colab import drive
drive.mount('/content/drive/')

Then

%cd /content/drive/My Drive/Colab Notebooks/

After I can for example read csv files with

df = pd.read_csv("data_example.csv")

If you have different locations for the files just add the correct path after My Drive


Most of the previous answers are a bit(Very) complicated,

from google.colab import drive
drive.mount("/content/drive", force_remount=True)

I figured out this to be the easiest and fastest way to mount google drive into CO Lab, You can change the mount directory location to what ever you want by just changing the parameter for drive.mount. It will give you a link to accept the permissions with your account and then you have to copy paste the key generated and then drive will be mounted in the selected path.

force_remount is used only when you have to mount the drive irrespective of whether its loaded previously.You can neglect this when parameter if you don't want to force mount

Edit: Check this out to find more ways of doing the IO operations in colab https://colab.research.google.com/notebooks/io.ipynb