 

Pandas: How to read a public CSV file from Google Drive?

Tags:

python

pandas

I searched for similar questions about reading a CSV from a URL, but I could not find a way to read a CSV file stored on Google Drive.

My attempt:

import pandas as pd

url = 'https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
dfs = pd.read_html(url)

How can we read this file in pandas?

Related links:

  • Pandas read_csv from url
  • https://pandas.pydata.org/pandas-docs/version/0.22/io.html#io-read-html
BhishanPoudel asked Jun 15 '19 15:06


5 Answers

Using pandas

import pandas as pd

url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
file_id=url.split('/')[-2]
dwn_url='https://drive.google.com/uc?id=' + file_id
df = pd.read_csv(dwn_url)
print(df.head())
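
The url.split('/')[-2] step above relies on the share link having exactly the /file/d/<id>/view shape. As a minimal sketch, assuming the link is either that form or the uc?id=<id> form, the ID can be pulled out a little more defensively. The helper names drive_file_id and drive_download_url are hypothetical, not part of pandas or any Google library.

```python
import re
from urllib.parse import parse_qs, urlparse


def drive_file_id(share_url: str) -> str:
    """Return the file ID embedded in a Google Drive link (hypothetical helper)."""
    parsed = urlparse(share_url)
    # Form 1: https://drive.google.com/file/d/<id>/view?usp=sharing
    match = re.search(r"/d/([A-Za-z0-9_-]+)", parsed.path)
    if match:
        return match.group(1)
    # Form 2: https://drive.google.com/uc?id=<id>
    ids = parse_qs(parsed.query).get("id")
    if ids:
        return ids[0]
    raise ValueError(f"no Drive file ID found in {share_url!r}")


def drive_download_url(share_url: str) -> str:
    """Build the uc?export=download URL used in the answers here."""
    return "https://drive.google.com/uc?export=download&id=" + drive_file_id(share_url)


share = "https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing"
print(drive_download_url(share))
```

The resulting string can then be passed to pd.read_csv in the same way as dwn_url.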

Using pandas and requests

import pandas as pd
import requests
from io import StringIO

url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'

file_id = url.split('/')[-2]
dwn_url='https://drive.google.com/uc?export=download&id=' + file_id
url2 = requests.get(dwn_url).text
csv_raw = StringIO(url2)
df = pd.read_csv(csv_raw)
print(df.head())

output

      sex   age state  cheq_balance  savings_balance  credit_score  special_offer
0  Female  10.0    FL       7342.26          5482.87           774           True
1  Female  14.0    CA        870.39         11823.74           770           True
2    Male   0.0    TX       3282.34          8564.79           605           True
3  Female  37.0    TX       4645.99         12826.76           608           True
4    Male   NaN    FL           NaN          3493.08           551          False
BhishanPoudel answered Oct 18 '22 04:10


To read a CSV file from Google Drive, you can do the following:

import pandas as pd

url = 'https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
path = 'https://drive.google.com/uc?export=download&id='+url.split('/')[-2]
df = pd.read_csv(path)

I think this is the easiest way to read CSV files from Google Drive. It assumes the "Anyone with the link" sharing option is enabled for the file in Google Drive.
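
If the "Anyone with the link" option is not enabled, the download URL may return a web page rather than the CSV, and read_csv will then fail with a confusing parse error or produce a nonsensical frame. The following is a rough sketch of a guard against that, assuming the unexpected response is an ordinary HTML document. The names looks_like_html and read_drive_csv are hypothetical, and the check is a heuristic only.

```python
from io import StringIO


def looks_like_html(payload: str) -> bool:
    """Heuristic: does the response body start like an HTML page?"""
    head = payload.lstrip()[:200].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def read_drive_csv(download_url: str):
    """Fetch a CSV from a Drive download URL, failing early on HTML responses."""
    # Third-party imports are kept local so the helper above has no dependencies.
    import pandas as pd
    import requests

    response = requests.get(download_url, timeout=30)
    response.raise_for_status()  # surface HTTP errors such as 403 or 404
    if looks_like_html(response.text):
        raise ValueError("received an HTML page instead of CSV; check the sharing settings")
    return pd.read_csv(StringIO(response.text))
```

Calling read_drive_csv(path) with the path built above would then either return the DataFrame or raise a clearer error.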

Samir Mughal answered Oct 18 '22 03:10


I would recommend using the following code:

import pandas as pd
import requests
from io import StringIO

url = requests.get('https://doc-0g-78-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/5otus4mg51j69f99n47jgs0t374r46u3/1560607200000/09837260612050622056/*/0B6GhBwm5vaB2ekdlZW5WZnppb28?e=download')
csv_raw = StringIO(url.text)
dfs = pd.read_csv(csv_raw)

Hope this helps.

Nazim Kerimbekov answered Oct 18 '22 02:10


Simply change the Google Drive URL to the uc?id= form, and then pass it to the read_csv function. In this example:

import pandas as pd

url = 'https://drive.google.com/uc?id=0B6GhBwm5vaB2ekdlZW5WZnppb28'
dfs = pd.read_csv(url)
rusiano answered Oct 18 '22 02:10


The other answers are great for reading a publicly accessible file, but if you are trying to read a private file that has been shared with an email account, you may want to consider using PyDrive.

There are many ways to authenticate (OAuth, a GCP service account, etc.). Once authenticated, reading a CSV can be as simple as getting the file ID and fetching its contents:

from io import StringIO

import pandas as pd
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

# Assuming authentication has been performed and stored in a variable called gauth,
# and that file_id holds the ID of the shared CSV file
drive = GoogleDrive(gauth)

# Reference the file directly by its ID
gdrive_csv_file = drive.CreateFile({'id': file_id})

# Fetch the contents as text and wrap them in a file-like object for pandas
input_csv = StringIO(gdrive_csv_file.GetContentString())

df = pd.read_csv(input_csv)
arredond answered Oct 18 '22 02:10