
How to read CSV file from GitHub using pandas





I'm trying to read a CSV file that's on GitHub with Python using pandas. I have looked all over the web and tried some solutions that I found on this website, but they do not work. What am I doing wrong?

I have tried this:

import pandas as pd

url = 'https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv'
df = pd.read_csv(url,index_col=0)
#df = pd.read_csv(url)

taga asked Mar 19 '19 11:03


3 Answers

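You need to provide the URL of the raw content. The github.com/.../blob/... address returns GitHub's HTML page for the file, not the CSV data itself. Try this: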

import pandas as pd

url = 'https://raw.githubusercontent.com/lukes/ISO-3166-Countries-with-Regional-Codes/master/all/all.csv'
df = pd.read_csv(url, index_col=0)


               alpha-2           ...            intermediate-region-code
name                             ...                                    
Afghanistan         AF           ...                                 NaN
Åland Islands       AX           ...                                 NaN
Albania             AL           ...                                 NaN
Algeria             DZ           ...                                 NaN
American Samoa      AS           ...                                 NaN
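
The name column ends up as the index because of index_col=0. Here is a small offline sketch of that behaviour. It assumes a hand-made three-row sample built from rows shown on this page (not the full file), so it runs without network access:

```python
import io

import pandas as pd

# Hand-made sample for illustration only; the real file has more rows and columns.
sample_csv = """name,alpha-2,alpha-3
Afghanistan,AF,AFG
Albania,AL,ALB
Algeria,DZ,DZA
"""

# index_col=0 tells pandas to use the first column ("name") as the row index.
df = pd.read_csv(io.StringIO(sample_csv), index_col=0)

print(df.index.name)                # name
print(df.loc["Albania", "alpha-3"])  # ALB
```

Leaving out index_col=0 would keep name as an ordinary column and give the rows a default integer index instead.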
Alderven answered Oct 19 '22 05:10


Add ?raw=true at the end of the GitHub URL to get the raw file link.

In your case,

import pandas as pd

url = 'https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv?raw=true'
df = pd.read_csv(url,index_col=0)


               alpha-2 alpha-3  country-code     iso_3166-2   region  \
Afghanistan         AF     AFG             4  ISO 3166-2:AF     Asia   
Åland Islands       AX     ALA           248  ISO 3166-2:AX   Europe   
Albania             AL     ALB             8  ISO 3166-2:AL   Europe   
Algeria             DZ     DZA            12  ISO 3166-2:DZ   Africa   
American Samoa      AS     ASM            16  ISO 3166-2:AS  Oceania   

                     sub-region intermediate-region  region-code  \
Afghanistan       Southern Asia                 NaN        142.0   
Åland Islands   Northern Europe                 NaN        150.0   
Albania         Southern Europe                 NaN        150.0   
Algeria         Northern Africa                 NaN          2.0   
American Samoa        Polynesia                 NaN          9.0   

                sub-region-code  intermediate-region-code  
Afghanistan                34.0                       NaN  
Åland Islands             154.0                       NaN  
Albania                    39.0                       NaN  
Algeria                    15.0                       NaN  
American Samoa             61.0                       NaN 

Note: This works only with GitHub links and not with GitLab or Bitbucket links.
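
If you build such URLs in code, appending the parameter by hand can produce a malformed link when the URL already carries a query string. Here is a minimal sketch using only the standard library. The helper name with_raw_flag is hypothetical and not part of pandas or any GitHub tooling:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_raw_flag(url: str) -> str:
    """Return the URL with raw=true set in its query string."""
    parts = urlsplit(url)
    # Keep any existing query parameters and set (or overwrite) "raw".
    query = dict(parse_qsl(parts.query))
    query["raw"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


blob_url = "https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv"
print(with_raw_flag(blob_url))
```

The resulting string can then be passed to pd.read_csv in the same way as the hard-coded URL above.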

Krishnakanth Allika answered Oct 19 '22 05:10


You can copy the URL and change two things:

  1. Remove the "blob" path segment
  2. Replace github.com with raw.githubusercontent.com
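
The two changes above can be sketched as a small helper. This is only an illustration: the function name github_blob_to_raw is hypothetical, and it assumes the URL follows the github.com/<owner>/<repo>/blob/<branch>/<path> layout.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url: str) -> str:
    """Turn a github.com 'blob' page URL into a raw.githubusercontent.com URL."""
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        return url  # not a github.com page URL (or already raw): leave untouched

    segments = parts.path.lstrip("/").split("/")
    # Assumed layout: <owner>/<repo>/blob/<branch>/<path...>
    if len(segments) < 5 or segments[2] != "blob":
        return url

    del segments[2]  # step 1: remove "blob"
    # Step 2: swap the host. Any query string or fragment is dropped here.
    return urlunsplit(
        (parts.scheme, "raw.githubusercontent.com", "/" + "/".join(segments), "", "")
    )


blob_url = "https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv"
raw_url = github_blob_to_raw(blob_url)
print(raw_url)
```

The converted URL can be handed to pd.read_csv directly. URLs that do not match the assumed layout are returned unchanged rather than guessed at.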

For instance this link:


Works this way:

import pandas as pd

Nicolas Gervais answered Oct 19 '22 07:10
