Pandas read_csv from url

People also ask

Can pandas read URL?

Yes. Once you have found the remote URL path, it is simple to read the data directly into a pandas DataFrame.

How do I read a CSV file from a website in Python?

Use the pandas.read_csv() function to download a CSV file from a URL in Python. The read_csv() function from the pandas module can read CSV files from different sources and store the result in a pandas DataFrame.

As of pandas 0.19.2, you can pass the URL directly to read_csv:

import pandas as pd

url="https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"
c=pd.read_csv(url)

UPDATE: From pandas 0.19.2 onward you can pass the URL directly to read_csv(), although that will fail if the URL requires authentication.

For older pandas versions, or if you need authentication, or if you want more control over HTTP error handling:

Use pandas.read_csv with a file-like object as the first argument.

If you want to read the CSV from a string, you can use io.StringIO.
The URL https://github.com/cs109/2014_data/blob/master/countries.csv returns an HTML page, not raw CSV. Use the URL behind the Raw link on the GitHub page instead, which is https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv

Example:

import pandas as pd
import io
import requests
url="https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"
s=requests.get(url).content
c=pd.read_csv(io.StringIO(s.decode('utf-8')))
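If the URL requires authentication, or you want explicit HTTP error handling, one option is to make the request yourself and hand the text to read_csv. The following is only a minimal sketch: the helper name fetch_csv, the http_get parameter, the timeout value, and the use of HTTP basic auth are all assumptions made for illustration.

```python
import io

import pandas as pd
import requests


def fetch_csv(url, auth=None, timeout=10.0, http_get=requests.get):
    """Download a CSV over HTTP and parse it into a DataFrame.

    fetch_csv is a hypothetical helper; http_get exists only so the
    function can be exercised without network access.
    """
    # auth may be a (user, password) tuple for HTTP basic auth, or None.
    response = http_get(url, auth=auth, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx instead of parsing an error page.
    response.raise_for_status()
    # response.text is already decoded to str, so StringIO can wrap it.
    return pd.read_csv(io.StringIO(response.text))
```

You would call it as c = fetch_csv(url, auth=("user", "password")), where the credentials are placeholders. Calling raise_for_status() first means a login page or a 404 body is never silently parsed as CSV.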

Notes:

In Python 2.x, the string-buffer object was StringIO.StringIO.

As I commented, you need to wrap the data in a StringIO object. With requests, .content returns bytes, so you have to decode it first: c = pd.read_csv(io.StringIO(s.decode("utf-8"))). If you use .text instead, the response is already a str and you can pass s as is:

s = requests.get(url).text
c = pd.read_csv(io.StringIO(s))
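The difference between bytes and text can be seen without any network access. The sketch below uses a hard-coded bytes value as a stand-in for requests.get(url).content (the sample rows are illustrative) and shows two ways of handing it to read_csv.

```python
import io

import pandas as pd

# Stand-in for requests.get(url).content, which is bytes (assumed sample data).
raw_bytes = b"Country,Region\nAlgeria,AFRICA\nAngola,AFRICA\n"

# Option 1: decode to str, then wrap it in a text buffer.
from_text = pd.read_csv(io.StringIO(raw_bytes.decode("utf-8")))

# Option 2: wrap the bytes in a binary buffer and let pandas decode them.
from_bytes = pd.read_csv(io.BytesIO(raw_bytes), encoding="utf-8")

# Both routes should produce the same DataFrame.
print(from_text.equals(from_bytes))
```

Passing the undecoded bytes to io.StringIO raises a TypeError, which is the error the decode step avoids.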

A simpler approach is to pass the URL of the raw data directly to read_csv. It accepts a URL as well as a file-like object, so you don't need requests at all:

c = pd.read_csv("https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv")

print(c)

Output:

                              Country         Region
0                             Algeria         AFRICA
1                              Angola         AFRICA
2                               Benin         AFRICA
3                            Botswana         AFRICA
4                             Burkina         AFRICA
5                             Burundi         AFRICA
6                            Cameroon         AFRICA
..................................

From the docs:

filepath_or_buffer :

string or file handle / StringIO. The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. For instance, a local file could be file://localhost/path/to/table.csv
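As a small illustration of the file scheme mentioned in the docs, the sketch below writes a temporary CSV and reads it back through a file:// URL. The sample data and the file name are assumptions. Path.as_uri() produces a URL with an empty host (file:///...) rather than the file://localhost/... form quoted above, and that this is accepted is my understanding of current pandas behaviour, not something the quoted docs promise.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "table.csv"
    csv_path.write_text("Country,Region\nAlgeria,AFRICA\n", encoding="utf-8")

    # Path.as_uri() builds a file:// URL with an empty host,
    # for example file:///tmp/.../table.csv on Linux.
    file_url = csv_path.as_uri()
    local = pd.read_csv(file_url)

print(local)
```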

The problem you're having is that what ends up in the variable s is not CSV but an HTML page. To get the raw CSV, you have to change the URL to:

'https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv'
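If you often start from the GitHub page URL, the rewrite to the raw URL can be done in code. This is a sketch only: github_blob_to_raw is a hypothetical helper, and it assumes the URL has the layout /<user>/<repo>/blob/<branch>/<path> with a branch name that contains no slash.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url):
    """Convert a github.com 'blob' page URL to its raw.githubusercontent.com form.

    Hypothetical helper; assumes the layout /<user>/<repo>/blob/<branch>/<path>.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {url}")
    # Drop the "blob" segment and switch to the raw-content host.
    user, repo, _blob, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([user, repo, *rest])


raw_url = github_blob_to_raw(
    "https://github.com/cs109/2014_data/blob/master/countries.csv"
)
print(raw_url)
```

The result can then be passed straight to pd.read_csv.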

Your second problem is that read_csv expects a file name, URL, or file-like object rather than the CSV text itself; wrapping the text in StringIO from the io module solves this. The third problem is that requests.get(url).content returns bytes; using requests.get(url).text instead gives you a decoded string.

The end result is this code:

from io import StringIO

import pandas as pd
import requests
url='https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv'
s=requests.get(url).text

c=pd.read_csv(StringIO(s))

output:

>>> c.head()
    Country  Region
0   Algeria  AFRICA
1    Angola  AFRICA
2     Benin  AFRICA
3  Botswana  AFRICA
4   Burkina  AFRICA

# The github.com/.../blob/... page returns HTML, so use the raw URL; the file is comma-separated.
url = "https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"
c = pd.read_csv(url)

To import data from a URL in pandas, you can also use read_table. It splits on tabs by default, so pass sep="," for a comma-separated file:

import pandas as pd
train = pd.read_table("https://urlandfile.com/dataset.csv", sep=",")
train.head()

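If the URL contains backslashes that Python would otherwise treat as escape sequences, put r before the string literal to make it a raw string. The prefix only changes how the literal is written; it has no effect on how the data is fetched or parsed.

import pandas as pd
train = pd.read_table(r"https://urlandfile.com/dataset.csv", sep=",")
train.head()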
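Two details of the snippets above can be checked offline: what the r prefix does to a URL, and how read_table treats comma-separated data. The sketch below uses a small in-memory sample (assumed data, not the real dataset) in place of a download.

```python
import io

import pandas as pd

# Without backslashes, a raw string literal is identical to a normal one.
same = r"https://urlandfile.com/dataset.csv" == "https://urlandfile.com/dataset.csv"

# Assumed sample data standing in for a downloaded comma-separated file.
csv_text = "Country,Region\nAlgeria,AFRICA\n"

# read_table splits on tabs by default, so comma-separated data lands in one column.
one_column = pd.read_table(io.StringIO(csv_text))
# Passing sep="," makes read_table split the columns the way read_csv does.
two_columns = pd.read_table(io.StringIO(csv_text), sep=",")

print(same, one_column.shape, two_columns.shape)
```

So for a .csv URL, either call read_csv or pass sep="," to read_table. The r prefix is only needed when the literal contains backslashes.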


Tags:

python

pandas

csv

request
