
Reading data from a CSV file online in Python 3

Tags: python, csv

I'm just testing something out as practice. I have a huge CSV file online at https://raw.github.com/datasets/gdp/master/data/gdp.csv, and I want to read all the data and put it into a table so I can analyse it and make tables. The code I have so far was put together from other StackOverflow questions and other websites, but when the data is read and then immediately printed out again, it comes out letter by letter, so I get:

['C']
['o']
['u']
['n']
['t']
['r']
['y']
[' ']
['N']
['a']
['m']
['e']
['', '']
['C']
['o']
['u']
['n']
['t']
['r']
['y']
[' ']
['C']
['o']
['d']
['e']
['', '']
['Y']
['e']
['a']
['r']
['', '']
['V']
['a']
['l']
['u']
['e']
[]
[]
['A']
['r']
['a']
['b']
[' ']
['W']
['o']
['r']
['l']
['d']
['', '']

My code so far is:

import csv
import urllib.request

url = "https://raw.github.com/datasets/gdp/master/data/gdp.csv"
webpage = urllib.request.urlopen(url)
datareader = csv.reader(webpage.read().decode('utf-8'))
data = []
for row in datareader:
    data.append(row)

for row in data:
    print(row)

How can I change it so that it actually reads line by line and then splits each line up into different variables? I did this before using

payRollNumber, salary, jobTitle, otherNames, \
               surname = line.strip().split(',')

And I can apply this after I've got the rows. Any ideas?

asked Jan 25 '14 14:01 by DonnellyOverflow




1 Answer

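csv.reader() iterates over whatever you pass it and treats each item as one line. Iterating over a single string yields one character at a time, which is why every row you get back holds a single letter. You need to split the decoded CSV data into lines before passing it to csv.reader():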

datareader = csv.reader(webpage.read().decode('utf-8').splitlines())

The csv.reader() then takes care of the rest for you.
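
The difference can be reproduced without any download. The following is a minimal sketch that assumes a short in-memory string (the variable `text`, built from the first two lines of the file) in place of the decoded response:

```python
import csv

# Assumed stand-in for webpage.read().decode('utf-8'); nothing is fetched.
text = "Country Name,Country Code,Year,Value\nArab World,ARB,1968,32456179321.45\n"

# A bare string is iterated character by character, so each "row" is one letter.
by_char = list(csv.reader(text))

# A list of lines gives the reader one complete record per item.
by_line = list(csv.reader(text.splitlines()))

print(by_char[:3])
print(by_line)
```

The first print shows the single-letter rows from the question; the second shows one list of four fields per line.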

You could also have io.TextIOWrapper() take care of reading, decoding and line-handling for you:

import csv
import io
import urllib.request

url = "https://raw.github.com/datasets/gdp/master/data/gdp.csv"
webpage = urllib.request.urlopen(url)
datareader = csv.reader(io.TextIOWrapper(webpage, encoding='utf-8', newline=''))
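
To try the wrapper approach offline, a sketch along these lines may help. It assumes an `io.BytesIO` object (here called `fake_response`, a made-up name) as a stand-in for the object returned by `urlopen()`, and it passes `encoding` explicitly, because the default encoding of `TextIOWrapper` depends on the platform:

```python
import csv
import io

# Assumed sample bytes standing in for the HTTP response; nothing is downloaded.
fake_response = io.BytesIO(
    b"Country Name,Country Code,Year,Value\n"
    b"Arab World,ARB,1968,32456179321.45\n"
)

# TextIOWrapper decodes the byte stream and yields text lines on iteration.
# newline='' leaves line-ending handling to the csv module.
wrapped = io.TextIOWrapper(fake_response, encoding="utf-8", newline="")
rows = list(csv.reader(wrapped))

print(rows)
```

Because the wrapper reads lazily, this form should also avoid holding the whole decoded text in memory as one string before parsing.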

There is little point in looping over the reader and adding rows to a list; you could just do:

data = list(datareader)

instead, but if all you want to do is print out the columns, loop directly over the reader and do so:

datareader = csv.reader(io.TextIOWrapper(webpage, encoding='utf-8', newline=''))
for row in datareader:
    print(row)

Either way, with splitting the lines yourself or using TextIOWrapper, the code now produces:

['Country Name', 'Country Code', 'Year', 'Value']
['Arab World', 'ARB', '1968', '32456179321.45']
['Arab World', 'ARB', '1969', '35797666653.6002']
['Arab World', 'ARB', '1970', '39062044200.4362']
['Arab World', 'ARB', '1971', '45271917893.3429']
['Arab World', 'ARB', '1972', '54936622019.8224']
['Arab World', 'ARB', '1973', '69564884441.8264']
['Arab World', 'ARB', '1974', '132123836511.468']
['Arab World', 'ARB', '1975', '147666389454.913']
['Arab World', 'ARB', '1976', '182208407088.856']
# ... etc. ...
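
For the second part of the question, splitting each row into separate variables, one possible sketch is to skip the header with next() and unpack each row in the for statement. The variable names and the int/float conversions below are illustrative assumptions, and the sample text stands in for the downloaded data:

```python
import csv

# Assumed sample text in place of the decoded download; nothing is fetched.
text = (
    "Country Name,Country Code,Year,Value\n"
    "Arab World,ARB,1968,32456179321.45\n"
    "Arab World,ARB,1969,35797666653.6002\n"
)

reader = csv.reader(text.splitlines())
header = next(reader)  # consume the header row so it is not treated as data

records = []
# Each row is a list of four strings, so it can be unpacked directly.
for country_name, country_code, year, value in reader:
    records.append((country_name, country_code, int(year), float(value)))

print(header)
print(records[0])
```

Unpacking this way raises a ValueError if a row does not have exactly four fields, which can be a useful early warning about malformed lines.
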
answered Sep 20 '22 20:09 by Martijn Pieters