Extract csv file specific columns to list in Python

I'm trying to plot the latitude and longitude values of specific storms on a map using matplotlib, Basemap, and Python. My problem is that when I try to extract the latitude, longitude, and name of each storm, I keep getting errors between lines 41-44, where I extract the columns into lists. Could someone please help me figure this out? Thanks in advance.

Here is what the file looks like:

1957,AUDREY,HU, 21.6N, 93.3W
1957,AUDREY,HU,22.0N,  93.4W
1957,AUDREY,HU,22.6N,  93.5W
1957,AUDREY,HU,23.2N,  93.6W

I want the list to look like the following:

latitude = [21.6N,22.0N,23.4N]
longitude = [93.3W, 93.5W,93.8W]
name = ["Audrey","Audrey"]

Here's what I have so far:

data = np.loadtxt('louisianastormb.csv',dtype=np.str,delimiter=',',skiprows=1)
'''print data'''

data = np.loadtxt('louisianastormb.csv',dtype=np.str,delimiter=',',skiprows=0)

f= open('louisianastormb.csv', 'rb')
reader = csv.reader(f, delimiter=',')
header = reader.next()
zipped = zip(*reader)

latitude = zipped[3]
longitude = zipped[4]
names = zipped[1]
x, y = m(longitude, latitude)

Here's the last error message/traceback I received:

Traceback (most recent call last):
  File "/home/darealmzd/lstorms.py", line 42, in <module>
    header = reader.next()
_csv.Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

mikez1 asked Oct 21 '13 04:10


2 Answers

This looks like a problem with the line endings in your file. Since you're already using all these other scientific packages, you may as well use pandas to read the CSV, which is both more robust and more useful than the csv module alone:

import pandas
colnames = ['year', 'name', 'city', 'latitude', 'longitude']
data = pandas.read_csv('test.csv', names=colnames)

If you want your lists as in the question, you can now do:

names = data.name.tolist()
latitude = data.latitude.tolist()
longitude = data.longitude.tolist()
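Note that the extracted values are still strings such as ' 21.6N' and '93.3W', while the question's call to m(longitude, latitude) needs numbers. The following is a minimal sketch of one way to convert them, assuming the usual convention that south and west are negative; the helper name to_signed_degrees is hypothetical and not part of pandas or Basemap.

```python
def to_signed_degrees(text):
    """Convert a string like ' 21.6N' or '93.3W' to a signed float.

    Assumption: the last character is a hemisphere letter (N, S, E or W),
    and south/west are represented as negative values.
    """
    text = text.strip()
    if not text:
        raise ValueError("empty coordinate")
    hemisphere = text[-1].upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("unknown hemisphere in %r" % text)
    value = float(text[:-1])
    return -value if hemisphere in ("S", "W") else value


# Sample values taken from the file shown in the question.
latitude = [to_signed_degrees(v) for v in [" 21.6N", "22.0N"]]
longitude = [to_signed_degrees(v) for v in [" 93.3W", "  93.4W"]]
```

The same helper could be applied to the lists produced by tolist() before passing them on for plotting.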
chthonicdaemon answered Sep 24 '22 05:09


A standard-lib version (no pandas)

This assumes that the first row of the CSV file contains the headers.

import csv

# open the file in universal line ending mode
with open('test.csv', 'rU') as infile:
  # read the file as a dictionary for each row ({header : value})
  reader = csv.DictReader(infile)
  data = {}
  for row in reader:
    for header, value in row.items():
      try:
        data[header].append(value)
      except KeyError:
        data[header] = [value]

# extract the variables you want
names = data['name']
latitude = data['latitude']
longitude = data['longitude']
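For readers on Python 3, where the 'rU' mode is deprecated and was removed in 3.11, here is a rough equivalent of the same idea. It is a sketch, not the answerer's code: it assumes the file has a header row with the column names used above, the function name read_columns is made up for illustration, and skipinitialspace=True is added only because the sample rows have spaces after some commas.

```python
import csv
from collections import defaultdict


def read_columns(path):
    """Return {header: [values, ...]} for a CSV whose first row is the header."""
    columns = defaultdict(list)
    # newline='' lets the csv module handle \n, \r and \r\n line endings itself,
    # which is the Python 3 counterpart of universal-newline mode for csv files.
    with open(path, newline='') as infile:
        # skipinitialspace drops the blanks that follow some commas in the data
        reader = csv.DictReader(infile, skipinitialspace=True)
        for row in reader:
            for header, value in row.items():
                columns[header].append(value)
    return dict(columns)
```

Usage would then look like data = read_columns('test.csv'), followed by names = data['name'] and so on, as in the answer above.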
Ben Southgate answered Sep 21 '22 05:09