
KeyError when using DictReader()

Tags: python, csv

I have a series of .src files that I am trying to input into a dictionary using DictReader(). The files look like the following (just the header and the first row):

SRC V2.0.. ........Time Id Event T Conf .Northing ..Easting ...Depth Velocity .NN_Err .EE_Err .DD_Err .NE_Err .ND_Err .ED_Err Ns Nu uSt ....uMag Nt tSt ....tMag .MomMag SeiMoment ...Energy ...Es/Ep .SourceRo AspRadius .StaticSD AppStress DyStressD MaxDispla PeakVelPa PeakAccPa PSt
07-30-2010 07:43:56.543 ND     0 e 0.00    152.54    746.45  1686.31     6000   11.76   11.76   11.76    0.00    0.00    0.00 30  0 num    -9.90 30 utm    -3.21   -1.12 2.06e+007 2.22e+000 20.93    6.08e+000 0.00e+000 3.83e+004 1.49e+003 0.00e+000 1.52e-005 1.50e-003 0.00e+000   1

Anyways, the following is my code:

import csv

Time = {}
Northing = {}
source_file = open(NNSRC, 'rb')
for line in csv.DictReader(source_file, delimiter = '\t'):
    Time = line['........Time'].strip()
    Northing = line['.Northing'].strip()

print Time, Northing

It gives me the following error:

Traceback (most recent call last):
  File "C:\Python26\Lib\site-packages\xy\NNFindStages.py", line 101, in <module>
    Time = line['........Time'].strip()
KeyError: '........Time'

How can I account for the strange way the header is formatted in the file without changing the file itself?

Any help is greatly appreciated!

user1620716 asked Sep 21 '12 16:09


1 Answer

Your header line is not using tabs.

When I recreate your data without tabs, the line returned by the csv module contains just one (long) key. If I recreate it with actual tabs, then I get:

>>> source_file = open('out.csv', 'rb')
>>> reader = csv.DictReader(source_file, delimiter = '\t')
>>> line = reader.next()
>>> len(line)
37
>>> line.keys()
['Id', '..Easting', '.NE_Err', 'uSt', 'SeiMoment', 'MaxDispla', 'tSt', 'Ns', 'Nt', 'Nu', '.Northing', '.DD_Err', '...Energy', '....uMag', 'V2.0..', 'DyStressD', 'SRC', 'PeakAccPa', '.SourceRo', '........Time', '.EE_Err', 'T', 'Velocity', 'PeakVelPa', 'AspRadius', '...Depth', 'PSt', '....tMag', '.MomMag', 'AppStress', '...Es/Ep', '.ED_Err', 'Event', '.ND_Err', 'Conf', '.StaticSD', '.NN_Err']
>>> line['........Time']
'ND'
>>> line['.Northing']
'746.45'

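Note that these values did not need stripping, but only because my recreated file has no padding around the tabs. The csv module does not trim whitespace itself (skipinitialspace=True only drops whitespace immediately after a delimiter), so keep the .strip() calls if your real fields are padded.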
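To see the single-key effect for yourself, here is a minimal Python 3 sketch (the session above is Python 2). The `sample` string is a shortened, space-separated stand-in for the question's file, not the real data, and `lookup_failed` is just an illustrative name.

```python
import csv
import io

# Hypothetical, shortened stand-in for the .src file: columns are separated
# by spaces, and there is not a single tab character in it.
sample = (
    "SRC V2.0.. ........Time Id Event T Conf .Northing\n"
    "07-30-2010 07:43:56.543 ND     0 e 0.00    152.54\n"
)

reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
fieldnames = reader.fieldnames  # the whole header line becomes one field name
row = next(reader)

try:
    row["........Time"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(len(fieldnames), lookup_failed)
```

With no tab in the header, DictReader has nothing to split on, so the only key in each row is the entire header line and the lookup by '........Time' fails.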

You can read your header separately, clean that up, then deal with the rest of your data with the csv module:

import csv

source_file = open(NNSRC, 'rb')
header = source_file.readline()
source_file.seek(len(header))  # reset read buffer

# Split the header on whitespace and drop the '.' padding from each name
headers = [h.strip('.') for h in header.split()]
headers = ['Date'] + headers[2:]  # Replace ['SRC', 'V2.0'] with a Date field instead
for line in csv.DictReader(source_file, fieldnames=headers, delimiter = '\t'):
    pass  # process line, e.g. line['Time'], line['Northing']

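The above code reads the header line separately, splits it on whitespace, and strips the padding periods to give more workable column keys such as Time and Northing. The .seek() call then positions the file just past the header (resetting the read buffer as a side effect). Because fieldnames is passed explicitly, DictReader treats every remaining line as data instead of consuming the first one as a header.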
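For anyone on Python 3, the same idea might look roughly like the sketch below. It is an adaptation, not the code above verbatim: `read_rows` and `sample` are hypothetical names, the sample is a shortened stand-in for the real file, and it assumes (as the code above does) that the data rows really are tab-delimited even though the header is not.

```python
import csv
import io

# Hypothetical stand-in for the file: a space-separated, dot-padded header
# followed by a tab-delimited data row (the layout assumed above).
sample = (
    "SRC V2.0.. ........Time Id Event T Conf .Northing\n"
    "07-30-2010\t07:43:56.543\tND\t0\te\t0.00\t152.54\n"
)

def read_rows(stream):
    """Yield one dict per data row, keyed by cleaned-up header names."""
    header = stream.readline()
    names = [h.strip(".") for h in header.split()]
    names = ["Date"] + names[2:]  # 'SRC', 'V2.0' -> a single Date column
    # fieldnames is given, so DictReader reads every remaining line as data
    for row in csv.DictReader(stream, fieldnames=names, delimiter="\t"):
        yield row

rows = list(read_rows(io.StringIO(sample)))
print(rows[0]["Time"], rows[0]["Northing"])
```

With a real file you would pass an object opened in text mode with newline='' in place of the StringIO. In Python 3 text mode no seek() is needed after readline(). If the data rows turn out to be space-padded as well, splitting each line with str.split() and zipping it with the cleaned names is a simpler route than the csv module.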

Martijn Pieters answered Oct 29 '22 00:10