
python how to trim trailing spaces in csv DictReader keys

Tags:

python

I am using python (2.6) csv DictReader. My input file has a header line where the column names have trailing spaces:

colname1,      colname2     ,col3, etc.
XX, YY, ZZ

The returned dict object has keys() = ['colname1', 'colname2 ', 'col3']

Is there an option to trim leading and trailing spaces from the keys?

--edit

The problem arises in processing by key names:

import csv

with open(fname) as f:
   r = csv.DictReader(f)
   for row in r:
      print "processing", row["column1"], row["column2"]

The files are database dumps, and the dump program is way too smart: it adjusts the output column width depending on the data, which means different selects produce different column widths and different key lengths. Sometimes I must use row['column2 '] and sometimes pad or reduce the spaces. Ouch!

Dinesh asked Dec 11 '14 20:12



4 Answers

Just read the first line manually and pass it along to the DictReader.

import csv

with open('file.csv') as fh:
    # Python 2: read the header line ourselves and strip each column name
    header = [h.strip() for h in fh.next().split(',')]
    reader = csv.DictReader(fh, fieldnames=header)

Wolph answered Sep 28 '22 03:09


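You can register a custom dialect in the csv module. Note that skipinitialspace only drops whitespace immediately after a delimiter, so it takes care of leading spaces but not trailing ones: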

csv.register_dialect('MyDialect', quotechar='"', skipinitialspace=True, quoting=csv.QUOTE_NONE, lineterminator='\n', strict=True)

then use the dialect when creating the DictReader:

my_reader = csv.DictReader(trip_file, dialect='MyDialect')

The csv module documentation lists all the dialect options.
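
To check what this setting does to the header from the question, here is a minimal Python 3 sketch. The sample string is made up from the question's example, and skipinitialspace is passed straight to DictReader as a keyword argument instead of going through a registered dialect, which should behave the same way.

```python
import csv
import io

# Hypothetical sample modelled on the header in the question.
sample = "colname1,      colname2     ,col3\nXX, YY, ZZ\n"

reader = csv.DictReader(io.StringIO(sample), skipinitialspace=True)
fieldnames = reader.fieldnames

# The spaces before "colname2" are gone, the ones after it are still there.
print([repr(name) for name in fieldnames])

# Stripping the names afterwards covers the trailing side as well.
clean_names = [name.strip() for name in fieldnames]
print(clean_names)
```

So the dialect on its own probably handles only half of the problem; the key names still need a strip() somewhere.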

klucar answered Sep 28 '22 03:09


Python3 version

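import csv

# newline='' is what the csv docs recommend when opening files for the csv module
with open('file.csv', newline='') as fh:
    header = [h.strip() for h in fh.readline().split(',')]
    reader = csv.DictReader(fh, fieldnames=header)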
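
A variation that lets DictReader read the header itself and then overwrites the names, sketched for Python 3. It assumes that assigning to reader.fieldnames is allowed, which works in CPython but is not spelled out in the documentation, and the in-memory sample is made up from the question.

```python
import csv
import io

# Hypothetical sample modelled on the header in the question.
sample = io.StringIO("colname1,      colname2     ,col3\nXX, YY, ZZ\n")

reader = csv.DictReader(sample)
# Accessing fieldnames reads the header row; assigning replaces the keys
# used for every following row.
reader.fieldnames = [name.strip() for name in reader.fieldnames]

rows = list(reader)
print(list(rows[0].keys()))
```

Only the keys are cleaned here; the values keep whatever padding the dump program wrote.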

Luis answered Sep 28 '22 05:09


Following in the vein of other answers, but why not use the CSV reader for that header row?

header = [h.strip() for h in next(csv.reader(f))]
reader = csv.DictReader(f, fieldnames=header)
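
Wrapped up as a small helper, the same idea might look like the Python 3 sketch below. The function name stripped_dict_reader and the sample data are hypothetical, and any extra keyword arguments are simply forwarded to both csv.reader and csv.DictReader.

```python
import csv
import io


def stripped_dict_reader(f, **kwargs):
    """Return a DictReader over f whose keys have no surrounding whitespace.

    Hypothetical helper: reads the header row with csv.reader so quoted
    column names are parsed properly, then hands the rest to DictReader.
    """
    header = [name.strip() for name in next(csv.reader(f, **kwargs))]
    return csv.DictReader(f, fieldnames=header, **kwargs)


# Made-up sample modelled on the header in the question.
sample = io.StringIO("colname1,      colname2     ,col3\nXX, YY, ZZ\n")

rows = list(stripped_dict_reader(sample, skipinitialspace=True))
print(rows[0]["colname2"])
```

Using csv.reader for the header means a quoted column name containing a comma would not be split in the wrong place, which a plain split(',') cannot guarantee. An empty file would raise StopIteration from next(), so real code may want to guard against that.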

Kevin answered Sep 28 '22 05:09