
Special case to grab the headers for a DictReader in Python

Normally the csv.DictReader will use the first line of a .csv file as the column headers, i.e. the keys to the dictionary:

If the fieldnames parameter is omitted, the values in the first row of the csvfile will be used as the fieldnames.

However, I am faced with something like this for my first line:

#Format: header1 header2 header3 ...etc.

The #Format: needs to be skipped, as it is not a column header. I could do something like:

column_headers = ['header1', 'header2', 'header3']
reader = csv.DictReader(my_file, delimiter='\t', fieldnames=column_headers)

But I would rather have the DictReader handle this, for two reasons:

  1. There are a lot of columns

  2. The column names may change over time, and this is a quarterly-run process.

Is there some way to have the DictReader still use the first line as the column headers, but skip that first #Format: word? Or really any word that starts with a # would probably suffice.

Houdini asked Aug 31 '25 05:08



1 Answer

Since DictReader wraps an open file, you could read the first line of the file yourself, parse the headers from it (headers = my_file.readline().rstrip('\n').split(delimiter)[1:], or something like that; readline() keeps the trailing newline, so strip it before splitting), and then pass them to DictReader() as the fieldnames argument. The DictReader constructor does not rewind the file, so it starts at the second line and will not read the header line back in as data.
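A minimal sketch of that approach, assuming the #Format: marker is separated from the first real header by the same delimiter as the other columns (the question does not show this explicitly). The function name dict_reader_skipping_marker and the sample data are hypothetical, made up for illustration:

```python
import csv
import io


def dict_reader_skipping_marker(f, delimiter="\t"):
    """Build a csv.DictReader from the first line of f, dropping a leading '#...' field.

    Hypothetical helper: assumes the marker (e.g. '#Format:') is its own
    delimiter-separated field at the start of the header line.
    """
    # readline() keeps the line ending, so strip it before splitting.
    first_line = f.readline().rstrip("\r\n")
    fields = first_line.split(delimiter)
    # Drop only the first field, and only if it looks like a '#' marker.
    if fields[0].startswith("#"):
        fields = fields[1:]
    # The file position is now at the second line, so DictReader sees data rows only.
    return csv.DictReader(f, fieldnames=fields, delimiter=delimiter)


# Usage with an in-memory stand-in for the real file (made-up sample data).
sample = io.StringIO(
    "#Format:\theader1\theader2\theader3\n"
    "a\tb\tc\n"
    "d\te\tf\n"
)
reader = dict_reader_skipping_marker(sample)
rows = list(reader)
print(reader.fieldnames)  # ['header1', 'header2', 'header3']
```

Because the headers are read from the file on every run, nothing needs to be updated when the column names change. A plain split() does not handle quoted header names that contain the delimiter; if that can happen, parsing the first line with csv.reader would be the safer choice.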

JAB answered Sep 02 '25 19:09
