
Python csv DictReader with optional header

import csv
import io


FIELDNAMES = ['one', 'two', 'three']


def print_data_rows(csvfile):
    reader = csv.DictReader(csvfile, fieldnames=FIELDNAMES)
    for row in reader:
        print(row)


headerless = r'''
1,2,3
'''
print_data_rows(io.StringIO(headerless.strip()))



headerful = r'''
one,two,three
1,2,3
'''
print_data_rows(io.StringIO(headerful.strip()))

I would like the output to be

{'one': '1', 'two': '2', 'three': '3'}
{'one': '1', 'two': '2', 'three': '3'}

but actually the output is

{'one': '1', 'two': '2', 'three': '3'}
{'one': 'one', 'two': 'two', 'three': 'three'}
{'one': '1', 'two': '2', 'three': '3'}

because DictReader does not skip the header row when it exists (with fieldnames passed explicitly, it treats every row as data).

How can I skip the header row if it exists?

asked May 11 '26 at 14:05 by Nils


1 Answer

You may be able to use a csv.Sniffer here.

The has_header method peeks at the data and uses heuristics to determine whether a header is present (refer to the doc for the exact logic).

Note that for the example data shown in the question, the sniffer heuristic reports a header for both headerless and headerful, which is wrong for headerless. That's probably a consequence of your sample data having only a single row: if I add a second numeric row 4,5,6 to the input data, the Sniffer.has_header method works as expected.
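Since the heuristic can misfire on very small inputs, and the expected field names are already known in the question, one alternative is to skip guessing and compare the first row against FIELDNAMES directly. This is a different technique from the Sniffer, shown only as a minimal sketch; the function name iter_rows_skipping_known_header is made up for illustration, and it assumes a header, when present, matches FIELDNAMES exactly.

```python
import csv
import io

FIELDNAMES = ['one', 'two', 'three']


def iter_rows_skipping_known_header(csvfile):
    """Yield data rows, dropping the first row if it repeats FIELDNAMES.

    Hypothetical helper: it does not use csv.Sniffer, it only compares
    the first parsed row with the field names we already expect.
    """
    reader = csv.DictReader(csvfile, fieldnames=FIELDNAMES)
    # With explicit fieldnames, a header line is parsed as a row that
    # maps every field name to itself.
    header_row = dict(zip(FIELDNAMES, FIELDNAMES))
    for index, row in enumerate(reader):
        if index == 0 and row == header_row:
            continue  # the first row is the header, not data
        yield row


for text in ('1,2,3', 'one,two,three\n1,2,3'):
    for row in iter_rows_skipping_known_header(io.StringIO(text)):
        print(row)
```

The trade-off is that a file whose first data row really contains the values one,two,three would lose that row, and a header with different spelling or casing would not be recognised. It never rewinds the stream, so it also works on streams that are not seekable.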

A basic Sniffer-based implementation, which assumes csvfile is seekable, could look like this:

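def print_data_rows(csvfile):
    # Peek at the start of the stream, then rewind so the reader sees every row.
    sample = csvfile.read(1024)
    csvfile.seek(0)
    reader = csv.DictReader(csvfile, fieldnames=FIELDNAMES)
    # With explicit fieldnames the header is parsed as data, so skip it by hand.
    if csv.Sniffer().has_header(sample):
        next(reader)
    for row in reader:
        print(row)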

It's easy to adapt if your stream isn't seekable: just buffer the lines you sample and replay them ahead of the rest of the stream.
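As a rough sketch of that adaptation, the version below never calls seek. It buffers the first few lines (more than one, since the heuristic needs at least one data row to compare against), sniffs them, and then chains them back in front of the remaining stream. The name iter_rows_sniffed and the sample_lines parameter are assumptions for illustration, and the input is assumed to yield lines that keep their line endings, as iterating a file object does.

```python
import csv
import io
import itertools

FIELDNAMES = ['one', 'two', 'three']


def iter_rows_sniffed(lines, sample_lines=20):
    """Yield data rows from an iterable of text lines without seeking.

    Hypothetical helper: sample_lines is an arbitrary buffer size.
    """
    lines = iter(lines)
    # Buffer the start of the stream so it can be sniffed and replayed.
    head = list(itertools.islice(lines, sample_lines))
    if not head:
        return  # empty input, nothing to yield
    has_header = csv.Sniffer().has_header(''.join(head))
    # csv readers accept any iterable of strings, so replay the buffered
    # lines first and then continue with the rest of the stream.
    reader = csv.DictReader(itertools.chain(head, lines), fieldnames=FIELDNAMES)
    if has_header:
        next(reader)  # drop the header row that was parsed as data
    yield from reader


for row in iter_rows_sniffed(io.StringIO('one,two,three\n1,2,3\n4,5,6\n')):
    print(row)
```

The same caveat about very small inputs still applies, because the decision is still made by Sniffer.has_header; the sketch only removes the need for a seekable stream.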

answered May 13 '26 at 03:05 by wim


