import csv
import io

FIELDNAMES = ['one', 'two', 'three']

def print_data_rows(csvfile):
    reader = csv.DictReader(csvfile, fieldnames=FIELDNAMES)
    for row in reader:
        print(row)

headerless = r'''
1,2,3
'''
print_data_rows(io.StringIO(headerless.strip()))

headerful = r'''
one,two,three
1,2,3
'''
print_data_rows(io.StringIO(headerful.strip()))
I would like the output to be
{'one': '1', 'two': '2', 'three': '3'}
{'one': '1', 'two': '2', 'three': '3'}
but actually the output is
{'one': '1', 'two': '2', 'three': '3'}
{'one': 'one', 'two': 'two', 'three': 'three'}
{'one': '1', 'two': '2', 'three': '3'}
because DictReader does not skip the header row when it exists: with explicit fieldnames, it treats every row, including the first, as data.
How can I skip the header row if it exists?
You may be able to use a csv.Sniffer here.
The has_header method peeks at the data and uses heuristics to determine whether a header is present (see the documentation for the exact logic).
Note that with the example data shown in the question, the sniffer heuristic reports a header for both headerless and headerful, which is wrong for the former. That's probably a consequence of your sample data having only a single row. If I add a second numeric row 4,5,6 to the input data, the Sniffer.has_header method works as expected.
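To make the effect of the sample size concrete, here is a small check of has_header on the question's data with and without the extra 4,5,6 row. The variable names are made up for illustration, and the result depends on the sniffer's heuristics, so treat it as a sketch rather than a guarantee.

```python
import csv

sniffer = csv.Sniffer()

# Samples named here for illustration only.
samples = {
    # Single-row data, as in the question.
    "one_row_headerless": "1,2,3",
    "one_row_headerful": "one,two,three\n1,2,3",
    # The same data with a second numeric row appended.
    "two_row_headerless": "1,2,3\n4,5,6",
    "two_row_headerful": "one,two,three\n1,2,3\n4,5,6",
}

# Ask the sniffer about each sample and collect the answers.
results = {name: sniffer.has_header(text) for name, text in samples.items()}

for name, found in results.items():
    print(f"{name}: has_header={found}")
```

With only one row the sniffer has nothing to compare the first row against, so it reports a header in both cases. Once a second numeric row is present, it can tell the numeric first row apart from the textual one.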
A basic implementation which assumes csvfile is seekable could look like this:
def print_data_rows(csvfile):
    reader = csv.DictReader(csvfile, fieldnames=FIELDNAMES)
    sniffer = csv.Sniffer()
    sample = csvfile.read(1024)
    csvfile.seek(0)
    if sniffer.has_header(sample):
        next(reader)
    for row in reader:
        print(row)
If your stream isn't seekable, this is easy to adapt: buffer the first line or lines you read for the sample, then hand them back to the reader ahead of the rest of the stream.
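One possible adaptation for a non-seekable stream is sketched below. It buffers the first few lines, sniffs them, and chains them back in front of the remaining lines. The function name iter_data_rows and the sample_lines parameter are hypothetical, and the sketch assumes the input yields lines that still carry their newline terminators, as a text file object does.

```python
import csv
import itertools

FIELDNAMES = ['one', 'two', 'three']

def iter_data_rows(lines, sample_lines=20):
    """Yield dict rows from any iterable of lines, skipping a sniffed header.

    Hypothetical helper: it never calls seek(), so it also works on pipes,
    sockets, or generators.
    """
    lines = iter(lines)
    # Buffer the first few lines; they double as the sniffer's sample.
    head = list(itertools.islice(lines, sample_lines))
    if not head:
        return  # empty input, nothing to yield
    try:
        has_header = csv.Sniffer().has_header(''.join(head))
    except csv.Error:
        # The sniffer could not work out the dialect; assume no header.
        has_header = False
    # Replay the buffered lines ahead of whatever is left in the stream.
    reader = csv.DictReader(itertools.chain(head, lines), fieldnames=FIELDNAMES)
    if has_header:
        next(reader, None)
    yield from reader

# A generator stands in for a non-seekable stream.
stream = (line for line in ["one,two,three\n", "1,2,3\n", "4,5,6\n"])
for row in iter_data_rows(stream):
    print(row)
```

Buffering more than one line matters here, because the sniffer needs at least one data row after a possible header to compare against.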