I have a file that has an obnoxious preface to the header. So it looks like this:
Review performed by:
Meeting:
Person:
Number:
Code:
Confirmation
Tab Separated Header Names That I Want To Use
I want to skip past everything and use the tab sep header names for my code. This is what I have so far:
reader = csv.DictReader(CSVFile)
for i in range(14): #trying to skip the first 14 rows
    reader.next()
for row in reader:
    print(row)
    if args.nextCode:
        tab = (row["Tab"])
        sep = int((row["Separated"]))
This code gets this error:
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 104, in next
    row = self.reader.next()
StopIteration
I tried printing the rows to see where I was in the file, and I changed range(14) to range(5), but when I print the row, I get this:
{'Review performed by:': 'Tab/tSeparated/tHeader/tNames/tThat/tI/tWant/tTo/tUse'}
Traceback (most recent call last):
  File "program.py", line 396, in <module>
    main()
  File "program.py", line 234, in main
    tab = (row["Tab"])
KeyError: 'Tab'
So I am not really sure the right way to skip those top lines. Any help would be appreciated.
When reading a CSV file with Python's csv module, you can skip the first line by calling next() on the reader (or on the file object) before iterating.
With pandas, passing skiprows=[0] to read_csv skips the first line while reading the CSV file.
The DictReader class operates like a regular reader but maps each row it reads into a dictionary. The keys for the dictionary can be passed in with the fieldnames parameter or inferred from the first row of the CSV file.
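As a minimal sketch of the two ways DictReader can get its keys, the snippet below reads the same in-memory text twice. The sample data and the names col1 and col2 are made up for illustration and are not from the question's file.

```python
import csv
import io

# Hypothetical sample data, not taken from the original file.
data = "name,age\nalice,30\nbob,25\n"

# Keys inferred from the first row of the input.
inferred = list(csv.DictReader(io.StringIO(data)))

# Keys passed explicitly: the first row is then treated as ordinary data.
explicit = list(csv.DictReader(io.StringIO(data), fieldnames=["col1", "col2"]))

print(inferred[0])
print(explicit[0])
```

Note that when fieldnames is given, nothing is consumed as a header, so the first line of the input comes back as a data row.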
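A csv.DictReader takes its headers from the first line of the file it is given (it reads that line the first time a row or the fieldnames are requested). Therefore it uses Review performed by: as the header row, and your loop then skips the next 14 rows. If the file has fewer rows than that, the skipping loop runs off the end, which is the StopIteration you saw.
Instead, skip the lines before creating the DictReader, and tell it the file is tab-delimited: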
for i in range(14):
    next(CSVFile)  # built-in next() works on Python 2.6+ and 3
reader = csv.DictReader(CSVFile, delimiter='\t')
...
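For a version that can be run as-is, here is a small sketch that uses an in-memory stand-in for the file. The sample text is modelled on the preface shown in the question, the data rows are invented, and PREFACE_LINES is a hypothetical name whose value you would adjust to match the real file.

```python
import csv
import io

# Stand-in for the real file: 6 preface lines, then a tab-separated header.
# The contents are illustrative only.
sample = (
    "Review performed by:\n"
    "Meeting:\n"
    "Person:\n"
    "Number:\n"
    "Code:\n"
    "Confirmation\n"
    "Tab\tSeparated\tHeader\n"
    "a\t1\tx\n"
    "b\t2\ty\n"
)
PREFACE_LINES = 6  # assumption: set this to the real number of preface lines

csv_file = io.StringIO(sample)
for _ in range(PREFACE_LINES):
    next(csv_file)  # discard one preface line

# The reader now sees the tab-separated header as its first line.
reader = csv.DictReader(csv_file, delimiter="\t")
rows = list(reader)
for row in rows:
    print(row["Tab"], int(row["Separated"]))
```

With a real file, open it with newline="" (on Python 3) and use it in place of csv_file.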
You could wrap CSVFile with an itertools.islice iterator object to slice off the lines of the preface when creating the DictReader, instead of providing the file directly to the constructor.
This works because the csv.reader constructor will accept "any object which supports the iterator protocol and returns a string each time its __next__() method is called" as its first argument, according to the csv docs. This also applies to csv.DictReader objects because they're implemented via an underlying csv.reader instance.
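To make that point concrete, here is a small sketch in which the csv readers are fed a plain list and a generator rather than a file. The data and the generate_lines helper are hypothetical and exist only for this illustration.

```python
import csv

# A plain list of strings stands in for a file object.
lines = ["a\tb\tc", "1\t2\t3"]
rows_from_list = list(csv.reader(lines, delimiter="\t"))


def generate_lines():
    """Hypothetical generator that yields one line of text at a time."""
    yield "x,y"
    yield "10,20"


# DictReader accepts the same kinds of input as csv.reader.
rows_from_gen = list(csv.DictReader(generate_lines()))

print(rows_from_list)
print(rows_from_gen)
```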
Note how the header line pulled from the iterator with next(iterator) is split and supplied to the csv.DictReader as its fieldnames argument (so the field names are not taken from the first line the reader sees).
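import csv
import itertools

iterator = itertools.islice(CSVFile, 14, None)  # Skip the 14 preface lines.
fieldnames = next(iterator).rstrip('\r\n').split('\t')  # split on tabs only
for row in csv.DictReader(iterator, fieldnames, delimiter='\t'):
    pass  # process row ...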
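The same islice approach can be tried without the original file. The sketch below uses made-up sample text with three preface lines (an assumption for brevity, not the real count).

```python
import csv
import io
import itertools

# Illustrative input: 3 preface lines, a tab-separated header, 2 data rows.
sample = (
    "preface 1\n"
    "preface 2\n"
    "preface 3\n"
    "Tab\tSeparated\tHeader\n"
    "a\t1\tx\n"
    "b\t2\ty\n"
)
csv_file = io.StringIO(sample)

# islice skips the first 3 lines lazily; nothing is read until next() is called.
lines = itertools.islice(csv_file, 3, None)

# The first line that survives the slice is the header.
fieldnames = next(lines).rstrip("\r\n").split("\t")

# The remaining lines are data rows, so fieldnames is passed explicitly.
rows = list(csv.DictReader(lines, fieldnames=fieldnames, delimiter="\t"))
print(rows)
```

Splitting the header on "\t" rather than on any whitespace keeps column names that contain spaces in one piece.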