import csv
with open('thefile.csv', 'rb') as f:
    data = list(csv.reader(f))
import collections
counter = collections.defaultdict(int)
for row in data:
    counter[row[10]] += 1
with open('/pythonwork/thefile_subset11.csv', 'w') as outfile:
    writer = csv.writer(outfile)
    for row in data:
        if counter[row[10]] >= 504:
            writer.writerow(row)
This code reads thefile.csv, makes changes, and writes results to thefile_subset1.
However, when I open the resulting csv in Microsoft Excel, there is an extra blank line after each record!
Is there a way to make it not put an extra blank line?
I just checked: Python's CSV parser ignores empty lines. I guess that's reasonable. Yes, I agree an empty line within a quoted field means a literal empty line.
The way Python handles newlines on Windows can result in blank lines appearing between rows when using csv.writer. In Python 2, opening the file in binary mode disables universal newlines and the data is written properly.
In Python 2, open outfile with mode 'wb' instead of 'w'. The csv.writer writes \r\n into the file directly. If you don't open the file in binary mode, it will write \r\r\n because on Windows text mode will translate each \n into \r\n.
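The doubling described above can be reproduced on any platform. The following is a minimal Python 3 sketch, not part of the original question: the helper name written_bytes is hypothetical, and passing newline='\r\n' to io.TextIOWrapper is used here only as a stand-in for what Windows text mode does to each \n.

```python
import csv
import io


def written_bytes(newline):
    """Write one CSV row through a text layer and return the raw bytes."""
    raw = io.BytesIO()
    # newline='\r\n' imitates Windows text mode: every '\n' written is
    # translated to '\r\n'. newline='' switches that translation off.
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    # csv.writer ends each row with '\r\n' by default.
    csv.writer(text).writerow(["a", "b"])
    text.flush()
    return raw.getvalue()


translated = written_bytes("\r\n")   # translation on: '\r\n' becomes '\r\r\n'
untranslated = written_bytes("")     # translation off: '\r\n' is kept as written

print(repr(translated))
print(repr(untranslated))
```

The extra \r in the translated output is what Excel displays as a blank line after each record.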
In Python 3 the required syntax changed: the csv module now works with text mode 'w', but also needs the newline='' (empty string) parameter to suppress Windows line translation.
# Python 2
with open('/pythonwork/thefile_subset11.csv', 'wb') as outfile:
    writer = csv.writer(outfile)

# Python 3
with open('/pythonwork/thefile_subset11.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
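For completeness, here is one way the whole script from the question might look in Python 3 with the fix applied. This is a sketch under assumptions: the function name write_frequent_rows and its parameters are hypothetical, and collections.Counter is swapped in for the defaultdict(int) loop. The column index 10 and the threshold 504 are the values from the question.

```python
import collections
import csv


def write_frequent_rows(src_path, dst_path, column=10, min_count=504):
    """Copy rows whose value in `column` occurs at least `min_count` times."""
    # newline='' on both files leaves line-ending handling to the csv module.
    with open(src_path, newline="") as infile:
        rows = list(csv.reader(infile))

    # Count how often each value appears in the chosen column.
    counts = collections.Counter(row[column] for row in rows)

    with open(dst_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        writer.writerows(
            row for row in rows if counts[row[column]] >= min_count
        )
```

Calling write_frequent_rows('thefile.csv', '/pythonwork/thefile_subset11.csv') would then correspond to the original script, with each row terminated by a single \r\n.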