I'm using Python 2.7.12. With this code snippet I save a UTF-8 CSV file, writing the BOM (byte order mark) at the beginning of the file:
import codecs
import csv
outputFile = open("test.csv", "wb")
outputFile.write(codecs.BOM_UTF8)
fieldnames = ["a", "b"]
writer = csv.DictWriter(outputFile, fieldnames, delimiter=";")
writer.writeheader()
row = dict([])
for i in range(10):
    row["a"] = str(i).encode("utf-8")
    row["b"] = str(i*2).encode("utf-8")
    writer.writerow(row)
outputFile.close()
I want to load that csv file:
import codecs
import csv
inputFile = open("test.csv", "rb")
reader = csv.DictReader(inputFile, delimiter=";")
for row in reader:
    print row["a"]
inputFile.close()
The above code fails with KeyError: 'a'. If I print the row keys, this is how they look: [u'\ufeffa', u'b']. The BOM has been embedded into the key a. What am I doing wrong?
Step 1: To read rows in Python, first load the CSV file into a file object with open().
Step 2: Create a reader object by passing that file object to csv.reader().
Step 3: Use a for loop on the reader object to get each row.
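The three steps can be sketched as follows in Python 3. This is a minimal illustration, not code from the thread: the helper name read_rows, the sample file and its contents are assumptions made so the sketch runs on its own.

```python
import csv
import os
import tempfile


def read_rows(path, delimiter=";"):
    """Return every row of a delimited file as a list of lists of strings."""
    # Step 1: open the file; newline="" lets the csv module handle line endings.
    with open(path, newline="", encoding="utf-8") as csv_file:
        # Step 2: wrap the file object in a reader.
        reader = csv.reader(csv_file, delimiter=delimiter)
        # Step 3: iterate over the reader to collect each row.
        return [row for row in reader]


# Build a small sample file (hypothetical data) so the sketch is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(sample_path, "w", newline="", encoding="utf-8") as sample:
    sample.write("a;b\r\n0;0\r\n1;2\r\n")

rows = read_rows(sample_path)
```

Each row comes back as a list of strings, with the header as the first row.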
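The byte order mark (BOM) is a marker sometimes placed at the very start of a CSV file; it is not a separate first line. The ÿþ characters are what the UTF-16 little-endian BOM (bytes FF FE) looks like when it is misread as Latin-1 or Windows-1252; the UTF-8 BOM (bytes EF BB BF) shows up as ï»¿ instead, or as u'\ufeff' once decoded as UTF-8. The BOM is normally not visible when the CSV is opened with Notepad or Excel; to see it you need an editor that can display the BOM.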
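To check which bytes stand behind these characters, the constants in the standard codecs module can be inspected directly. The variable names below are only for illustration.

```python
import codecs

# The UTF-8 BOM is the three bytes EF BB BF.
utf8_bom = codecs.BOM_UTF8
# The UTF-16 little-endian BOM is the two bytes FF FE.
utf16_le_bom = codecs.BOM_UTF16_LE

# Misreading the raw bytes as Latin-1 produces the familiar debris.
utf8_as_latin1 = utf8_bom.decode("latin-1")       # the characters ï»¿
utf16_as_latin1 = utf16_le_bom.decode("latin-1")  # the characters ÿþ

# Decoded with the plain "utf-8" codec, the BOM survives as U+FEFF.
utf8_decoded = utf8_bom.decode("utf-8")
```

The last line is the reason the key in the question shows up as u'\ufeffa': the plain utf-8 codec keeps the BOM as an ordinary character.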
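In Python, while reading a CSV with the csv module, you can skip the first row by calling the built-in next() on the reader. Note that this skips a whole row; it does not remove a BOM from the header.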
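A small sketch of skipping the first row with next(), in Python 3. The in-memory io.StringIO object and its contents are stand-ins for a real opened file, chosen only to keep the example self-contained.

```python
import csv
import io

# In-memory stand-in for an opened CSV file (hypothetical sample data).
csv_file = io.StringIO("a;b\r\n0;0\r\n1;2\r\n", newline="")

reader = csv.reader(csv_file, delimiter=";")
header = next(reader)     # consume the first row (here, the header)
data_rows = list(reader)  # everything after the first row
```

With csv.DictReader this is usually unnecessary, since the first row is already consumed as the field names.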
You have to tell open that this is UTF-8 with BOM. I know that works with io.open:
import io
...
inputFile = io.open("test.csv", "r", encoding='utf-8-sig')
...
And you have to open the file in text mode, "r" instead of "rb".
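To see the difference the encoding makes, here is a sketch that reads the same BOM-prefixed file with both codecs. It is written for Python 3 (where csv expects text-mode files); the helper header_keys and the temporary file are assumptions for the demonstration, not part of the original code.

```python
import codecs
import csv
import io
import os
import tempfile


def header_keys(path, encoding):
    """Return the field names csv.DictReader sees with the given encoding."""
    with io.open(path, "r", encoding=encoding, newline="") as input_file:
        reader = csv.DictReader(input_file, delimiter=";")
        return reader.fieldnames


# Recreate a file that starts with a UTF-8 BOM, as in the question.
bom_path = os.path.join(tempfile.mkdtemp(), "test.csv")
with open(bom_path, "wb") as output_file:
    output_file.write(codecs.BOM_UTF8 + b"a;b\r\n0;0\r\n")

keys_plain = header_keys(bom_path, "utf-8")    # BOM stays glued to "a"
keys_sig = header_keys(bom_path, "utf-8-sig")  # BOM is stripped on decode
```

Only the utf-8-sig variant yields a key that row["a"] can find.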
In Python 3, the built-in open function is an alias for io.open.
All you need to open a file encoded as UTF-8 with BOM:
open(path, newline='', encoding='utf-8-sig')
import csv
...
with open(path, newline='', encoding='utf-8-sig') as csv_file:
    reader = csv.DictReader(csv_file, dialect='excel')
    for row in reader:
        print(row['first_name'], row['last_name'])
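The same codec also works in the other direction in Python 3: opening the file for writing with encoding='utf-8-sig' emits the BOM, so it does not have to be written by hand. The sketch below reuses the a/b columns from the question; the temporary path and the three-row loop are assumptions for the demonstration.

```python
import codecs
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.csv")

# Writing with utf-8-sig emits the BOM automatically (Python 3).
with open(path, "w", newline="", encoding="utf-8-sig") as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=["a", "b"], delimiter=";")
    writer.writeheader()
    for i in range(3):
        writer.writerow({"a": i, "b": i * 2})

# The raw bytes start with the BOM ...
with open(path, "rb") as raw_file:
    starts_with_bom = raw_file.read().startswith(codecs.BOM_UTF8)

# ... and reading back with utf-8-sig yields clean keys.
with open(path, newline="", encoding="utf-8-sig") as csv_file:
    rows = list(csv.DictReader(csv_file, delimiter=";"))
```

Using the same encoding on both sides keeps the BOM out of the field names.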