I am currently inserting data into my Django models from a CSV file. Below is the simple save function that I am using:
def save(self):
    myfile = file.csv
    data = csv.reader(myfile, delimiter=',', quotechar='"')
    i = 0
    for row in data:
        if i == 0:
            i = i + 1
            continue  # skipping the header row
        b = MyModel()
        b.create_from_csv_row(row)  # calls a method to save in models
The function works perfectly with ASCII characters. However, if the CSV file contains non-ASCII characters, an error is raised: UnicodeDecodeError: 'ascii' codec can't decode byte 0x93 in position 1526: ordinal not in range(128)
My question is: how can I remove non-ASCII characters before saving the data from my CSV file, to avoid this error?
Thanks in advance.
If you really want to strip it, try:
import unicodedata
unicodedata.normalize('NFKD', title).encode('ascii','ignore')
WARNING: THIS WILL MODIFY YOUR DATA. It attempts to find a close match, e.g. ć -> c; characters with no ASCII equivalent are dropped entirely.
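To make the effect of that one-liner concrete, here is a minimal sketch wrapped in a helper. The name to_ascii is made up for illustration, and the sketch assumes Python 3, where the input is already a decoded str (in Python 2 it would have to be a unicode object).

```python
import unicodedata

def to_ascii(text):
    # Hypothetical helper; assumes Python 3, where `text` is a decoded str.
    # NFKD splits accented letters into a base letter plus a combining mark;
    # encoding with 'ignore' then drops everything outside ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

With this helper, to_ascii("ć") gives "c", while characters that have no decomposition, such as curly quotes, simply vanish.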
Perhaps a better answer is to use unicodecsv instead.
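The idea behind that suggestion is to decode the file properly instead of throwing characters away. As a rough sketch of the same idea using only the standard library rather than unicodecsv, and assuming Python 3: the function name read_rows is hypothetical, and the cp1252 default is only a guess based on byte 0x93 (a left curly quote in that code page), so substitute whatever encoding the file really uses.

```python
import csv
import io

def read_rows(binary_file, encoding="cp1252"):
    # Hypothetical helper; assumes Python 3 and a file opened in binary mode.
    # The cp1252 default is a guess, not something stated in the question.
    text = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    reader = csv.reader(text, delimiter=",", quotechar='"')
    next(reader, None)  # skip the header row
    return list(reader)
```

Each row then comes back as a list of decoded strings with the non-ASCII characters intact, so nothing has to be stripped before saving.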
----- EDIT ----- Okay, if you don't care whether the non-ASCII data is preserved at all, try the following:
# If row references a unicode string
b.create_from_csv_row(row.encode('ascii', 'ignore'))
If row is a collection rather than a unicode string, you will need to iterate over the collection down to the string level and re-encode each item.
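For a row that is a list of cells, that iteration could look roughly like the sketch below. It assumes Python 3, where csv.reader yields lists of str; strip_non_ascii is a made-up name, not part of any library.

```python
def strip_non_ascii(row):
    # Hypothetical helper; assumes Python 3 and that `row` is a list of str.
    # Each cell is encoded to ASCII with unencodable characters ignored,
    # then decoded back so the result is text again.
    return [cell.encode("ascii", "ignore").decode("ascii") for cell in row]
```

The cleaned list could then be passed on, for example as b.create_from_csv_row(strip_non_ascii(row)).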
If you want to remove non-ASCII characters from your data, then iterate through your data character by character and keep only the ASCII ones.
for item in data:
    if ord(item) < 128:  # ASCII code points are 0 - 127
        [append,write,print,whatever]
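A compact version of that loop for a single string might look like this. It is only a sketch, assuming Python 3 and that the input is a str, so iterating over it yields one character at a time; keep_ascii is a hypothetical name.

```python
def keep_ascii(text):
    # Hypothetical helper; assumes Python 3 and that `text` is a str.
    # Keeps only characters whose code point is in the ASCII range 0-127.
    return "".join(ch for ch in text if ord(ch) < 128)
```

Unlike the normalize approach, this does not map accented letters to a base letter; they are removed outright.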
If you want to convert unicode characters to ascii, then the response above by DivinusVox is accurate.