
TypeError: encoding or errors without a string argument

I'm trying to write a list of byte strings to a CSV file. Since the items are bytes, I used the code below:

with open(r"E:\Avinash\Python\extracting-drug-data\out.csv", "wb") as w:
    writer = csv.writer(w)
    writer.writerows(bytes(datas, 'UTF-8'))

But it results in the following error:

TypeError: encoding or errors without a string argument

datas is a list of byte strings.

print(datas)

yields

[b'DB08873', b' MOLSDFPDBSMILESInChIView Structure \xc3\x97Structure for DB08873 (Boceprevir) Close', b'394730-60-0', b'LHHCSNFAOIFYRV-DOVBMPENSA-N', b'Organic acids and derivatives  ', b'Food increases exposure of boceprevir by up to 65% relative to fasting state. However, type of food and time of meal does not affect bioavailability of boceprevir and thus can be taken without regards to food.  \r\nTmax = 2 hours;\r\nTime to steady state, three times a day dosing = 1 day;\r\nCmax]

I want the above list to be written as the first row of a CSV file, with the Unicode characters decoded. That is, \xc3\x97 should be converted to its corresponding character.

Avinash Raj asked Jun 08 '15 14:06


2 Answers

It seems your datas already contains bytes objects, so to turn them into strings you have to use str, not bytes. You also have to convert each element of datas individually, not the entire list at once. Finally, if you want to add datas as one row to out.csv, you have to use writerow; writerows writes several rows at once and accordingly expects a list of lists.
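
To make the writerow/writerows difference concrete, here is a minimal sketch that writes to an in-memory buffer instead of a file. The three-item datas list is a shortened, hypothetical stand-in for the list in the question.

```python
import csv
import io

# Hypothetical sample data standing in for the asker's `datas` list.
datas = [b'DB08873', b'394730-60-0', b'\xc3\x97']

# Decode each element individually; bytes(...) goes in the opposite direction.
decoded = [str(d, 'UTF-8') for d in datas]

# writerow: the whole list becomes one CSV row.
one_row = io.StringIO()
csv.writer(one_row).writerow(decoded)

# writerows: expects an iterable of rows, so a flat list of strings is
# treated as many rows, with every character becoming its own field.
many_rows = io.StringIO()
csv.writer(many_rows).writerows(decoded)

print(repr(one_row.getvalue()))
print(repr(many_rows.getvalue()))
```

With writerows, the first string ends up as the row D,B,0,8,8,7,3, which is almost certainly not what was intended.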

Depending on your OS, you might also have to specify the encoding when opening the file. Otherwise Python will use the platform's default encoding, which might be something other than UTF-8.
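
As a rough illustration of why the encoding matters, the sketch below prints the platform default and then writes the × character with an explicit encoding. The temporary file path is only an assumption for the demo, not the path from the question.

```python
import locale
import os
import tempfile

# The encoding open() typically falls back to when none is given
# (platform dependent).
print(locale.getpreferredencoding(False))

text = '\u00d7'  # the multiplication sign from the question's data

# Hypothetical temporary file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'out.csv')

with open(path, 'w', encoding='UTF-8', newline='') as f:
    f.write(text)

# Reading the file back in binary mode shows the UTF-8 bytes from the question.
with open(path, 'rb') as f:
    raw = f.read()
print(raw)

# An encoding that cannot represent the character fails when encoding.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```

If the default encoding on your machine cannot represent a character in your data, opening the file without encoding= can fail in the same way the ASCII case does here.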

The code below seems to do what you want. The result is a CSV file with one row (see note 1) of data in UTF-8 format, and the \xc3\x97 is decoded to ×.

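import csv

# newline='' lets the csv module control line endings itself
with open(r"out.csv", "w", encoding='UTF-8', newline='') as w:
    writer = csv.writer(w)
    # decode each bytes element to str, then write the list as a single row
    writer.writerow([str(d, 'UTF-8') for d in datas])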

1) Note that the last item in datas contains some line breaks, and thus will be split onto several lines. This is probably not what you want. Or is this a glitch in your datas list?
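
For what it's worth, the csv module quotes fields that contain line breaks, so the row spans several physical lines in a text editor but is still read back as a single record. A small sketch, with a made-up two-field row standing in for the real data:

```python
import csv
import io

# Hypothetical row whose last field contains a line break, like the question's data.
row = ['DB08873', 'Tmax = 2 hours;\r\nCmax = 1']

# newline='' mirrors how the csv docs recommend opening real files.
buffer = io.StringIO(newline='')
csv.writer(buffer).writerow(row)
written = buffer.getvalue()

# The field is quoted, so the line break stays inside one record.
print(repr(written))

# Reading it back yields a single row with the line break intact.
buffer.seek(0)
rows = list(csv.reader(buffer))
print(rows)
```

If you would rather have one physical line per row, you could replace the line breaks in each field before writing, but that changes the data.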

tobias_k answered Nov 20 '22 15:11


This error just means the thing you're passing to bytes (the string you want converted to a byte sequence) is not in fact a string. It does not specifically mean that the argument is already of type bytes, just that it isn't a string.

>>> bytes(b"", encoding="utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: encoding without a string argument
>>> bytes(None, encoding="utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: encoding without a string argument
>>> bytes(12, encoding="utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: encoding without a string argument
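
For contrast, here is a short sketch of the conversions that do work in each direction. The variable names are only illustrative.

```python
data = b'\xc3\x97'

# bytes -> str: decode. Both spellings are equivalent.
as_text = data.decode('UTF-8')
assert as_text == str(data, 'UTF-8')

# str -> bytes: encode. Here bytes() accepts an encoding because the
# first argument is a str.
round_trip = bytes(as_text, 'UTF-8')
assert round_trip == as_text.encode('UTF-8') == data

# Passing anything other than a str together with an encoding raises TypeError.
try:
    bytes(data, 'UTF-8')
    raised = False
except TypeError as exc:
    raised = True
    print(exc)
```

The exact wording of the TypeError message varies between Python versions, which is why the message in the question differs slightly from the ones in the session above.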

Jack M answered Nov 20 '22 13:11