Here is a little tmp.py with a non-ASCII character:
if __name__ == "__main__":
    s = 'ß'
    print(s)
Running it I get the following error:
Traceback (most recent call last):
  File ".\tmp.py", line 3, in <module>
    print(s)
  File "C:\Python32\lib\encodings\cp866.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xdf' in position 0: character maps to <undefined>
The Python docs say:
By default, Python source files are treated as encoded in UTF-8...
My way of checking the encoding is to use Firefox (maybe someone can suggest something more obvious). I open tmp.py in Firefox, and if I select View->Character Encoding->Unicode (UTF-8) it looks OK, that is, the way it looks above in this question (with the ß symbol).
If I put:
# -*- encoding: utf-8 -*-
as the first line in tmp.py, it does not change anything; the error persists.
Could someone help me figure out what I am doing wrong?
Use open() to open a file with UTF-8 encoding: call open(file, encoding="utf-8") (the encoding parameter defaults to None) to open the file with UTF-8 encoding.
The best way to attack the problem, as with many things in Python, is to be explicit. That means that every string that your code handles needs to be clearly treated as either Unicode or a byte sequence. The most systematic way to accomplish this is to make your code into a Unicode-only clean room.
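As a rough illustration of the "be explicit" advice above, here is a minimal sketch in which bytes are decoded once on the way in, handled as str in between, and encoded once on the way out. The function name clean_field and the choice of UTF-8 are assumptions made for the example, not something the original text prescribes.

```python
def clean_field(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: decode at the boundary, work on str, encode on exit."""
    text = raw.decode(encoding)      # bytes -> str, exactly once, on input
    cleaned = text.strip()           # all processing happens on str
    return cleaned.encode(encoding)  # str -> bytes, exactly once, on output


# b"\xc3\x9f" is the UTF-8 encoding of 'ß'
print(clean_field(b"  \xc3\x9f \n"))
```

Keeping the decode and encode steps at the edges means the middle of the program never has to guess whether a value is text or raw bytes.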
Use file.write() to write UTF-8 text to a file: in a with statement, call open(file, mode, encoding="utf-8") with mode as "w" to open the file for writing in UTF-8 encoding, then call file.write(data) to write the text contained in data to the opened file. with open("sample.
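The write call described above might look like the following sketch, which also reads the file back to confirm the round trip. The helper names write_utf8 and read_utf8 and the file name are made up for the example; only open() with its encoding parameter comes from the text.

```python
import os
import tempfile


def write_utf8(path, data):
    """Write text to path, encoding it as UTF-8 regardless of the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(data)


def read_utf8(path):
    """Read text from path, decoding it as UTF-8."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Hypothetical file name, placed in a throwaway temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "utf8_demo.txt")
    write_utf8(demo_path, "ß")
    print(ascii(read_utf8(demo_path)))
```

Passing encoding explicitly in both calls avoids depending on whatever default encoding the platform happens to use.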
The encoding your terminal is using doesn't support that character:
>>> '\xdf'.encode('cp866')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/encodings/cp866.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character '\xdf' in position 0: character maps to <undefined>
Python is handling it just fine; it's your output encoding that cannot handle it.
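To see which encoding Python is using for output, and to print without raising when that encoding lacks a character, something like the sketch below may help. The helper name safe_text is made up for this example; sys.stdout.encoding and the "backslashreplace" error handler are standard.

```python
import sys


def safe_text(text, encoding=None):
    """Hypothetical helper: replace characters the target encoding lacks with escapes."""
    # Fall back to ASCII if stdout does not report an encoding.
    encoding = encoding or sys.stdout.encoding or "ascii"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


print(sys.stdout.encoding)        # the encoding print() will use
print(safe_text("ß"))             # never raises UnicodeEncodeError
print(safe_text("ß", "cp866"))    # cp866 has no 'ß', so the escape \xdf is shown
```

This only hides the symptom: the character is shown as an escape sequence rather than as ß, so it is a diagnostic aid rather than a fix.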
You can try using chcp 65001 in the Windows console to switch your code page; chcp is a Windows command-line command that changes code pages.
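If changing the console code page is not an option, one Python-side alternative (not part of the original answer) is to wrap a binary output stream in a text layer that always encodes as UTF-8. A minimal sketch, where the function name utf8_writer is hypothetical and the console must still be able to display UTF-8 for the result to look right:

```python
import io


def utf8_writer(binary_stream):
    """Hypothetical helper: return a text stream that encodes everything as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)


# Demonstrate on an in-memory buffer instead of the real console.
buffer = io.BytesIO()
out = utf8_writer(buffer)
print("ß", file=out)
print(buffer.getvalue())

# Against the real console this would be (assumes sys.stdout exposes .buffer):
#   import sys
#   out = utf8_writer(sys.stdout.buffer)
#   print("ß", file=out)
```

This moves the encoding decision into the script, so output redirected to a file is UTF-8 whatever the console code page is.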
My terminal on OS X (using UTF-8) can handle it just fine:
>>> print('\xdf')
ß