How to set up Python 3 (and cmd.exe) encoding properly?

I'm trying to print a smiley in Python: ☺

It works without any problems in the interactive shell (inside cmd.exe):

Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("☺")
☺

But if I run the same thing from a file, I get this error:

Traceback (most recent call last):
  File "main.py", line 8, in <module>
    print("\u263a")
  File "C:\dev\lang\Python34\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u263a' in position 0: character maps to <undefined>

The Python file is UTF-8 encoded.
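
The error in the traceback can be reproduced without a console. The following is a minimal sketch, assuming Python's bundled cp850 codec (the code page named in the traceback). The variable names are illustrative only, and "replace" and "backslashreplace" are standard codec error handlers, not something the question itself uses.

```python
smiley = "\u263a"  # WHITE SMILING FACE, the character from the question

# cp850 has no byte for U+263A, so a strict encode fails just like print() did.
try:
    smiley.encode("cp850")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Standard error handlers trade the exception for a lossy substitute.
replaced = smiley.encode("cp850", errors="replace")
escaped = smiley.encode("cp850", errors="backslashreplace")

print(strict_ok, replaced, escaped)
```

This only shows why the encode step fails. It does not make the console display the character.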


Update:

Even though there isn't a real answer to my problem yet, the comments under the question are worth reading. I also created a list of all characters that are printable with the default raster font of cmd.exe (tested on Windows 10). To print a character, simply use the chr() function. For example, chr(14) gives you ♫.

0       [space]
1       ☺
2       ☻
3       ♥
4       ♦
5       ♣
6       ♠
7       [nothing]
8       [backspace, removes char before]
9       [tabulator]
10      [newline]
11      ♂
12      ♀
13      [takes part after chr(13) and replaces begin of string with it]
14      ♫
15      ☼
16      ►
17      ◄
18      ↕
19      ‼
20      ¶
21      §
22      ▬
23      ↨
24      ↑
25      ↓
26      →
27      ←
28      ∟
29      ↔
30      ▲
31      ▼
32      [space]
33      !
34      "
35      #
36      $
37      %
38      &
39      '
40      (
41      )
42      *
43      +
44      ,
45      -
46      .
47      /
48      0
49      1
50      2
51      3
52      4
53      5
54      6
55      7
56      8
57      9
58      :
59      ;
60      <
61      =
62      >
63      ?
64      @
65      A
66      B
67      C
68      D
69      E
70      F
71      G
72      H
73      I
74      J
75      K
76      L
77      M
78      N
79      O
80      P
81      Q
82      R
83      S
84      T
85      U
86      V
87      W
88      X
89      Y
90      Z
91      [
92      \
93      ]
94      ^
95      _
96      `
97      a
98      b
99      c
100     d
101     e
102     f
103     g
104     h
105     i
106     j
107     k
108     l
109     m
110     n
111     o
112     p
113     q
114     r
115     s
116     t
117     u
118     v
119     w
120     x
121     y
122     z
123     {
124     |
125     }
126     ~
127     ⌂
160     [space]
161     ¡
162     ¢
163     £
164     ¤
165     ¥
166     ¦
167     §
168     ¨
169     ©
170     ª
171     «
172     ¬
173     ­[shorter -, can't be displayed outside of console]
174     ®
175     ¯
176     °
177     ±
178     ²
179     ³
180     ´
181     µ
182     ¶
183     ·
184     ¸
185     ¹
186     º
187     »
188     ¼
189     ½
190     ¾
191     ¿
192     À
193     Á
194     Â
195     Ã
196     Ä
197     Å
198     Æ
199     Ç
200     È
201     É
202     Ê
203     Ë
204     Ì
205     Í
206     Î
207     Ï
208     Ð
209     Ñ
210     Ò
211     Ó
212     Ô
213     Õ
214     Ö
215     ×
216     Ø
217     Ù
218     Ú
219     Û
220     Ü
221     Ý
222     Þ
223     ß
224     à
225     á
226     â
227     ã
228     ä
229     å
230     æ
231     ç
232     è
233     é
234     ê
235     ë
236     ì
237     í
238     î
239     ï
240     ð
241     ñ
242     ò
243     ó
244     ô
245     õ
246     ö
247     ÷
248     ø
249     ù
250     ú
251     û
252     ü
253     ý
254     þ
255     ÿ
305     ı
402     ƒ
8215    ‗
9472    ─
9474    │
9484    ┌
9488    ┐
9492    └
9496    ┘
9500    ├
9508    ┤
9516    ┬
9524    ┴
9532    ┼
9552    ═
9553    ║
9556    ╔
9559    ╗
9562    ╚
9565    ╝
9568    ╠
9571    ╣
9574    ╦
9577    ╩
9580    ╬
9600    ▀
9604    ▄
9608    █
9617    ░
9618    ▒
9619    ▓
9632    ■
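
The list above records what the console font draws, while whether print() raises is decided by the codec. The following is a rough sketch for checking the codec side, assuming the cp850 code page from the traceback. `can_encode` is a hypothetical helper, not part of any library.

```python
def can_encode(char, encoding="cp850"):
    """Return True if `char` has a byte representation in `encoding`."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# chr(1) is a control character that the raster font happens to draw as a
# smiley; U+263A is the real Unicode smiley, which cp850 cannot represent.
checks = {
    "chr(1)": can_encode(chr(1)),
    "chr(9472)": can_encode(chr(9472)),  # box-drawing horizontal line
    "chr(0x263A)": can_encode("\u263a"),
}
print(checks)
```

A True result only means the codec accepts the character. What the console then renders still depends on its font.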
Daveman asked Aug 20 '15 19:08




1 Answer

When you redirect to a file, Python doesn't know what encoding to use. Redirecting to a file is a shell operation, and Python understands an environment variable that indicates the encoding to use. Set the following environment variable before redirecting to a file:

PYTHONIOENCODING=utf8
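
The effect of the variable can be checked from Python itself. The following is a minimal sketch, assuming a child interpreter is started with PYTHONIOENCODING=utf8 and its output is captured through a pipe instead of a file. `run_with_utf8_io` is a hypothetical helper name. In cmd.exe the variable would typically be set with `set PYTHONIOENCODING=utf8` before running the script.

```python
import os
import subprocess
import sys

def run_with_utf8_io(code):
    """Run `code` in a child Python with PYTHONIOENCODING=utf8.

    Returns the child's raw stdout bytes.
    """
    # Copy the current environment and add the variable from the answer.
    env = dict(os.environ, PYTHONIOENCODING="utf8")
    return subprocess.check_output([sys.executable, "-c", code], env=env)

# The child prints the smiley; its stdout is a pipe, not a console.
raw = run_with_utf8_io("print('\\u263a')")
print(raw.strip() == "\u263a".encode("utf-8"))
```

The captured bytes are the UTF-8 encoding of the smiley, which is what ends up in the file when the output is redirected.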
Mark Tolonen answered Nov 14 '22 23:11
