
Why can't I save a file containing Chinese characters in Python 2.7.11 IDLE?

I just downloaded the latest 64-bit Python 2.7.11 from the official website and installed it on Windows 10. I found that if a new IDLE file contains Chinese characters, such as 你好, I cannot save the file. If I try to save it several times, the new file window crashes and disappears.

I also installed the latest python-3.5.1-amd64.exe, and it does not have this issue.

How can I solve this?

More: An example code snippet from this wiki page: https://zh.wikipedia.org/wiki/%E9%B8%AD%E5%AD%90%E7%B1%BB%E5%9E%8B

If I paste the code here, StackOverflow always warns me: Body cannot contain "I just dow". Why?

Thanks!


More: I found this config option, but it does not help at all: IDLE -> Options -> Configure IDLE -> General -> Default Source Encoding: UTF-8

More: Adding u before the Chinese string literal makes everything work; it is a great approach.

Without the u prefix, the text sometimes comes out corrupted.

Tom Xue asked Sep 25 '22 20:09


2 Answers

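Python 2.x uses ASCII as its default encoding, while Python 3.x uses UTF-8. To convert a Unicode string to UTF-8 bytes, use:
my_string.encode("utf-8")
(or pass any other encoding you need). Note that in Python 2 this only works directly when my_string is a unicode object; calling it on a plain str that contains non-ASCII bytes raises UnicodeDecodeError.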
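
As a rough sketch of what encode does (the variable names are illustrative, not from the question): it turns Unicode text into bytes, and decode reverses that. This is written so it should run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

# With unicode_literals, this is a Unicode string on Python 2 as well.
greeting = "你好"

# encode: Unicode text -> UTF-8 bytes (three bytes per character here)
utf8_bytes = greeting.encode("utf-8")

# decode: UTF-8 bytes -> Unicode text
round_trip = utf8_bytes.decode("utf-8")

print(len(greeting), len(utf8_bytes), round_trip == greeting)
# prints: 2 6 True
```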

You can also try putting this line at the top of your source file (it must be on the first or second line):

# -*- coding: utf-8 -*-
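
For illustration only, here is a small script that combines the declaration with a u-prefixed literal and writes the text to disk with an explicit encoding. The file name hello.txt and the variable names are assumptions made for this sketch; io.open is used because it accepts an encoding argument on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# The declaration above has to sit on the first or second line of the file.
import io
import os
import tempfile

text = u"你好"  # non-ASCII literal, allowed because of the declaration

# Write the text as UTF-8 into a temporary directory (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "hello.txt")
with io.open(path, "w", encoding="utf-8") as handle:
    handle.write(text)

# Read it back, decoding from UTF-8 again.
with io.open(path, "r", encoding="utf-8") as handle:
    read_back = handle.read()

print(read_back == text)
```

Without the encoding argument, the result depends on the platform default, which on Windows is often not UTF-8.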

Tales Pádua answered Oct 11 '22 06:10


In Python 2, a plain string literal is a byte string (str) and ASCII is the default encoding, so on its own it cannot represent Chinese characters. In Python 3, on the other hand, strings are Unicode by default and can store Chinese characters.

But that doesn't mean Python 2 cannot use Unicode strings. You just have to decode your byte strings into Unicode. Here's an example of converting a string to a Unicode string.

>>> plain_text = "Plain text"
>>> plain_text
'Plain text'
>>> utf8_text = unicode(plain_text, "utf-8")
>>> utf8_text
u'Plain text'

The prefix u in the output for utf8_text shows that it is a Unicode string.

You could also do this.

>>> print u"你好"
你好

You just have to prepend your string with u to signify that it is a Unicode string.
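
The u prefix only applies to literals. When text arrives as bytes instead (for example, read from a file), it has to be decoded explicitly. A minimal sketch, assuming the bytes are UTF-8; to_unicode is a hypothetical helper name, not a built-in:

```python
# -*- coding: utf-8 -*-

def to_unicode(value, encoding="utf-8"):
    """Return value as Unicode text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):  # bytes is an alias of str on Python 2
        return value.decode(encoding)
    return value

raw = b"\xe4\xbd\xa0\xe5\xa5\xbd"  # UTF-8 bytes for the two characters 你好

print(to_unicode(raw) == u"你好")       # bytes are decoded
print(to_unicode(u"你好") == u"你好")   # Unicode text passes through unchanged
```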

Sean Francis N. Ballais answered Oct 11 '22 08:10