Chinese and Japanese character support in Python

Tags: python, python-2.5

1 Answers

Please do read the Python Unicode HOWTO; it explains how to process and include non-ASCII text in your Python code.

If you want to include Japanese text literals in your code, you have several options:

Use unicode literals (which create unicode objects instead of byte strings) and write every non-ASCII codepoint as a unicode escape sequence. These take the form \uabcd: a backslash, a u, and 4 hexadecimal digits:
```
ru = u'\u30EB'
```
would be one character, the katakana 'ru' codepoint ('ル').
Use unicode literals, but include the characters directly, in some form of encoding. Your text editor will save files in a given encoding (say, UTF-8); you need to declare that encoding at the top of the source file:
```
# encoding: utf-8

ru = u'ル'
```
where 'ル' is included without using an escape. The default encoding for Python 2 files is ASCII, so by declaring an encoding you make it possible to use Japanese directly.
Use byte string literals, ready encoded. Encode the codepoints by some other means and include them in your byte string literals. If all you are going to do with them is use them in encoded form anyway, this should be fine:
```
ru = '\xeb\x30'  # ru encoded to UTF16 little-endian
```
I encoded 'ル' to UTF-16 little-endian here because that is how NTFS stores filenames on Windows; use whichever encoding the code consuming the bytes expects.
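To see how the three options relate, here is a small sketch that should run on both Python 2.6+ and 3.3+. The variable names are mine, not from the question, and the coding line assumes the file is saved as UTF-8:

```python
# -*- coding: utf-8 -*-
# Illustrative sketch; all variable names are hypothetical.

ru_escaped = u'\u30EB'   # option 1: escape sequence, source stays pure ASCII
ru_literal = u'ル'        # option 2: literal character, relies on the coding line

# Both spellings produce the same one-character unicode object.
same_text = (ru_escaped == ru_literal)

# Option 3: byte strings are what you get after encoding the text.
ru_utf8 = ru_escaped.encode('utf-8')         # three bytes
ru_utf16le = ru_escaped.encode('utf-16-le')  # two bytes

# Decoding with the matching codec recovers the original text.
round_trip = ru_utf16le.decode('utf-16-le')

print(repr(ru_utf8))
print(repr(ru_utf16le))
```

The escape and the literal are just two spellings of the same text; the byte strings differ per encoding, so you have to remember which codec produced them.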

The next problem will be your terminal: the Windows console is notorious for not supporting many character sets out of the box. You probably want to configure it to handle UTF-8 instead. See this question for some details, but you need to run the following command in the console:

chcp 65001

to switch to UTF-8, and you may need to switch to a console font that can handle your codepoints (Lucida perhaps?).
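If you want to check from Python whether the console can take a given character before printing it, something along these lines may help. This is only a sketch: `can_encode` is a helper name I made up, and the fallback to ASCII when no encoding is reported is my assumption about a safe default:

```python
import sys

def can_encode(text, encoding):
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except (UnicodeEncodeError, LookupError):
        # LookupError covers codec names this Python build does not know.
        return False

ru = u'\u30EB'

# sys.stdout.encoding can be None when output is piped or redirected,
# so fall back to ASCII in that case (an assumption, not a rule).
console_encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'

if can_encode(ru, console_encoding):
    print(ru)
else:
    # Print the escaped form instead of raising UnicodeEncodeError.
    print(ru.encode('unicode_escape').decode('ascii'))
```

This only tells you the codec can represent the character; whether the console font has a glyph for it is a separate matter.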

135

answered Oct 15 '22 18:10

Martijn Pieters
