How can I traverse directories named in Japanese in Python?

Question

I'm trying to build a simple helper utility that will look through my projects and find and return the open ones to me via command line. But my calls to os.listdir return gibberish (example: '\x82\xa9\x82\xcc\x96I') whenever the folder or filename is in Japanese, and said gibberish can't be passed to the call again to get into the folder either. i.e. os.listdir('C:\Documents and Settings\\x82\xa9\x82\xcc\x96I') returns an error:

'WindowsError: [Error 3] 指定されたパスが見つかりません。'

Does anybody know how I can get around this? Thanks a lot.

'WindowsError: [Error 3] 指定されたパスが見つかりません。'

Does anybody know how I can get around this? Thanks a lot.

Fred Foo · Accepted Answer

You may need to decode the string into Unicode, then re-encode it in UTF-8 before passing it to os.listdir. It looks like your Japanese string is encoded in shift-JIS:

>>> '\x82\xa9\x82\xcc\x96I'.decode('shift-jis').encode('utf-8')
'\xe3\x81\x8b\xe3\x81\xae\xe8\x9c\x82'
>>> print '\x82\xa9\x82\xcc\x96I'.decode('shift-jis')
かの蜂

Alternatively, make use of the following feature of os.listdir to get Unicode strings out of it in the first place:

On Windows NT/2k/XP and Unix, if path is a Unicode object, the result will be a list of Unicode objects. Undecodable filenames will still be returned as string objects.

So:

os.listdir(ur'C:\Documents and Settings')
# ---------^

Björn Pollex · Answer

You should try to pass in the directory-name as Unicode-literal (u'your/path'). This way, the result is also Unicode (which is probably required to work with Japanese characters).

From the documentation:

On Windows NT/2k/XP and Unix, if path is a Unicode object, the result will be a list of Unicode objects. Undecodable filenames will still be returned as string objects.

How can I traverse directories named in Japanese in Python?

Tags:

python

character-encoding

ios

unicode

internationalization

StormShadow

2 Answers

Fred Foo

Björn Pollex

Recent Activity

Donate For Us

How can I traverse directories named in Japanese in Python?

Tags:

python

character-encoding

ios

unicode

internationalization

StormShadow

2 Answers

Fred Foo

Björn Pollex

Related questions

Recent Activity

Donate For Us