I'm trying to build a simple helper utility that will look through my projects and find and return the open ones to me via the command line. But my calls to os.listdir return gibberish (example: '\x82\xa9\x82\xcc\x96I') whenever the folder or filename is in Japanese, and said gibberish can't be passed to the call again to get into the folder either. For example, os.listdir('C:\Documents and Settings\\x82\xa9\x82\xcc\x96I')
returns an error:
'WindowsError: [Error 3] 指定されたパスが見つかりません。' (in English: "The system cannot find the path specified.")
Does anybody know how I can get around this? Thanks a lot.
You may need to decode the string into Unicode, then re-encode it in UTF-8 before passing it to os.listdir. It looks like your Japanese string is encoded in Shift-JIS:
>>> '\x82\xa9\x82\xcc\x96I'.decode('shift-jis').encode('utf-8')
'\xe3\x81\x8b\xe3\x81\xae\xe8\x9c\x82'
>>> print '\x82\xa9\x82\xcc\x96I'.decode('shift-jis')
かの蜂
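The session above is Python 2. For anyone on Python 3, here is a minimal sketch of the same decoding step, assuming the bytes really are Shift-JIS as guessed above. The helper name decode_name and its default encoding are made up for illustration; on a Japanese Windows system the actual code page may be cp932 rather than plain Shift-JIS.

```python
# Raw bytes copied from the question (assumed to be Shift-JIS encoded).
raw = b'\x82\xa9\x82\xcc\x96I'


def decode_name(data, encoding='shift_jis'):
    """Decode a byte-string filename to text.

    Hypothetical helper: the default encoding is an assumption, not
    something detected from the system.
    """
    return data.decode(encoding)


name = decode_name(raw)
print(name == 'かの蜂')        # compare against the text shown in the answer
print(name.encode('utf-8'))    # UTF-8 bytes for the same text
```

In Python 3 the bytes literal needs the b prefix, and decode() returns a str, so no separate Unicode type is involved.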
Alternatively, make use of the following feature of os.listdir to get Unicode strings out of it in the first place:
On Windows NT/2k/XP and Unix, if path is a Unicode object, the result will be a list of Unicode objects. Undecodable filenames will still be returned as string objects.
So:
os.listdir(ur'C:\Documents and Settings')
# ---------^
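As a rough Python 3 counterpart (the ur'' prefix above only exists in Python 2): there, passing a str path to os.listdir gives back str names, and those names can be joined onto the parent and passed straight back in. The sketch below uses a throwaway temporary directory and a hypothetical helper, list_children, so none of the paths are from the original question.

```python
import os
import tempfile


def list_children(parent):
    """Map each subfolder name in `parent` to the sorted names inside it.

    Hypothetical helper for illustration only.
    """
    result = {}
    for name in os.listdir(parent):          # str path in, str names out
        child = os.path.join(parent, name)   # the returned name is reusable as-is
        if os.path.isdir(child):
            result[name] = sorted(os.listdir(child))
    return result


# Demo tree in a temporary directory (made-up folder and file names).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'かの蜂'))
    open(os.path.join(root, 'かの蜂', 'notes.txt'), 'w').close()
    print(list_children(root)['かの蜂'])
```

The point is the round trip: a name returned by os.listdir is valid input for the next call, which is exactly what failed with the byte-string gibberish in the question.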
You should try passing in the directory name as a Unicode literal (u'your/path'). This way, the result is also Unicode (which is probably required to work with Japanese characters).
From the documentation:
On Windows NT/2k/XP and Unix, if path is a Unicode object, the result will be a list of Unicode objects. Undecodable filenames will still be returned as string objects.