
os.listdir is removing character accent

In Windows File Explorer, create a new text file and name it Ń.txt (note the acute accent over the N).

Hold Shift and right-click the folder where you created Ń.txt, then select "Open command window here" (alternatively, open cmd.exe and cd into the directory where you created the file).

Then run this in a Python terminal:

import os

print os.listdir(".")  # note that the file is displayed as "N.txt"
print map(os.path.exists, os.listdir("."))  # note that the listed file doesn't exist???

I have tried many decodings, but os.listdir is not returning the byte string of the actual filename at all, so decoding the incorrect bytes still gives an incorrect name.

Joran Beasley asked Feb 06 '14 20:02


1 Answer

Pass the path as a Unicode string by putting u before the literal:

>>> print os.listdir(u".")
[u'\u0143.txt']
>>> print map(os.path.exists,os.listdir(u"."))
[True]
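
If you want to check this without touching a real folder, here is a rough sketch: it makes a scratch directory, drops a file named Ń.txt in it, and lists it with a Unicode path. The helper name list_unicode and the UTF-8 fallback are my own assumptions for illustration, not something from the os module; the __future__ import is only there so the same file runs on both Python 2 and 3.

```python
from __future__ import print_function

import os
import shutil
import sys
import tempfile


def list_unicode(path):
    """List `path` using a Unicode argument so names come back as Unicode.

    Hypothetical helper for illustration only.
    """
    if isinstance(path, bytes):
        # Byte-string path (the Python 2 default): decode it first.
        # Falling back to UTF-8 is an assumption, not a guarantee.
        path = path.decode(sys.getfilesystemencoding() or "utf-8")
    return os.listdir(path)


def demo():
    root = tempfile.mkdtemp()
    try:
        if isinstance(root, bytes):
            root = root.decode(sys.getfilesystemencoding() or "utf-8")
        # Create a file whose name contains U+0143 (N with acute).
        open(os.path.join(root, u"\u0143.txt"), "w").close()
        names = list_unicode(root)
        # Each listed name should point back at a real file.
        found = [os.path.exists(os.path.join(root, n)) for n in names]
        return names, found
    finally:
        shutil.rmtree(root)


names, found = demo()
print(names, found)
```

Every name that comes back from the Unicode listing should pass os.path.exists, which is the opposite of what the byte-string call in the question showed.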

From the Python 2 documentation for os.listdir(path):

Changed in version 2.3: On Windows NT/2k/XP and Unix, if path is a Unicode object, the result will be a list of Unicode objects. Undecodable filenames will still be returned as string objects.
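
As for why the byte-string call shows "N.txt": my reading is that a byte-string result has to fit into the system's ANSI code page, and Ń has no slot in a Western European code page such as cp1252, so Windows hands back the closest-looking character instead. The returned "N.txt" then names a file that does not exist, which is why os.path.exists reports False. The sketch below only checks encodability with Python's own codecs; cp1252 is an assumption (the question does not say which code page is active), and Python's "replace" handler inserts "?" rather than reproducing Windows' substitution.

```python
from __future__ import print_function

NAME = u"\u0143.txt"  # the file name from the question


def can_encode(text, codec):
    """Return True if every character of `text` exists in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# cp1252 is an assumed ANSI code page; utf-8 is included for contrast.
for codec in ("cp1252", "utf-8"):
    print(codec, can_encode(NAME, codec))

# A lossy encode makes the information loss visible: the accented
# letter cannot survive the trip into cp1252 bytes.
lossy = NAME.encode("cp1252", "replace")
print(repr(lossy))
```

Once the accent is gone from the bytes, no amount of decoding can bring it back, which matches what the question describes.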

Omid Raha answered Sep 19 '22 00:09