I am on Mac OS X 10.8.2.
When I try to find files whose names contain non-ASCII characters, I get no results, although I know for sure that they exist. Take for example the console input
> find */Bärlauch*
I get no results. But if I try without the umlaut I get
> find */B*rlauch*
images/Bärlauch1.JPG
So the file definitely exists. If I rename the file, replacing 'ä' with 'ae', it is found.
Similarly, the Python module glob is not able to find the file:
>>> glob.glob('*/B*rlauch*')
['images/Bärlauch1.JPG']
>>> glob.glob('*/Bärlauch*')
[]
I figured it must have something to do with the encoding, but my terminal is set to UTF-8 and I am using Python 3.3.0, which uses Unicode strings.
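On HFS+, Mac OS X always stores filenames in decomposed (NFD-style) Unicode form. The 'ä' you type is the single precomposed character U+00E4, while the name on disk contains 'a' followed by the combining diaeresis U+0308, so the two strings do not compare equal. Use unicodedata.normalize('NFD', pattern) to decompose the glob pattern before matching.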
import glob
import unicodedata
# Decompose the pattern so it matches the NFD file names stored on HFS+
glob.glob(unicodedata.normalize('NFD', '*/Bärlauch*'))
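To make the difference between the two forms visible, here is a small sketch. The helper name glob_any_form is made up for illustration and is not part of the glob module; it simply tries the pattern in both NFC and NFD form, which may be handy if you do not know which form a given file system stores.

```python
import glob
import unicodedata


def glob_any_form(pattern):
    """Hypothetical helper: glob with the NFC and the NFD form of the
    pattern and merge the results, keeping each path only once."""
    matches = []
    for form in ("NFC", "NFD"):
        for path in glob.glob(unicodedata.normalize(form, pattern)):
            if path not in matches:
                matches.append(path)
    return matches


# The same visible text in precomposed (NFC) and decomposed (NFD) form.
composed = "B\u00e4rlauch"                           # 'ä' as one code point
decomposed = unicodedata.normalize("NFD", composed)  # 'a' + U+0308

print(len(composed), len(decomposed))        # 8 9
print(composed == decomposed)                # False
print([hex(ord(c)) for c in decomposed[:3]]) # ['0x42', '0x61', '0x308']
```

With that in place, glob_any_form('*/Bärlauch*') should find the file regardless of which form the pattern was typed in.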
Python programs are fundamentally text files. Conventionally, people write them using only characters from the ASCII character set, and thus do not have to think about the encoding they write them in: all character sets agree on how ASCII characters should be decoded.
You have written a Python program using a non-ASCII character. Your program thus comes with an implicit encoding (which you haven't mentioned): to save such a file, you have to decide how you are going to represent a-umlaut on disk. I would guess that perhaps your editor has chosen something non-Unicode for you.
Anyway, there are two ways around such a problem: either you can restrict yourself to using only ASCII characters in the source code of your program, or you can declare to Python that you want it to read the text file with a specific encoding.
To do the former, you should replace the a-umlaut with its escape sequence: \xe4, or equivalently \u00e4, since ä is code point U+00E4 (decimal 228). To do the latter, you should add a coding declaration at the top of the file:
# -*- coding: <your encoding> -*-