UTF-8 filenames return Not Found in Linux terminal

I have a problem in the Linux (Ubuntu) terminal with files that have accents in their names. For example:

$ ls dir/
criação.png 

The listing shows the file, so it exists. Now let's check for it with this simple command:

$ [ -f criação.png ] &&  echo "File Exist" || echo "Not Exist"
Not Exist

As you can see, the result is "Not Exist". I have the same folder and file on OS X, and when I run the same command there it returns this:

$ [ -f criação.png ] &&  echo "File Exist" || echo "Not Exist"
File Exist

I know a little about locale:

$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8" 
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

On Linux it prints "Not Exist", on OS X it prints "File Exist". Does anyone know how to fix that?

Maybe these can help:
http://nedbatchelder.com/blog/201106/filenames_with_accents.html
http://www.ruby-forum.com/topic/279105

UPDATE - Solution

I finally found a solution to the problem. The file names need to be converted from NFD to NFC. This command renames all files under the directory:

cd dir/
convmv -r -i -f utf8 -t utf8 --nfc --notest .

Source: http://blog.hbis.fr/2010/08/30/macox-utf8_filenames_normalization/
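Before running the command above for real, it may be worth previewing what it would rename. As far as I know, convmv only reports the planned renames unless --notest is given, so dropping that flag gives a dry run. The sketch below assumes a scratch directory and a hypothetical NFD-named test file built from explicit bytes; it skips the preview if convmv is not installed.

```shell
#!/bin/sh
# Scratch directory with one NFD-named file (hypothetical test data).
dir=$(mktemp -d)
# "criação.png" in NFD: c + U+0327 (cc a7), a + U+0303 (cc 83)
nfd=$(printf 'criac\314\247a\314\203o.png')
touch "$dir/$nfd"

if command -v convmv >/dev/null 2>&1; then
    # No --notest here: convmv should only list the renames it would perform.
    (cd "$dir" && convmv -r -f utf8 -t utf8 --nfc .)
else
    echo "convmv is not installed; skipping the preview" >&2
fi

# A dry run should leave the original (NFD) name in place.
[ -f "$dir/$nfd" ] && still_nfd=yes || still_nfd=no
echo "NFD name still present after preview: $still_nfd"

rm -rf "$dir"
```

Once the preview looks right, adding --notest back performs the actual rename.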

mateussouzaweb asked Nov 08 '12 17:11


1 Answer

One possible reason is that the file name on disk uses a different Unicode normalization form for characters with combining marks than the one you produce when you type the name. The two names look identical but are different byte sequences. See Unicode equivalence.
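A minimal sketch of how to see this for yourself, assuming a filesystem that compares names byte for byte (as typical Linux filesystems do). Both spellings of the name are built from explicit bytes so the result does not depend on how your terminal encodes the accents; the scratch directory and variable names are only for illustration.

```shell
#!/bin/sh
# Precomposed (NFC): ç = U+00E7 (c3 a7), ã = U+00E3 (c3 a3)
nfc=$(printf 'cria\303\247\303\243o.png')
# Decomposed (NFD): c + U+0327 (cc a7), a + U+0303 (cc 83)
nfd=$(printf 'criac\314\247a\314\203o.png')

# The two names render the same but dump as different bytes.
printf '%s' "$nfc" | od -An -tx1
printf '%s' "$nfd" | od -An -tx1

# Create the file under the NFD spelling only (hypothetical setup).
dir=$(mktemp -d)
touch "$dir/$nfd"

# On a byte-exact filesystem, only the NFD spelling should match.
[ -f "$dir/$nfc" ] && found_nfc=yes || found_nfc=no
[ -f "$dir/$nfd" ] && found_nfd=yes || found_nfd=no
echo "NFC lookup: $found_nfc, NFD lookup: $found_nfd"

rm -rf "$dir"
```

To inspect real files the same way, piping the output of ls through od -An -tx1 shows which byte sequence a name actually uses.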

choroba answered Sep 28 '22 07:09