
How do I determine file encoding in OS X?


Using the -I (that's a capital i) option on the file command seems to show the file encoding.

file -I {filename}

In Mac OS X the command file -I (capital i) will give you the proper character set so long as the file you are testing contains characters outside of the basic ASCII range. A file containing only ASCII characters is reported as us-ascii.

For instance, open Terminal and use vi to create a file (e.g. vi test.txt), insert some characters including an accented one (try ALT-e followed by e), then save the file.

Then type file -I test.txt and you should get a result like this:

test.txt: text/plain; charset=utf-8
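
The ASCII caveat above can be checked without an editor. The following is a minimal sketch, assuming a POSIX shell; it uses the --mime-encoding long option with -b (brief output) rather than -I, since that spelling is also accepted by non-Apple builds of file. The helper name detect_encoding and the sample file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print only the charset that file(1) guesses.
detect_encoding() {
    file -b --mime-encoding "$1"
}

workdir=$(mktemp -d)

# Pure ASCII content: nothing in it distinguishes UTF-8 from Latin-1.
printf 'hello\n' > "$workdir/ascii.txt"

# Same text with an e-acute encoded as UTF-8 (octal bytes 303 251).
printf 'h\303\251llo\n' > "$workdir/accent.txt"

detect_encoding "$workdir/ascii.txt"
detect_encoding "$workdir/accent.txt"

rm -rf "$workdir"
```

With typical builds of file, the first call reports us-ascii and the second utf-8.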


The @ shown after the permissions in ls -l output means that the file has extended file attributes associated with it. You can query them using the getxattr() function, or from the shell with the xattr command.

There's no definite way to detect the encoding of a file. Read this answer, it explains why.

There's a command line tool, enca, that attempts to guess the encoding. You might want to check it out.
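
One way to see why detection can only ever be a guess: a decoder can prove that a file is not in some encoding, but it cannot tell which of the remaining candidates was intended. The sketch below assumes iconv is installed and exits non-zero on invalid input; the function name decodes_as and the candidate list are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the whole file is a valid byte
# sequence in the given encoding (iconv fails on invalid input).
decodes_as() {
    iconv -f "$1" -t UTF-8 "$2" > /dev/null 2>&1
}

sample=$(mktemp)
# Octal bytes 303 251: an e-acute in UTF-8, but also two perfectly
# valid (if unlikely) characters when read as ISO-8859-1.
printf 'caf\303\251\n' > "$sample"

for enc in ASCII UTF-8 ISO-8859-1; do
    if decodes_as "$enc" "$sample"; then
        echo "$enc: valid"
    else
        echo "$enc: invalid"
    fi
done

rm -f "$sample"
```

ASCII is ruled out, but both UTF-8 and ISO-8859-1 accept the same bytes, so a tool has to fall back on heuristics to pick one.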


vim -c 'execute "silent !echo " . &fileencoding | q' {filename}

aliased somewhere in my bash configuration as

alias vic="vim -c 'execute \"silent \!echo \" . &fileencoding | q'"

so I just type

vic {filename}

On my vanilla OSX Yosemite, it yields more precise results than "file -I":

$ file -I pdfs/udocument0.pdf
pdfs/udocument0.pdf: application/pdf; charset=binary
$ vic pdfs/udocument0.pdf
latin1
$
$ file -I pdfs/t0.pdf
pdfs/t0.pdf: application/pdf; charset=us-ascii
$ vic pdfs/t0.pdf
utf-8

You can also convert a file from one encoding to another using the following command:

iconv -f original_charset -t new_charset originalfile > newfile

e.g.

iconv -f utf-16le -t utf-8 file1.txt > file2.txt
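
If the source encoding is stated wrongly, iconv stops partway and the redirect leaves a truncated output file behind. A small wrapper can avoid that; this is only a sketch, where convert_encoding is a hypothetical name and the behaviour relied on is that iconv exits non-zero on invalid input.

```shell
#!/bin/sh
# Hypothetical wrapper around: iconv -f FROM -t TO SRC > DST
# Converts into a temporary file first, so a failed conversion
# does not leave a partial DST behind.
convert_encoding() {
    from=$1 to=$2 src=$3 dst=$4
    tmp=$(mktemp) || return 1
    if iconv -f "$from" -t "$to" "$src" > "$tmp"; then
        mv "$tmp" "$dst"
    else
        rm -f "$tmp"
        echo "conversion from $from failed for $src" >&2
        return 1
    fi
}
```

For the example above this would be called as convert_encoding utf-16le utf-8 file1.txt file2.txt.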

Just use:

file -I <filename>

That's it.


Using the file command with the --mime-encoding option (e.g. file --mime-encoding some_file.txt) instead of the -I option works on OS X and has the added benefit of omitting the MIME type, "text/plain", which you probably don't care about.