
How can I find encoding of a file via a script on Linux?

I need to find the encoding of all files that are placed in a directory. Is there a way to find the encoding used?

The file command is not able to do this.

The encoding I am interested in is ISO 8859-1. If a file uses any other encoding, I want to move it to another directory.

Manglu asked Apr 30 '09 05:04



1 Answer

It sounds like you're looking for enca. It can guess and even convert between encodings. Just look at the man page.

Or, failing that, use file -i (Linux) or file -I (OS X). That will output MIME-type information for the file, which will also include the character-set encoding. I found a man-page for it, too :)
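As a rough sketch of how the file -i approach could be scripted for the directory case in the question: the Python below runs file -i on each file, reads the charset field from its output, and moves anything that is not reported as iso-8859-1. The function names (parse_charset, detect_charset, move_unwanted) and the directory names in the usage note are made up for illustration, and it assumes the Linux spelling of the flag (on OS X it would be -I).

```python
import shutil
import subprocess
from pathlib import Path


def parse_charset(file_output: str) -> str:
    """Extract the charset from one line of `file -i` output.

    Example input: "notes.txt: text/plain; charset=iso-8859-1"
    Returns an empty string when no charset field is present.
    """
    _, sep, charset = file_output.rpartition("charset=")
    return charset.strip().lower() if sep else ""


def detect_charset(path: Path) -> str:
    """Ask `file -i` for its guess at the charset of a single file."""
    result = subprocess.run(
        ["file", "-i", str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_charset(result.stdout)


def move_unwanted(src: Path, dest: Path, wanted: str = "iso-8859-1") -> list[Path]:
    """Move every regular file in src whose guessed charset is not `wanted`.

    Returns the original paths of the files that were moved.
    """
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(src.iterdir()):
        # Subdirectories (including dest, if it lives inside src) are skipped.
        if path.is_file() and detect_charset(path) != wanted:
            shutil.move(str(path), str(dest / path.name))
            moved.append(path)
    return moved
```

It could be called as move_unwanted(Path("incoming"), Path("other-encodings")). Keep in mind that file only guesses from the bytes it sees, so a pure-ASCII file will typically be reported as us-ascii and would be moved by this sketch even though ASCII is a subset of ISO 8859-1; adjust the comparison if that is not what you want.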

Shalom Craimer answered Sep 20 '22 17:09