
How to change encoding in many files?

Tags:

linux

bash

I tried this:

find . -exec iconv -f iso8859-2 -t utf-8 {} \;

but the output goes to the screen instead of back into the same file. How can I convert the files in place?

Nips asked Feb 16 '12 11:02



3 Answers

Try this:

find . -type f -print -exec iconv -f iso8859-2 -t utf-8 -o {}.converted {} \; -exec mv {}.converted {} \;

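It writes each converted file to a temporary file with a '.converted' suffix and then moves that file over the original, so be careful if you already have files ending in '.converted' (you probably don't). The mv runs only when iconv exits successfully, because find evaluates the second -exec only if the first one returned true. Note that the -o option of iconv and the substitution of {} inside {}.converted are not guaranteed by POSIX; both work with GNU iconv and GNU find.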

Filenames containing spaces are handled correctly as written: find passes each {} to the command as a single argument without going through a shell, so writing "{}" instead of {} is harmless but not required.
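
If you would rather not depend on the '.converted' suffix being free, here is a minimal sketch of the same idea using mktemp. The function name convert_tree and its argument order are my own, not part of any tool, and it assumes an iconv that accepts the encoding names you pass in. It writes through a temporary file and copies the result back with cat, so the original file keeps its permissions and owner.

```shell
# convert_tree DIR FROM TO  (hypothetical helper, not a standard command)
# Re-encode every regular file under DIR from encoding FROM to encoding TO.
convert_tree() {
    find "$1" -type f -exec sh -c '
        from=$1; to=$2; shift 2
        # One scratch file per batch, created outside the tree being walked.
        tmp=$(mktemp) || exit 1
        for f do
            # Overwrite the original only if iconv converted the whole file.
            if iconv -f "$from" -t "$to" "$f" > "$tmp"; then
                cat "$tmp" > "$f"
            else
                echo "skipped (conversion failed): $f" >&2
            fi
        done
        rm -f "$tmp"
    ' sh "$2" "$3" {} +
}
```

Usage would look like: convert_tree . ISO-8859-2 UTF-8. Run it only once per tree: nearly any byte sequence is valid ISO-8859-2, so a second run will not fail, it will silently double-encode files that are already UTF-8.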

wobmene answered Oct 19 '22 19:10


Read about enconv (it ships with the enca package).
If you need to convert to the encoding of your current locale, you can do it like this:

find . -type f -exec enconv -L czech {} \;

Or, to get exactly what you asked for (UTF-8):

find . -type f -exec enconv -L czech -x utf8 {} \;

2r2w answered Oct 19 '22 19:10


I found this method worked well for me, especially where I had multiple file encodings and multiple file extensions.

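Create a vim script called script.vim (set bomb makes vim write a byte order mark at the start of each file; leave that line out if you don't want one):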

set bomb
set fileencoding=utf-8
wq

Then run the script on the file extensions you wish to target:

find . -type f \( -iname "*.html" -o -iname "*.htm" -o -iname "*.php" -o -iname "*.css" -o -iname "*.less" -o -iname "*.js" \) -exec vim -S script.vim {} \;
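
When the tree mixes encodings, it can help to see which files still need converting before you touch anything. The following is only a sketch under a few assumptions: is_utf8 and list_non_utf8 are names I made up, the extension list is a shortened version of the one above, and it relies on iconv exiting with a non-zero status on invalid input (GNU iconv does). Filenames containing newlines are not handled.

```shell
# is_utf8 FILE  (hypothetical helper)
# Succeeds if FILE decodes cleanly as UTF-8; plain ASCII files also pass.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# list_non_utf8 DIR  (hypothetical helper)
# Prints the targeted files under DIR that are not valid UTF-8.
list_non_utf8() {
    find "$1" -type f \( -iname '*.html' -o -iname '*.htm' -o -iname '*.php' \
        -o -iname '*.css' -o -iname '*.js' \) |
    while IFS= read -r f; do
        is_utf8 "$f" || printf '%s\n' "$f"
    done
}
```

Calling list_non_utf8 . prints one path per line, which you can review and then feed to whichever conversion method you prefer.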

Sebastien answered Oct 19 '22 17:10