Best way to convert text files between character sets?

Try VIM

If you have Vim, you can use the command below. It has not been tested with every encoding.

The nice part is that you don't have to specify the source encoding: Vim guesses it when it opens the file, based on its 'fileencodings' option.

vim +"set nobomb | set fenc=utf8 | x" filename.txt

Be aware that this command modifies the file in place.

Explanation part!

+ : Used by Vim to run a command directly when opening a file. Usually used to open a file at a specific line: vim +14 file.txt
| : Separator between multiple commands (like ; in bash)
set nobomb : Do not write a UTF-8 BOM
set fenc=utf8 : Set the new file encoding to UTF-8
x : Save and close the file
filename.txt : Path to the file
" : Quotes are needed because of the spaces and pipes (otherwise bash would treat the | characters as shell pipes)
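The same idea (try candidate source encodings in order, then rewrite the file as UTF-8 without a BOM) can be sketched in Python. This is a rough illustration, not what Vim does internally: the function name to_utf8_in_place and the candidate list are assumptions made for this example, and the guess can be wrong for files that happen to decode under more than one encoding.

```python
from pathlib import Path

# Hypothetical candidate list for this sketch. Order matters: "latin-1" can
# decode any byte sequence, so it has to come last as a catch-all.
CANDIDATES = ("utf-8-sig", "cp1252", "latin-1")


def to_utf8_in_place(path, candidates=CANDIDATES):
    """Rewrite `path` as UTF-8 without a BOM; return the encoding that decoded it."""
    raw = Path(path).read_bytes()
    for enc in candidates:
        try:
            # "utf-8-sig" accepts UTF-8 with or without a BOM and drops the BOM.
            text = raw.decode(enc)
        except UnicodeError:
            continue  # this candidate does not fit, try the next one
        # Like the Vim command, this overwrites the original file.
        Path(path).write_bytes(text.encode("utf-8"))
        return enc
    raise ValueError(f"none of {candidates} could decode {path}")
```

Calling to_utf8_in_place("filename.txt") overwrites the file, so keep a copy if the content matters.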

On Linux you can use the very powerful recode command to convert between character sets and to fix line-ending differences. recode -l lists all of the formats and encodings that the tool can convert between. Expect a VERY long list.

iconv(1)

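iconv -f FROM-ENCODING -t TO-ENCODING file.txt

This writes the converted text to standard output, so redirect it to a new file if you want to keep the result. iconv -l lists the encodings it supports.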

Also there are iconv-based tools in many languages.
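As one example of doing the conversion from a programming language, here is a minimal Python sketch. Note that it uses Python's built-in codecs rather than iconv, so the set of supported encoding names differs from iconv -l. The function name convert_file and its parameters are made up for this illustration.

```python
def convert_file(src, dst, from_enc, to_enc, chunk_chars=64 * 1024):
    """Re-encode src (read as from_enc) into dst (written as to_enc)."""
    # newline="" disables newline translation, so line endings pass through
    # unchanged. Bytes or characters that do not fit the given encodings
    # raise UnicodeError instead of being silently replaced.
    with open(src, "r", encoding=from_enc, newline="") as fin, \
         open(dst, "w", encoding=to_enc, newline="") as fout:
        while True:
            chunk = fin.read(chunk_chars)  # read a bounded number of characters
            if not chunk:
                break
            fout.write(chunk)
```

For instance, convert_file("file.txt", "file.utf8.txt", "iso-8859-1", "utf-8") plays roughly the same role as the iconv command above with its output redirected to a new file.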

