I have a long text file which apparently uses different encodings (ISO or UTF-8) in successive blocks of text. It is the result of appending text using >> file.bib and of copying and pasting from different sources (web pages).
The blocks can in principle be distinguished, as they are BibTeX entries:
@article{key, author={lastname, firstname}, ...}
I would like to convert it to a coherent UTF-8 file, since it seems to crash my BibTeX viewer (KBibTeX). I know that I can use iconv to convert the encoding of entire files, but I would like to know if there is a way to fix my file without corrupting some of the entries.
Go to "File" -> "Options" -> "Advanced" and scroll down until the "General" section is reached. In the "General" section, check the box that says "Confirm file format conversion on open." Exit Word, and reopen the corrupt document again. The dialogue box will appear.
If you can assume uniform encoding for each line AND you know the alternate encoding:
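#!/usr/bin/perl
use strict;
use warnings;
use Encode;

binmode(STDOUT, ':encoding(UTF-8)');    # write the result as UTF-8
while (my $raw = <>) {
    # Try strict UTF-8 first. FB_CROAK makes decode die on invalid bytes;
    # LEAVE_SRC keeps $raw intact so the fallback can still use it.
    my $line = eval {
        Encode::decode('UTF-8', $raw, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    # Not valid UTF-8: assume the alternate encoding.
    $line = Encode::decode('iso-8859-1', $raw) unless defined $line;
    # Now $line is Unicode. Do something with it; here it is simply printed.
    print $line;
}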
If individual lines mix encodings, you can apply the same test word by word, as long as you still know what the alternate encoding is. If you do not know the alternate encoding, or if there is more than one, you need an encoding-guessing library, which may well guess wrong.
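The word-by-word variant could look roughly like the sketch below. It is written in Python rather than Perl, and the function name decode_mixed and its fallback parameter are made up for illustration. It assumes the only two encodings present are UTF-8 and ISO-8859-1; a Latin-1 word whose bytes happen to form valid UTF-8 would still be misread.

```python
import re


def decode_mixed(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode bytes token by token: strict UTF-8 first, then the fallback."""
    out = []
    # In a bytes pattern, \s matches ASCII whitespace only, so a multi-byte
    # UTF-8 sequence is never cut in half. The capturing group keeps the
    # separators, so the original layout survives.
    for token in re.split(rb"(\s+)", raw):
        try:
            out.append(token.decode("utf-8"))
        except UnicodeDecodeError:
            # Not valid UTF-8: assume the (hypothetical) alternate encoding.
            out.append(token.decode(fallback))
    return "".join(out)
```

Feeding it the raw bytes of one line (read with the file opened in binary mode) returns a str that can then be written out as UTF-8.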
I use vim for this, but I guess it can be done in any editor.
Select (Shift+v) the block of text whose encoding you want to change.
Type :!enca -L lang - (replace 'lang' with your language; I use 'enca -L cs'). The enca utility then reports the most probable encoding of the selected block.
Press u to undo, since enca's answer has replaced the selected text.
Select the block again, this time running :!iconv -f determined_encoding -t UTF-8
Note that vim automatically expands a typed : to :'<,'> when you're in visual mode, which is exactly what you want for running programs on text blocks.
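For a long file, the same block-by-block idea can be scripted, using the fact from the question that every block starts with a line beginning with @. The Python sketch below is only an illustration under stated assumptions: it replaces enca's guess with a plain "strict UTF-8, else ISO-8859-1" test, and the names split_entries and convert_entries are hypothetical. It also reports which encoding each entry was taken to be, so suspicious entries can be checked by hand.

```python
import re


def split_entries(raw: bytes) -> list[bytes]:
    """Split at the start of every line that begins with '@'."""
    # Zero-width split (Python 3.7+): nothing is removed, so joining the
    # pieces gives back the original bytes.
    return re.split(rb"(?m)^(?=@)", raw)


def convert_entries(raw: bytes, fallback: str = "iso-8859-1") -> list[tuple[str, str]]:
    """Return (assumed_encoding, decoded_text) for each entry."""
    result = []
    for block in split_entries(raw):
        try:
            result.append(("utf-8", block.decode("utf-8")))
        except UnicodeDecodeError:
            # Assumption: anything that is not valid UTF-8 is ISO-8859-1.
            result.append((fallback, block.decode(fallback)))
    return result
```

Reading file.bib in binary mode, passing the bytes to convert_entries, and writing the joined texts to a new file with encoding="utf-8" leaves the original untouched in case an entry was guessed wrong.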