I have a bunch of .txt's that Notepad++ says (in its drop-down "Encoding" menu) are "ANSI".
They have German characters in them, [äöüß], which display fine in Notepad++.
But they don't show up right in irb when I File.read 'this is a German text example.txt' them.
So does anyone know what argument I should give Encoding.default_external=?
(I'm assuming that'd be the solution, right?)
When it's set to 'utf-8' or 'cp850', it reads the "ANSI" file with "äöüß" in it as "\xE4\xF6\xFC\xDF"...
(Please don't hesitate to mention apparently "obvious" things in your answers; I'm pretty much as newbish as you can be and still know just enough to ask this question.)
ANSI and UTF-8 are two types of text encoding. The former is the default encoding that is used when you save text files created in Notepad, the text editor included in the Windows operating system.
ANSI encoding is a slightly generic term used to refer to the standard code page on a system, usually Windows. It is more properly referred to as Windows-1252 on Western/U.S. systems. (It can represent certain other Windows code pages on other systems.)
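If the files really are Windows-1252 (an assumption; Notepad++ only says "ANSI"), one way to read them is to name that encoding explicitly and ask Ruby to transcode to UTF-8. This is a minimal sketch: the temp file stands in for one of the real .txt files, and its name is made up for the example.

```ruby
require "tempfile"

# Stand-in for an "ANSI" .txt file: the four raw bytes that
# Windows-1252 uses for the characters ä, ö, ü and ß.
file = Tempfile.new(["german", ".txt"])
file.binmode
file.write("\xE4\xF6\xFC\xDF".b)
file.close

# "Windows-1252:UTF-8" means: the bytes on disk are Windows-1252,
# and the string handed back should be transcoded to UTF-8.
text = File.read(file.path, encoding: "Windows-1252:UTF-8")

puts text            # äöüß
puts text.encoding   # UTF-8

file.unlink
```

Setting Encoding.default_external = 'Windows-1252' changes the default for files opened afterwards, but passing the encoding per call keeps the choice visible next to the file it applies to.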
What they mean is probably ISO/IEC 8859-1 (aka Latin-1), ISO-8859-1, ISO/IEC 8859-15 (aka Latin-9) or Windows-1252 (aka CP 1252). All 4 of them have the ä at position 0xE4.
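A quick way to check this in irb is to take the raw bytes from the question and decode them under each candidate encoding. This is only a sketch: Ruby has a single ISO-8859-1 encoding covering both Latin-1 names, so three encoding names are tried here rather than four.

```ruby
# The bytes irb showed in the question: "\xE4\xF6\xFC\xDF".
bytes = "\xE4\xF6\xFC\xDF".b

names = %w[ISO-8859-1 ISO-8859-15 Windows-1252]

# Relabel the same bytes with each encoding, then transcode to UTF-8.
decoded = names.map do |name|
  [name, bytes.dup.force_encoding(name).encode("UTF-8")]
end.to_h

decoded.each { |name, text| puts "#{name}: #{text}" }
```

All three decode these four bytes to äöüß, so for this text any of them works. They differ at other byte values, so the choice still matters for characters outside this set.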