I received files which, sadly, I cannot get info about how they were generated. I need to parse these files.
The file is entirely ASCII besides for one character: 0xDB (in decimal it gives 219).
Obviously (from looking at the file) this character is a currency symbol. I know it because:
I think that in these files that 0xDB is supposed to represent the Euro symbol (it is actually very highly probable that this 0xDB appears everywhere a Euro symbol is supposed to appear).
The file command says this about the files:
ISO-8859 English text, with CRLF, LF line terminators
An hexdump gives this:
00000030 71 75 61 6e 74 20 db 32 2e 36 30 0a 20 41 49 4d |quant .2.60. AIM|
^^ ^
The files are all otherwise normally formatted/parsable. Actually I'm getting all the infos fine besides for that weird 0xDB character.
Does anyone know what's going on? How did a currency symbol (supposedly the euro symbol) somehow become a 0xDB?
It's neither ISO-8859-1 (aka ISO Latin 1) nor ISO-8859-15 because in both case code point 219 corresponds to 'Û' (just as Unicode codepoint 219 is 'LATIN CAPITAL LETTER U WITH CIRCUMFLEX').
It's not extended-ASCII.
It could be Mac OS Roman
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With