
In which encoding is 0xDB a currency symbol?

I received some files and, sadly, I cannot find out how they were generated. I need to parse these files.

The files are entirely ASCII except for one character: 0xDB (219 in decimal).

Obviously (from looking at the file) this character is a currency symbol. I know it because:

  • it is mandatory for these files to contain a currency symbol anywhere an amount appears
  • there's no other currency symbol (neither $ nor the euro sign nor anything else) anywhere in the files
  • every time 0xDB appears, it's next to an amount

I think that in these files 0xDB is supposed to represent the euro symbol (it is very probable that 0xDB appears everywhere a euro symbol is supposed to appear).

The file command says this about the files:

ISO-8859 English text, with CRLF, LF line terminators

A hexdump gives this:

00000030  71 75 61 6e 74 20 db 32  2e 36 30 0a 20 41 49 4d  |quant .2.60. AIM|
                            ^^                                     ^

The files are otherwise normally formatted and parsable. I'm extracting all the information fine except for that weird 0xDB character.

Does anyone know what's going on? How did a currency symbol (supposedly the euro symbol) somehow become a 0xDB?

It's neither ISO-8859-1 (aka ISO Latin 1) nor ISO-8859-15, because in both cases code point 219 corresponds to 'Û' (just as Unicode code point 219 is 'LATIN CAPITAL LETTER U WITH CIRCUMFLEX').

It's not extended-ASCII.
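
The ISO-8859 observation above can be double-checked with a few lines of Python. This is only a minimal sketch: the codec names `latin_1` and `iso8859_15` are Python's standard aliases for ISO-8859-1 and ISO-8859-15, and the constant name `BYTE` is just an illustrative choice.

```python
import unicodedata

# The single non-ASCII byte found in the files.
BYTE = b"\xdb"

# Decode the byte with both ISO-8859 variants and show the Unicode name
# of the resulting character.
for codec in ("latin_1", "iso8859_15"):
    char = BYTE.decode(codec)
    print(codec, repr(char), unicodedata.name(char))
```

Both codecs decode the byte to 'Û', so neither explains a currency symbol at that position.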

NoozNooz42 asked Dec 10 '22 11:12


1 Answer

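It could be Mac OS Roman. In that encoding 0xDB is the euro sign (€); before Mac OS 8.5 the same position held the generic currency sign (¤).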
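
One way to test this guess is to decode 0xDB with a handful of single-byte codecs and keep those that yield a character in the Unicode "currency symbol" category (`Sc`). The sketch below does that in Python. The shortlist `CANDIDATES` and the helper `currency_codecs` are hypothetical names chosen for illustration, not an exhaustive search; the sample bytes are copied from the hexdump in the question.

```python
import unicodedata

# Bytes copied from the hexdump in the question (offset 0x30).
SAMPLE = bytes.fromhex("71 75 61 6e 74 20 db 32 2e 36 30 0a 20 41 49 4d")

# Hypothetical shortlist of single-byte codecs to try; extend as needed.
CANDIDATES = ["latin_1", "iso8859_15", "cp1252", "cp437", "mac_roman"]


def currency_codecs(byte, codecs):
    """Return (codec, char) pairs where `byte` decodes to a currency symbol."""
    hits = []
    for name in codecs:
        try:
            char = byte.decode(name)
        except UnicodeDecodeError:
            continue  # this codec leaves the byte undefined
        if unicodedata.category(char) == "Sc":  # Sc = Symbol, currency
            hits.append((name, char))
    return hits


hits = currency_codecs(b"\xdb", CANDIDATES)
print(hits)     # [('mac_roman', '€')]

decoded = SAMPLE.decode("mac_roman")
print(decoded)  # 'quant €2.60' followed by a newline and ' AIM'
```

If that holds for the real files, opening them with `open(path, encoding="mac_roman")` should be enough to parse the amounts, since the rest of the content is plain ASCII.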

Jeff Ames answered Dec 28 '22 07:12