In Ruby 1.9.3-p484 I have to construct an SMPP packet, but when I pass the constructed packet's content as a string to the method that delivers it, a strange \xC2 byte appears in the content. Having investigated the issue, I found the following gotcha:
"\u008E".force_encoding("BINARY")
=> "\xC2\x8E"
Why does \u008E become \xC2\x8E when I want to use binary encoding? Why not \x00\x8E?
Because force_encoding only relabels the string as binary. It does not convert anything, so you see the bytes exactly as they are stored in memory. A string written with a \u escape is stored in a multi-byte encoding (UTF-8), where every character above \x7F takes at least two bytes. So you can see:
"\u008E".force_encoding("BINARY")
# => "\xC2\x8E"
This is the binary (byte-level) representation. Take a look:
At Tue, 27 Jul 2010 22:21:31 +0900, Heesob Park wrote in :
I noticed String#inspect results \x{XXXX} for the encoding other than Unicode.
Is there any possibility that \x{XXXX} is accepted as an escape sequence of string?
irb(main):004:0> a = "\xC7\xD1\xB1\xDB"
This is the binary representation.
irb(main):010:0> a1 => "\x{B1DB}"
https://bugs.ruby-lang.org/issues/3619
This is the code point representation.
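To make the difference concrete, here is a minimal sketch of how the bytes behave and how a single \x8E byte could be produced instead. The variable names are only illustrative, and whether a raw single byte is what your SMPP field actually needs is an assumption on my part:

```ruby
# A "\u" escape always yields a UTF-8 string, so U+008E is stored as two bytes.
utf8 = "\u008E"
utf8_bytes = utf8.bytes.to_a                  # [0xC2, 0x8E]

# force_encoding only changes the encoding label; the bytes stay the same.
relabeled = utf8.dup.force_encoding("BINARY")
relabeled_bytes = relabeled.bytes.to_a        # still [0xC2, 0x8E]

# To get the single byte 0x8E, build it as a byte from the start.
# pack("C") writes one unsigned 8-bit value into a binary string.
from_pack = [0x8E].pack("C")

# Alternatively, read the code points and write each one as a single byte.
# This only makes sense when every code point is in the range 0..255.
from_codepoints = utf8.unpack("U*").pack("C*")

puts utf8_bytes.map { |b| "%02X" % b }.join(" ")
puts from_pack.bytes.map { |b| "%02X" % b }.join(" ")
```

For packet construction it is usually safer to assemble the whole payload with Array#pack from numeric byte values, so that no \u escape ever introduces UTF-8 bytes.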