Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Binary string literals in Ruby 2.0

Tags:

ruby

encoding

When upgrading to Ruby 2.0, a test case started to fail:

expected = "\xD1\x9B\x86"
assert_equal expected, actual

with the following message:

<"ћ\x86"> expected but was
<"\xD1\x9B\x86">.

The actual variable contains a binary string obtained from an external library call.

The problem is that the default encoding of source files (and therefore string literals) changed in Ruby 2.0 from US-ASCII to UTF-8.

like image 659
robinst Avatar asked Apr 05 '13 21:04

robinst


People also ask

What is a string literal in Ruby?

String literals are simply strings that are created using Ruby's built-in string syntax. This includes those created using double quotes (“), single quotes ('), or the special forms %Q and %q. Traditionally, these forms have all created strings that can be modified.

What are symbols in Ruby?

Ruby symbols are defined as “scalar value objects used as identifiers, mapping immutable strings to fixed internal values.” Essentially what this means is that symbols are immutable strings. In programming, an immutable object is something that cannot be changed.


1 Answers

The solution is to change the definition of the string literal to enforce its encoding. There are a few possible options to do this:

Use Array#pack (all versions of Ruby):

expected = ["d19b86"].pack('H*')

Use String#b (Ruby >= 2.0 only):

expected = "\xD1\x9B\x86".b

Use String#force_encoding (Ruby >= 1.9 only):

expected = "\xD1\x9B\x86".force_encoding("ASCII-8BIT")
like image 93
robinst Avatar answered Sep 18 '22 23:09

robinst