
Difference between Encoding::BINARY and Encoding::ASCII-8BIT?

Tags: ruby, encoding

Ruby says that Encoding::BINARY and Encoding::ASCII_8BIT are the same.

Encoding::BINARY == Encoding::ASCII_8BIT
#=> true

Even when we explicitly create a binary string, Ruby still reports its encoding as ASCII-8BIT.

String.new("ABC", encoding: Encoding::BINARY).encoding
#=> #<Encoding:ASCII-8BIT>

Likewise, force_encoding cannot produce a string labelled BINARY; the result is always reported as ASCII-8BIT.
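A minimal sketch of the force_encoding case. The helper name binary_copy is hypothetical and used only for illustration; it compares encodings by identity rather than by their printed form, since the inspect output can differ between Ruby versions.

```ruby
# Hypothetical helper: tag a mutable copy of a string as BINARY.
def binary_copy(text)
  # dup returns an unfrozen copy, so this also works with frozen string literals
  text.dup.force_encoding(Encoding::BINARY)
end

str = binary_copy("ABC")
puts str.encoding.equal?(Encoding::ASCII_8BIT)  # prints "true"
puts str.encoding.equal?(Encoding::BINARY)      # prints "true"
```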

It seems that BINARY is simply an alias for ASCII-8BIT. Are there any differences?

Jared Beck asked Jan 15 '18 01:01





1 Answer

Your observation is correct: BINARY is an alias for ASCII-8BIT. Being an alias means there are no differences; it is just another name for the same encoding.

Looking at the source code is the most reliable way to confirm this. CRuby's character encodings can be found in the enc directory. The ASCII-8BIT encoding is defined in ascii.c, which contains the following line (line 61 in Ruby 2.5.0):

ENC_ALIAS("BINARY", "ASCII-8BIT")

ENC_ALIAS works like Ruby's alias keyword: its arguments are given in the order (alias, original name).
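For comparison, here is a small sketch of the argument order of Ruby's alias keyword. The Greeter class and its method names are hypothetical and serve only as an illustration.

```ruby
# Hypothetical class, used only to show the argument order of `alias`.
class Greeter
  def hello
    "hello"
  end

  # alias <new name> <original name>
  alias hi hello
end

greeter = Greeter.new
puts greeter.hi                                  # prints "hello"
puts Greeter.instance_method(:hi).original_name  # prints "hello"
```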

Whether BINARY, or any other encoding name, is an alias can also be confirmed in pure Ruby. One possibility is calling the Encoding.aliases method, which returns a hash (alias => original):

Encoding.aliases['BINARY']  # => "ASCII-8BIT"

Other useful methods are Encoding#name, which returns the original name, and Encoding#names, which returns the original name together with all aliases:

Encoding::BINARY.names    # => ["ASCII-8BIT", "BINARY"]
Encoding::US_ASCII.names  # => ["US-ASCII", "ASCII", "ANSI_X3.4-1968", "646"]
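The two approaches can be combined. The sketch below filters the hash from Encoding.aliases to list the aliases of a single encoding; aliases_of is a hypothetical helper name, not part of Ruby. It checks for membership rather than an exact list, because the set of registered aliases can vary with the Ruby version and environment.

```ruby
# Hypothetical helper: collect every alias registered for one encoding
# by filtering the alias => original hash returned by Encoding.aliases.
def aliases_of(encoding)
  Encoding.aliases.select { |_alias_name, original| original == encoding.name }.keys
end

puts aliases_of(Encoding::ASCII_8BIT).include?("BINARY")  # prints "true"
puts aliases_of(Encoding::US_ASCII).include?("ASCII")     # prints "true"
```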

Or a way without any Encoding methods:

Encoding::BINARY.equal?(Encoding::ASCII_8BIT)  # => true

Because the == method is often overridden and may return true even when its operands are two different objects, BasicObject#equal? should be used to check whether they are the same object. For example, 1 and 1.0 have the same value (== returns true) but not the same object identity (equal? returns false).
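A short sketch contrasting the two checks, using only core methods:

```ruby
# Value equality versus object identity.
puts 1 == 1.0       # prints "true": same numeric value
puts 1.equal?(1.0)  # prints "false": an Integer and a Float are different objects

# Both constants refer to one and the same Encoding object.
puts Encoding::BINARY.equal?(Encoding::ASCII_8BIT)                 # prints "true"
puts Encoding::BINARY.object_id == Encoding::ASCII_8BIT.object_id  # prints "true"
```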

cremno answered Sep 20 '22 15:09
