Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why Degree symbol differs from UTF-8 from unicode?

Tags:

unicode

utf-8

Why Degree symbol differs from UTF-8 from unicode?

According to: http://www.utf8-chartable.de/ and http://www.fileformat.info/info/unicode/char/b0/index.htm

unicode is B0 but UTF-8 is C2 B0 How come!??

like image 934
Muhammad Hewedy Avatar asked Jan 04 '12 18:01

Muhammad Hewedy


3 Answers

Unicode (UTF-16 and UTF-32) uses the code point 0x00B0 for that character. UTF-8 doesn't allow characters at values above 127 (0x007F), as the high bit of each byte is reserved to indicate that this particular character is actually a multi-byte one.

Basic 7-bit ASCII maps directly to the first 128 characters of UTF-8. Any characters whose values are above 127 decimal (7F hex) must be "escaped" by setting the high bit and adding 1 or more extra bytes to describe.

like image 25
Marc B Avatar answered Sep 21 '22 14:09

Marc B


UTF-8 is a way to encode UTF characters using variable number of bytes (the number of bytes depends on the code point).

Code points between U+0080 and U+07FF use the following 2-byte encoding:

110xxxxx 10xxxxxx

where x represent the bits of the code point being encoded.

Let's consider U+00B0. In binary, 0xB0 is 10110000. If one substitutes the bits into the above template, one gets:

 11000010 10110000

In hex, this is 0xC2 0xB0.

like image 125
NPE Avatar answered Sep 19 '22 14:09

NPE


UTF-8 is one encoding of Unicode. UTF-16 and UTF-32 are other encodings of Unicode.

Unicode defines a numeric value for each character; the degree symbol happens to be 0xB0, or 176 in decimal. Unicode does not define how those numeric values are represented.

UTF-8 encodes the value 0xB0 as two consecutive octets (bytes) with values 0xC2 0xB0.

UTF-16 encodes the same value either as 0x00 0xB0 or as 0xBo 0x00, depending on endianness.

UTF-32 encodes it as 0x00 0x00 0x00 0xB0 or as 0xB0 0x00 0x00 0x00, again depending on endianness (I suppose other orderings are possible).

like image 33
Keith Thompson Avatar answered Sep 20 '22 14:09

Keith Thompson