
Bits of a Character in Java

Tags: java, io, utf

As far as I know, a is an 8-bit character and â is a 16-bit character:

  1. How can I tell whether a character is 8 bits, 16 bits, or more?

  2. Why can't the character â be represented in 8 bits?

  3. a and â are just how the characters are displayed; what do they look like as bits?

  4. 97 is the code of a. How is this number calculated, or is it just the ordinal number of the character?

Hoang Nguyen asked May 27 '26 09:05


1 Answer

As far as I know, 'a' is an 8-bit character and 'â' is a 16-bit character.

Not really. Java's char is an unsigned 16-bit type, so both 'a' and 'â' are 16-bit characters. It is true that the top 8 bits of 'a' are zero, but those bits are there nevertheless. The same goes for 'â' (see below).

How can I tell whether a character is 8 bits, 16 bits, or more?

Compare ch & 0xFF00 to zero. If the result is zero, the upper 8 bits are all zeros; otherwise, at least one of those eight bits is set.
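As a minimal sketch of the mask test above: the class and method names here are hypothetical, made up for illustration. The second method uses the standard Character.charCount to cover the "or more" part of the question, since a code point above U+FFFF does not fit in one char and takes two of them (a surrogate pair).

```java
// Hypothetical helper class, for illustration only.
class CharWidth {
    // True when the upper 8 bits of the 16-bit char are all zero.
    static boolean fitsInEightBits(char ch) {
        return (ch & 0xFF00) == 0;
    }

    // Number of 16-bit chars Java needs to hold a code point: 1 or 2.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(fitsInEightBits('a'));      // true
        System.out.println(fitsInEightBits('\u00E2')); // true: a with circumflex, code 0xE2
        System.out.println(fitsInEightBits('\u20AC')); // false: euro sign, code 0x20AC
        System.out.println(charsNeeded(0x1F600));      // 2: needs a surrogate pair
    }
}
```

The characters are written as \u escapes so that the result does not depend on the encoding of the source file.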

Why can't the character 'â' be represented in 8 bits?

It can be represented in 8 bits: the code of 'â' is 0xE2, or 226. It fits in 8 bits, but it does not fit in 7 bits. Here is a convenient table for looking up character codes.
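One possible source of the "16 bits" impression, and this is only a guess about the question, is the encoding used when the text is written out. The sketch below (class and method names are hypothetical) counts the bytes that 'â' occupies in two standard charsets: one byte in ISO-8859-1, where the byte is simply 0xE2, and two bytes in UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class EncodedLength {
    // Number of bytes the string occupies in ISO-8859-1 (Latin-1).
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Number of bytes the same string occupies in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String aCircumflex = "\u00E2"; // a with circumflex
        int code = aCircumflex.charAt(0);
        System.out.println(code);                      // 226
        System.out.println(code <= 0xFF);              // true: fits in 8 bits
        System.out.println(code <= 0x7F);              // false: does not fit in 7 bits
        System.out.println(latin1Length(aCircumflex)); // 1
        System.out.println(utf8Length(aCircumflex));   // 2
    }
}
```

In memory, a Java char is 16 bits in either case; the byte counts only describe the encoded form.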

'a' and 'â' are just how the characters are displayed; what do they look like as bits?

Since char is an integral type, you can convert it to int and print it in binary, decimal, hex, or any other radix to see the bit pattern behind the character.
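For instance, a small sketch using the standard Integer.toBinaryString and Integer.toHexString methods. The class name and the toBinary16 helper are hypothetical; the helper only left-pads the binary string with zeros so that all 16 bits of the char are visible.

```java
// Hypothetical demo class, for illustration only.
class CharBitsDemo {
    // Formats the char's code as a zero-padded, 16-digit binary string.
    static String toBinary16(char ch) {
        String bits = Integer.toBinaryString(ch);
        return String.format("%16s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        char a = 'a';
        System.out.println((int) a);                // 97
        System.out.println(Integer.toHexString(a)); // 61
        System.out.println(toBinary16(a));          // 0000000001100001
        System.out.println(toBinary16('\u00E2'));   // 0000000011100010 (a with circumflex)
    }
}
```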

97 is the code of 'a'. How is this number calculated, or is it just the ordinal number of the character?

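97 is simply the code assigned to 'a' in Unicode (and in ASCII); it is not computed from anything. To obtain the code of a character, cast it to an int:

int a = (int) 'a'; // a is now 97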
Sergey Kalinichenko answered May 30 '26 10:05