I was looking at the Javadoc for the CharSequence interface, implemented by String, StringBuffer and a few others, more specifically at the chars() method, and the Javadoc says
Returns a stream of int zero-extending the char values from this sequence.
Now, I know that it returns int values, and that an int can be cast to char. But what does "zero-extending" mean?
In a move or convert operation, zero extension refers to setting the high bits of the destination to zero, rather than setting them to a copy of the most significant bit of the source.
Sign extension is used for signed loads of bytes (8 bits, using the lb instruction) and halfwords (16 bits, using the lh instruction). Sign extension replicates the most significant bit loaded into the remaining bits. Zero extension is used for unsigned loads of bytes (lbu) and halfwords (lhu).
2) ANDI, ORI, and XORI all use zero extension. It clearly mentioned the 16-bit immediate is sign-extended.
Int is a 32-bit value, char is a 16-bit value. Zero-extend just means that the higher-order "unused" bits in the int are zeroes.
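To make this concrete, here is a minimal sketch of what zero extension looks like for a char in Java. The class name ZeroExtendDemo and the helper widen are made up for illustration; the only library call is CharSequence.chars(), the method from the question.

```java
class ZeroExtendDemo {
    // Hypothetical helper: the implicit char -> int widening conversion.
    // Java fills the upper 16 bits of the int with zeroes.
    static int widen(char c) {
        return c;
    }

    public static void main(String[] args) {
        char c = '\uFFFF'; // all 16 bits set

        // The upper 16 bits are zero, so the value stays non-negative.
        System.out.println(widen(c));                      // 65535
        System.out.println(Integer.toHexString(widen(c))); // ffff

        // chars() yields the same zero-extended values for each char.
        "A\uFFFF".chars()
                 .forEach(v -> System.out.println(Integer.toHexString(v)));
        // prints 41, then ffff
    }
}
```

Because the high bits are always zero, every value coming out of chars() lies in the range 0 to 65535 and can be cast back to char without loss.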
I am guessing it is documented because, looked at in one way, the operation treats the char value as a 16-bit integer, and when converting from an integer to a larger integer, the user of a library method such as this must know how the sign is treated.
For those that don't know, a signed integer value reserves its highest-order bit as a 'sign bit'; if it is 1, the number is negative. When converting to a larger integer, if the highest-order bit is copied into all the 'extra' bits in the new value, we say the conversion is "sign extended". Only if this is done will the new integer have the same signed numeric value as the smaller signed integer. If the smaller integer value is unsigned, then the highest-order bit represents a value just like all the other bits, and only without sign extension will the values be the same.
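The difference can be seen side by side in Java by widening the same 16-bit pattern once as a short (signed, so sign-extended) and once as a char (unsigned, so zero-extended). This is only a sketch; the class name ExtensionComparison and its two helper methods are invented for the example.

```java
class ExtensionComparison {
    // short is signed: widening copies the top bit into the upper 16 bits.
    static int signExtend(short s) {
        return s;
    }

    // char is unsigned: widening fills the upper 16 bits with zeroes.
    static int zeroExtend(char c) {
        return c;
    }

    public static void main(String[] args) {
        short s = (short) 0x8000; // bit pattern 1000 0000 0000 0000
        char c = (char) 0x8000;   // same 16-bit pattern

        System.out.println(signExtend(s));                      // -32768
        System.out.println(Integer.toHexString(signExtend(s))); // ffff8000

        System.out.println(zeroExtend(c));                      // 32768
        System.out.println(Integer.toHexString(zeroExtend(c))); // 8000

        // Masking off the upper bits zero-extends a short by hand.
        System.out.println(s & 0xFFFF);                         // 32768
    }
}
```

In both cases the numeric value is preserved under the source type's own interpretation: -32768 stays -32768 for the signed short, and 32768 stays 32768 for the unsigned char.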
Apart from char, the Java language does not have unsigned integer data types, so a conversion without sign extension could be regarded as unusual.
If one had a constant representing the (signed) integer value of a particular character, then after conversion to a 32-bit integer, the constant would only still be accurate if the conversion included sign extension. So it is important to know how any given conversion treats the original value.