I've come across some code which has the bit masks 0xff and 0xff00, or in 16-bit binary form 00000000 11111111 and 11111111 00000000.
/**
 * Function to check if the given string is in GZIP Format.
 *
 * @param inString String to check.
 * @return True if GZIP Compressed otherwise false.
 */
public static boolean isStringCompressed(String inString)
{
    try
    {
        byte[] bytes = inString.getBytes("ISO-8859-1");
        int gzipHeader = ((int) bytes[0] & 0xff)
                | ((bytes[1] << 8) & 0xff00);
        return GZIPInputStream.GZIP_MAGIC == gzipHeader;
    } catch (Exception e)
    {
        return false;
    }
}
I'm trying to work out what the purpose of these bit masks is in this context (against a byte array). I can't see what difference they would make.
In the context of a GZip compressed string, which this method seems to be written for, the GZip magic number is 35615, which is 8B1F in hex and 10001011 00011111 in binary.
Am I correct in thinking this swaps the bytes? So, for example, say my input string were \u001f\u008b:
bytes[0] & 0xff
bytes[0] = 1f = 00011111
& ff = 11111111
--------
= 00011111
bytes[1] << 8
bytes[1] = 8b = 10001011
<< 8 = 10001011 00000000
((bytes[1] << 8) & 0xff00)
= 10001011 00000000 & 0xff00
= 10001011 00000000
11111111 00000000 &
-------------------
10001011 00000000
So
00000000 00011111
10001011 00000000 |
-----------------
10001011 00011111 = 8B1F
To me it doesn't seem like the & is doing anything to the original byte in either case, bytes[0] & 0xff or (bytes[1] << 8) & 0xff00. What am I missing?
int gzipHeader = ((int) bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);
The type byte in Java is signed. If you cast a byte to an int, its sign will be extended. The & 0xff is there to mask out the 1 bits that you get from sign extension, effectively treating the byte as if it is unsigned.
Likewise for 0xff00, except that the byte is first shifted 8 bits to the left.
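To make the sign extension concrete, here is a minimal sketch using the byte 0x8b from the GZip magic number. The class and method names (SignExtensionDemo, widen, widenUnsigned) are made up for illustration and are not part of the code in the question.

```java
public class SignExtensionDemo {
    // Plain widening conversion: the sign bit of the byte is copied
    // into the upper 24 bits of the int.
    static int widen(byte b) {
        return b;
    }

    // Widening followed by & 0xff: the upper 24 bits are cleared,
    // so the byte is treated as an unsigned value in 0..255.
    static int widenUnsigned(byte b) {
        return b & 0xff;
    }

    public static void main(String[] args) {
        byte b = (byte) 0x8b; // top bit set, so the byte value is -117

        System.out.println(Integer.toHexString(widen(b)));         // ffffff8b
        System.out.println(Integer.toHexString(widenUnsigned(b))); // 8b
    }
}
```

For bytes below 0x80, such as 0x1f, both methods return the same value, which is why the mask can look like a no-op when you only try small values.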
So, what this does is:
- take the first byte, bytes[0], cast it to int and mask out the sign-extended bits (treating the byte as if it is unsigned)
- take the second byte, bytes[1], cast it to int, shift it left by 8 bits, and mask out the sign-extended bits
- combine the two values with |
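Note that the shift left puts the second byte in the high-order position, so the two bytes end up swapped relative to their order in the array (the header is read as a little-endian 16-bit value).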
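As a rough illustration of why the masks matter for the GZip header in particular, the sketch below builds the header from the bytes 0x1f, 0x8b once with the masks and once without. The class and method names (GzipHeaderDemo, headerWithMasks, headerWithoutMasks) are hypothetical; only the masked expression comes from the code in the question.

```java
public class GzipHeaderDemo {
    // Same expression as in the question: both bytes are treated as unsigned.
    static int headerWithMasks(byte[] bytes) {
        return (bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);
    }

    // The same combination with the masks removed, for comparison only.
    // bytes[1] is sign-extended before the shift, so the upper 16 bits
    // of the result are set whenever bytes[1] is negative.
    static int headerWithoutMasks(byte[] bytes) {
        return bytes[0] | (bytes[1] << 8);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0x1f, (byte) 0x8b};

        System.out.println(Integer.toHexString(headerWithMasks(bytes)));    // 8b1f
        System.out.println(Integer.toHexString(headerWithoutMasks(bytes))); // ffff8b1f
    }
}
```

Since 0x8b has its top bit set, the unmasked result is 0xffff8b1f rather than 0x8b1f, so a comparison against 35615 would fail even for valid GZip data.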
Apparently the purpose is to read the first two bytes of bytes and store them in gzipHeader by suitable masking and shifting. More precisely, the first part keeps exactly the first byte, while the second part keeps the second byte, already shifted by 8 bits. The | combines both masked values into an int.
The resulting value is compared against the defined value GZIPInputStream.GZIP_MAGIC to determine whether the first two bytes are the defined beginning of data compressed with gzip.
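One way to try this check against real data is to compress a string with GZIPOutputStream and test the first two bytes of the output. This is only a sketch: the class GzipMagicCheck and its methods startsWithGzipMagic and gzip are names invented here, and the check works on a byte array directly rather than on a String as in the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipMagicCheck {
    // Returns true if the first two bytes form the GZip magic number.
    static boolean startsWithGzipMagic(byte[] bytes) {
        if (bytes == null || bytes.length < 2) {
            return false;
        }
        int header = (bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);
        return header == GZIPInputStream.GZIP_MAGIC;
    }

    // Helper that compresses a string so there is real GZip data to test.
    static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] plain = "hello".getBytes(StandardCharsets.UTF_8);

        System.out.println(startsWithGzipMagic(gzip("hello"))); // true
        System.out.println(startsWithGzipMagic(plain));         // false
    }
}
```

The explicit length check replaces the catch-all exception handler from the question; either approach returns false for inputs shorter than two bytes.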