Bit shifting and bit mask - sample code

Question

I've come across some code that uses the bit masks 0xff and 0xff00, or in 16-bit binary form 00000000 11111111 and 11111111 00000000.

/**
 * Function to check if the given string is in GZIP Format.
 *
 * @param inString String to check.
 * @return True if GZIP Compressed otherwise false.
 */
public static boolean isStringCompressed(String inString)
{
    try
    {
        byte[] bytes = inString.getBytes("ISO-8859-1");
        int gzipHeader = ((int) bytes[0] & 0xff)
            | ((bytes[1] << 8) & 0xff00);
        return GZIPInputStream.GZIP_MAGIC == gzipHeader;
    } catch (Exception e)
    {
        return false;
    }
}

I'm trying to work out what the purpose of these bit masks is in this context (against a byte array). I can't see what difference they would make.

This method seems to be written for GZip compressed strings, where the GZip magic number is 35615, which is 8B1F in hex and 10001011 00011111 in binary.

Am I correct in thinking this swaps the bytes? So for example say my input string were \u001f\u008b

bytes[0] & 0xff
 bytes[0] = 1f = 00011111
          & ff = 11111111
                 --------
               = 00011111

bytes[1] << 8
 bytes[1] = 8b = 10001011
          << 8 = 10001011 00000000

((bytes[1] << 8) & 0xff00)
= 10001011 00000000 & 0xff00
= 10001011 00000000 
  11111111 00000000 &
-------------------
  10001011 00000000

So

00000000 00011111
10001011 00000000 |
-----------------
10001011 00011111 = 8B1F

To me it doesn't seem like the & is doing anything to the original byte in either case (bytes[0] & 0xff and (bytes[1] << 8) & 0xff00). What am I missing?

Jesper · Accepted Answer

int gzipHeader = ((int) bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);

The type byte in Java is signed. If you cast a byte to an int, its sign will be extended: a byte with its high bit set becomes an int whose upper 24 bits are all 1. The & 0xff masks out the 1 bits that you get from sign extension, effectively treating the byte as if it were unsigned.

Likewise for 0xff00, except that the byte is first shifted 8 bits to the left.
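
To make the sign extension visible, here is a small sketch using the byte 0x8b from the question. The class and method names (SignExtensionDemo, widen, widenUnsigned) are made up for illustration and are not part of the original code.

```java
public class SignExtensionDemo {
    // Widen a byte to int the way Java does it implicitly (with sign extension).
    static int widen(byte b) {
        return b;
    }

    // Widen a byte to int but keep only the low 8 bits (treat the byte as unsigned).
    static int widenUnsigned(byte b) {
        return b & 0xff;
    }

    public static void main(String[] args) {
        byte b = (byte) 0x8b; // high bit is set, so as a signed byte this is -117

        System.out.println(Integer.toHexString(widen(b)));          // ffffff8b
        System.out.println(Integer.toHexString(widenUnsigned(b)));  // 8b
        System.out.println(Integer.toHexString(b << 8));            // ffff8b00
        System.out.println(Integer.toHexString((b << 8) & 0xff00)); // 8b00
    }
}
```

For a byte such as 0x1f, whose high bit is clear, the masked and unmasked results are identical, which is why the mask looks like a no-op in the worked example for bytes[0].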

So, what this does is:

1. take the first byte, bytes[0], cast it to int and mask out the sign-extended bits (treating the byte as if it is unsigned)
2. take the second byte, cast it to int, shift it left by 8 bits, and mask out the sign-extended bits
3. combine the values with |
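
The steps above can be sketched as separate statements, next to a version without the masks for comparison. This is only an illustration; GzipHeaderSteps, readHeader and readHeaderUnmasked are hypothetical names, and the input array is assumed to hold at least two bytes.

```java
public class GzipHeaderSteps {
    // Same computation as the one-liner, split into the three steps.
    static int readHeader(byte[] bytes) {
        int low = bytes[0] & 0xff;           // first byte, sign-extended bits removed
        int high = (bytes[1] << 8) & 0xff00; // second byte, shifted, sign-extended bits removed
        return low | high;                   // combine both parts
    }

    // For comparison only: the same expression without the masks.
    static int readHeaderUnmasked(byte[] bytes) {
        return bytes[0] | (bytes[1] << 8);
    }

    public static void main(String[] args) {
        byte[] bytes = { 0x1f, (byte) 0x8b };

        System.out.println(Integer.toHexString(readHeader(bytes)));         // 8b1f
        System.out.println(Integer.toHexString(readHeaderUnmasked(bytes))); // ffff8b1f
    }
}
```

Without the masks, the sign-extended 1 bits of bytes[1] leak into the upper half of the int, so the result would never equal 0x8b1f.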

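Note that the left shift moves the second byte into the high-order position, so the two bytes end up swapped relative to their order in the array. In other words, they are read as a little-endian 16-bit value.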
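
Another way to look at the byte order: since bytes[1] ends up in the high-order position, the expression amounts to reading the first two bytes as a little-endian 16-bit value. A possible equivalent using java.nio.ByteBuffer is sketched below. The class name LittleEndianHeader is hypothetical, and this is an alternative formulation rather than what the original method does.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianHeader {
    // Read the first two bytes as an unsigned little-endian 16-bit value.
    // Assumes the array holds at least two bytes.
    static int readHeader(byte[] bytes) {
        short value = ByteBuffer.wrap(bytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort();
        // getShort() returns a signed short, so mask again to avoid sign extension.
        return value & 0xffff;
    }

    public static void main(String[] args) {
        byte[] bytes = { 0x1f, (byte) 0x8b };
        System.out.println(Integer.toHexString(readHeader(bytes))); // 8b1f
    }
}
```

Even here a mask is needed, because short is signed just like byte.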

Codor · Answer

Apparently the purpose is to read the first two bytes and store them as one 16-bit word in gzipHeader by suitable masking and shifting. More precisely, the first part keeps exactly the 8 bits of the first byte, while the second part keeps the 8 bits of the second byte after it has been shifted left by 8 bits. The | combines both masked values into a single int.

The resulting value is compared against the defined value GZIPInputStream.GZIP_MAGIC to determine if the first two bytes are the defined beginning of data compressed with gzip.
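
As a quick check of that comparison, the sketch below compresses a short string with GZIPOutputStream and tests the first two bytes of the result. It works on byte arrays rather than on a String, adds a length check, and uses hypothetical names (GzipMagicCheck, startsWithGzipMagic, gzip), so treat it as an illustration and not as the original method.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipMagicCheck {
    // True if the first two bytes form the GZIP magic number (0x8b1f).
    static boolean startsWithGzipMagic(byte[] bytes) {
        if (bytes == null || bytes.length < 2) {
            return false;
        }
        int header = (bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);
        return header == GZIPInputStream.GZIP_MAGIC;
    }

    // Helper for the demo: gzip-compress a byte array in memory.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] plain = "hello".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(startsWithGzipMagic(gzip(plain))); // true
        System.out.println(startsWithGzipMagic(plain));       // false
    }
}
```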
