Can someone explain the conversion from byte array to hex string?

I recently started looking at MD5 hashing (in Java) and while I've found algorithms and methods to help me accomplish that, I'm left wondering how it actually works.

For one, I found the following from this URL:

private static String convertToHex(byte[] data) {
    StringBuffer buf = new StringBuffer();
    for (int i = 0; i < data.length; i++) {
        int halfbyte = (data[i] >>> 4) & 0x0F;
        int two_halfs = 0;
        do {
            if ((0 <= halfbyte) && (halfbyte <= 9))
                buf.append((char) ('0' + halfbyte));
            else
                buf.append((char) ('a' + (halfbyte - 10)));
            halfbyte = data[i] & 0x0F;
        } while (two_halfs++ < 1);
    }
    return buf.toString();
}

I haven't found any need to use bit-shifting in Java, so I'm a bit rusty on that. Would someone be kind enough to illustrate (in simple terms) how exactly the above code does the conversion? And what does ">>>" do?

I also found other solutions on StackOverflow, such as here and here, which use BigInteger instead:

try {
   String s = "TEST STRING";
   MessageDigest md5 = MessageDigest.getInstance("MD5");
   md5.update(s.getBytes(StandardCharsets.UTF_8)); // hash every encoded byte; s.length() counts chars, not bytes
   String signature = new BigInteger(1,md5.digest()).toString(16);
   System.out.println("Signature: "+signature);

} catch (final NoSuchAlgorithmException e) {
   e.printStackTrace();
}

Why does that work too, and which way is more efficient?

Thanks for your time.

aberrant80 asked Dec 13 '22 03:12


1 Answer

private static String convertToHex(byte[] data) {
    StringBuffer buf = new StringBuffer();
    for (int i = 0; i < data.length; i++) {

Up to this point it is just basic setup: create the buffer and start a loop over every byte in the array.

        int halfbyte = (data[i] >>> 4) & 0x0F;

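A byte is two hex digits, or eight binary digits, depending on which base you view it in. The statement above shifts the high 4 bits down (>>> is the unsigned right shift) and logically ANDs the result with 0000 1111, so that the result is an integer equal to the high 4 bits of the byte (the first hex digit). The mask matters because Java promotes the byte to a sign-extended int before shifting; for a negative byte the shifted value still carries extra 1 bits above the low four, and & 0x0F discards them.

Say the input byte is 23 (decimal), which is 0001 0111 in binary. The shift and logical AND convert this to 0000 0001.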
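The high-nibble step can be tried in isolation. This is only a minimal sketch: the class and method names (HighNibbleDemo, highNibble) are made up for illustration and are not part of the code in the question.

```java
// Illustrative sketch; class and method names are hypothetical.
class HighNibbleDemo {
    // Returns the upper four bits of b as an int in the range 0..15.
    static int highNibble(byte b) {
        // b is promoted to a sign-extended int before the shift,
        // so the mask is what discards the extra high bits.
        return (b >>> 4) & 0x0F;
    }

    public static void main(String[] args) {
        System.out.println(highNibble((byte) 23));   // 0001 0111 -> prints 1
        System.out.println(highNibble((byte) 0xAB)); // 1010 1011 -> prints 10
    }
}
```

The second call uses a negative byte (0xAB is -85 as a Java byte), which is the case where leaving out the mask would give a wrong result.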

        int two_halfs = 0;
        do {

This just sets up the do/while loop to run twice, once for each hex digit of the byte.

            if ((0 <= halfbyte) && (halfbyte <= 9))
                buf.append((char) ('0' + halfbyte));
            else
                buf.append((char) ('a' + (halfbyte - 10)));

Here we append the actual hex digit, using the '0' or 'a' character as a starting point and adding an offset to reach the correct character. The if branch covers the digits 0-9, and the else branch covers the values 10-15 (a-f in hex).

Continuing our example, 0000 0001 equals 1 in decimal. We land in the if branch, add 1 to the '0' character to get the character '1', append that to the buffer, and move on.
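The character arithmetic can be pulled out into a small helper to see it work on its own. This is a sketch under the assumption that the input is already in the range 0..15; the names HexDigitDemo and toHexDigit are hypothetical.

```java
// Illustrative sketch; class and method names are hypothetical.
class HexDigitDemo {
    // Maps a value in 0..15 to its lowercase hex character.
    static char toHexDigit(int nibble) {
        if (0 <= nibble && nibble <= 9) {
            return (char) ('0' + nibble);       // 0..9  -> '0'..'9'
        }
        return (char) ('a' + (nibble - 10));    // 10..15 -> 'a'..'f'
    }

    public static void main(String[] args) {
        System.out.println(toHexDigit(1));  // prints 1
        System.out.println(toHexDigit(10)); // prints a
        System.out.println(toHexDigit(15)); // prints f
    }
}
```

This works because the characters '0' to '9' and 'a' to 'f' each occupy consecutive code points, so adding an offset to the first character of a run lands on the right one.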

            halfbyte = data[i] & 0x0F;

Now we set the integer to the low 4 bits of the byte and repeat.

Again, with an input of 23, 0001 0111 after the logical AND becomes 0000 0111, which is 7 in decimal. The same logic as above applies, and the character '7' is appended.
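Putting the two halves together for a single byte might look like the sketch below. It swaps the if/else for the standard Character.forDigit method, which maps 0..15 to the same lowercase characters when given radix 16; the names ByteToHexDemo and twoDigits are hypothetical.

```java
// Illustrative sketch; class and method names are hypothetical.
class ByteToHexDemo {
    // Converts one byte to exactly two lowercase hex characters.
    static String twoDigits(byte b) {
        int high = (b >>> 4) & 0x0F; // first hex digit
        int low = b & 0x0F;          // second hex digit
        // Character.forDigit maps 0..15 to '0'..'9', 'a'..'f' for radix 16.
        return "" + Character.forDigit(high, 16) + Character.forDigit(low, 16);
    }

    public static void main(String[] args) {
        System.out.println(twoDigits((byte) 23));   // prints 17
        System.out.println(twoDigits((byte) 5));    // prints 05
        System.out.println(twoDigits((byte) 0xAB)); // prints ab
    }
}
```

Note that the decimal value 23 comes out as "17", since 23 is 0x17, and that a small byte such as 5 still yields two characters.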

        } while (two_halfs++ < 1);

The condition two_halfs++ < 1 is true only after the first pass, so the body runs exactly twice. After that we move on to the next byte in the array and repeat.
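If the post-increment in the loop condition is hard to follow, counting the passes makes it concrete. A minimal sketch follows; the counter and the names DoWhileTwiceDemo and countPasses are added purely for illustration.

```java
// Illustrative sketch; class and method names are hypothetical.
class DoWhileTwiceDemo {
    // Counts how often a do/while with the question's condition executes its body.
    static int countPasses() {
        int passes = 0;
        int two_halfs = 0;
        do {
            passes++;
            // First check compares 0 < 1 (true), second compares 1 < 1 (false).
        } while (two_halfs++ < 1);
        return passes;
    }

    public static void main(String[] args) {
        System.out.println(countPasses()); // prints 2
    }
}
```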

    }
    return buf.toString();
}
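For comparison, the same conversion can be written without the inner do/while. This is a rewrite for illustration, not the code from the question: it assumes a lookup table (HEX) in place of the character arithmetic and a StringBuilder in place of StringBuffer, and the class name HexConverter is hypothetical.

```java
// Illustrative rewrite; HexConverter and HEX are hypothetical names.
class HexConverter {
    // Index i holds the hex character for the value i (0..15).
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    static String convertToHex(byte[] data) {
        // Each byte produces exactly two characters.
        StringBuilder buf = new StringBuilder(data.length * 2);
        for (byte b : data) {
            buf.append(HEX[(b >>> 4) & 0x0F]); // high 4 bits
            buf.append(HEX[b & 0x0F]);         // low 4 bits
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {23, (byte) 0xAB, 0};
        System.out.println(convertToHex(sample)); // prints 17ab00
    }
}
```

The two appends per byte play the role of the two passes through the do/while loop.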

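To answer your next question: the Java API already has a base conversion utility built into BigInteger. See the toString(int radix) documentation. The BigInteger(1, bytes) constructor treats the byte array as the magnitude of a positive number, and toString(16) prints that number in base 16.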
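One difference is worth checking before swapping the two approaches: BigInteger prints a number, so leading zero bytes do not appear in its output, whereas the loop always emits two characters per byte. The sketch below shows the difference and one possible way to pad; the names BigIntegerHexDemo, viaBigInteger and padded are hypothetical.

```java
import java.math.BigInteger;

// Illustrative sketch; class and method names are hypothetical.
class BigIntegerHexDemo {
    // Same call chain as in the question: signum 1 means "positive magnitude".
    static String viaBigInteger(byte[] data) {
        return new BigInteger(1, data).toString(16);
    }

    // Left-pads with '0' so the result has two characters per byte.
    static String padded(byte[] data) {
        String hex = viaBigInteger(data);
        StringBuilder sb = new StringBuilder();
        for (int i = hex.length(); i < data.length * 2; i++) {
            sb.append('0');
        }
        return sb.append(hex).toString();
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x0F, 0x17};
        System.out.println(viaBigInteger(data)); // prints f17
        System.out.println(padded(data));        // prints 000f17
    }
}
```

For an MD5 digest this means the unpadded string can be shorter than 32 characters whenever the digest starts with a zero nibble.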

Not knowing the implementation used by the Java API, I can't say for sure, but I'd be willing to bet that the Java implementation is more efficient than the first, somewhat simple, algorithm you posted.
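Rather than guessing, both conversions can be run against a real digest to confirm they agree. The sketch below only checks equivalence and makes no claim about speed, which would need a proper benchmark. The names Md5HexDemo, md5, manualHex and bigIntegerHex are hypothetical, and String.format is used for padding as one assumed option among several.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch; class and method names are hypothetical.
class Md5HexDemo {
    // Computes the 16-byte MD5 digest of the UTF-8 encoding of s.
    static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Shift-and-mask conversion, two characters per byte.
    static String manualHex(byte[] data) {
        StringBuilder buf = new StringBuilder(data.length * 2);
        for (byte b : data) {
            buf.append(Character.forDigit((b >>> 4) & 0x0F, 16));
            buf.append(Character.forDigit(b & 0x0F, 16));
        }
        return buf.toString();
    }

    // BigInteger conversion, zero-padded to two characters per byte.
    // Assumes data is non-empty, which holds for any digest.
    static String bigIntegerHex(byte[] data) {
        return String.format("%0" + (data.length * 2) + "x", new BigInteger(1, data));
    }

    public static void main(String[] args) {
        byte[] digest = md5("TEST STRING");
        System.out.println(manualHex(digest));
        System.out.println(bigIntegerHex(digest));
        System.out.println(manualHex(digest).equals(bigIntegerHex(digest))); // prints true
    }
}
```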

tschaible answered Dec 15 '22 18:12