I recently started looking at MD5 hashing (in Java) and while I've found algorithms and methods to help me accomplish that, I'm left wondering how it actually works.
For one, I found the following from this URL:
private static String convertToHex(byte[] data) {
    StringBuffer buf = new StringBuffer();
    for (int i = 0; i < data.length; i++) {
        int halfbyte = (data[i] >>> 4) & 0x0F;
        int two_halfs = 0;
        do {
            if ((0 <= halfbyte) && (halfbyte <= 9))
                buf.append((char) ('0' + halfbyte));
            else
                buf.append((char) ('a' + (halfbyte - 10)));
            halfbyte = data[i] & 0x0F;
        } while(two_halfs++ < 1);
    }
    return buf.toString();
}
I haven't found any need to use bit-shifting in Java so I'm a bit rusty on that. Would someone be kind enough to illustrate (in simple terms) how exactly the above code does the conversion? And what does ">>>" do?
I also found other solutions on StackOverflow, such as here and here, which use BigInteger instead:
try {
    String s = "TEST STRING";
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    // Hash the whole byte array; s.length() counts chars, not bytes
    md5.update(s.getBytes());
    String signature = new BigInteger(1, md5.digest()).toString(16);
    System.out.println("Signature: " + signature);
} catch (final NoSuchAlgorithmException e) {
    e.printStackTrace();
}
Why does that work too, and which way is more efficient?
Thanks for your time.
private static String convertToHex(byte[] data) {
StringBuffer buf = new StringBuffer();
for (int i = 0; i < data.length; i++) {
Up to this point it's just basic setup: create the buffer and start a loop that goes through every byte in the array.
int halfbyte = (data[i] >>> 4) & 0x0F;
A byte is 8 binary digits, which is exactly two hex digits. The above statement shifts the high 4 bits down (>>> is the unsigned right shift) and logical ANDs the result with 0000 1111, so that the result is an integer equal to the high 4 bits of the byte (the first hex digit). The AND matters: the byte is promoted to a signed int before the shift, so for a negative byte the shifted value still has extra high bits set, and the mask throws them away.
Say the input byte is 23 (decimal), which is 0001 0111 in binary. The shift and the logical AND convert this to 0000 0001.
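If it helps to see the shift and the mask in isolation, here is a minimal sketch. The class and method names (NibbleDemo, highNibble, lowNibble) are made up for illustration and are not part of the code being explained. It also shows what happens with a negative byte, where the promotion to int sign-extends the value before the shift.

```java
class NibbleDemo {
    // High 4 bits of a byte, as an int in the range 0..15.
    static int highNibble(byte b) {
        // b is promoted to int (with sign extension) before the shift,
        // so the mask is what discards the extra high bits.
        return (b >>> 4) & 0x0F;
    }

    // Low 4 bits of a byte, as an int in the range 0..15.
    static int lowNibble(byte b) {
        return b & 0x0F;
    }

    public static void main(String[] args) {
        byte positive = 23;          // 0001 0111
        byte negative = (byte) 0xAB; // 1010 1011, which is -85 as a signed byte

        // prints "1 7"
        System.out.println(highNibble(positive) + " " + lowNibble(positive));
        // prints "10 11" (hex digits a and b)
        System.out.println(highNibble(negative) + " " + lowNibble(negative));
        // Without the mask the sign-extended bits leak through: prints 268435450
        System.out.println(negative >>> 4);
    }
}
```

So for the example byte 23 the two halves are 1 and 7, and for a negative byte the result is only correct because of the & 0x0F.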
int two_halfs = 0;
do {
This just sets up the do/while loop to run twice, once for each half of the byte.
if ((0 <= halfbyte) && (halfbyte <= 9))
buf.append((char) ('0' + halfbyte));
else
buf.append((char) ('a' + (halfbyte - 10)));
Here we're producing the actual hex digit, basically just using the '0' or 'a' character as a starting point and offsetting up to the correct character. The if branch covers the digits 0-9, and the else branch covers the values 10-15 (a-f in hex).
Again, using our example: 0000 0001 is equal to 1 in decimal. We land in the if branch and add 1 to the '0' character to get the character '1', append that to the buffer and move on.
halfbyte = data[i] & 0x0F;
Now we set the integer to the low 4 bits of the byte and repeat.
Again, if our input was 23: 0001 0111 after the logical AND becomes just 0000 0111, which is 7 in decimal. Repeat the same logic as above and the character '7' is appended, so the byte 23 comes out as the two characters "17".
} while(two_halfs++ < 1);
Now we just move on to the next byte in the array and repeat.
}
return buf.toString();
}
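Putting the walkthrough together, the same idea can be written more compactly. This is only a sketch of an equivalent approach, not the code from the question: it swaps the if/else arithmetic for a lookup table and uses StringBuilder, and the names HexSketch, HEX_DIGITS and toHex are my own.

```java
class HexSketch {
    // Index n (0..15) maps straight to its hex character.
    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    static String toHex(byte[] data) {
        // Every byte becomes exactly two characters.
        StringBuilder buf = new StringBuilder(data.length * 2);
        for (byte b : data) {
            buf.append(HEX_DIGITS[(b >>> 4) & 0x0F]); // high 4 bits first
            buf.append(HEX_DIGITS[b & 0x0F]);         // then the low 4 bits
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        // prints "1700ab"
        System.out.println(toHex(new byte[] {23, 0, (byte) 0xAB}));
    }
}
```

The two appends per byte do the same job as the do/while loop that runs twice, and the table lookup does the same job as adding the half byte to '0' or 'a'.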
To answer your next question, the Java API already has a base conversion utility built into BigInteger. See the toString(int radix) documentation.
Not knowing the implementation used by the Java API, I can't say for sure, but I'd be willing to bet that the Java implementation is more efficient than the first, somewhat simple algorithm you posted.
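Efficiency aside, one thing worth checking is that the two approaches do not always produce the same string. BigInteger treats the bytes as one big number, so toString(16) does not print leading zeros, whereas the manual loop always emits two characters per byte. Below is a small sketch of the difference using a fixed two-byte array instead of a real MD5 digest. The class and method names are hypothetical, and the padded variant is just one possible way to get a fixed width (it assumes a non-empty array).

```java
import java.math.BigInteger;

class BigIntegerHexSketch {
    // The approach from the question: interpret the bytes as a positive number.
    static String viaBigInteger(byte[] digest) {
        return new BigInteger(1, digest).toString(16);
    }

    // Same conversion, left-padded with zeros to two characters per byte.
    // Assumes digest is non-empty, since a width of 0 is not a valid format.
    static String viaBigIntegerPadded(byte[] digest) {
        String format = "%0" + (digest.length * 2) + "x";
        return String.format(format, new BigInteger(1, digest));
    }

    public static void main(String[] args) {
        byte[] sample = {0x0A, 0x17}; // stand-in for a digest whose first half byte is 0

        System.out.println(viaBigInteger(sample));       // prints "a17"
        System.out.println(viaBigIntegerPadded(sample)); // prints "0a17"
    }
}
```

For an MD5 digest this means the BigInteger version can come back shorter than 32 characters whenever the digest starts with a zero half byte.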