
How does a person go about learning Java? (convert byte array to hex string)

Tags: java, md5

I know this sounds like a broad question but I can narrow it down with an example. I am VERY new at Java. For one of my "learning" projects, I wanted to create an in-house MD5 file hasher for us to use. I started off very simple by attempting to hash a string and then moving on to a file later. I created a file called MD5Hasher.java and wrote the following:

import java.security.*;
import java.io.*;
public class MD5Hasher{
    public static void main(String[] args){
        String myString = "Hello, World!";
        byte[] myBA = myString.getBytes();
        MessageDigest myMD;
        try{
            myMD = MessageDigest.getInstance("MD5");
            myMD.update(myBA);
            byte[] newBA = myMD.digest();
            String output = newBA.toString();
            System.out.println("The Answer Is: " + output);
        } catch(NoSuchAlgorithmException nsae){
            // print error here
        }
    }
}

I visited java.sun.com to view the javadocs for java.security to find out how to use MessageDigest class. After reading I knew that I had to use a "getInstance" method to get a usable MessageDigest object I could use. The Javadoc went on to say "The data is processed through it using the update methods." So I looked at the update methods and determined that I needed to use the one where I fed it a byte array of my string, so I added that part. The Javadoc went on to say "Once all the data to be updated has been updated, one of the digest methods should be called to complete the hash computation." I, again, looked at the methods and saw that digest returned a byte array, so I added that part. Then I used the "toString" method on the new byte array to get a string I could print. However, when I compiled and ran the code all that printed out was this:

The Answer Is: [B@4cb162d5

I have done some looking around here on StackOverflow and found some information here:

How can I generate an MD5 hash?

that gave the following example:

String plaintext = "your text here";
MessageDigest m = MessageDigest.getInstance("MD5");
m.reset();
m.update(plaintext.getBytes());
byte[] digest = m.digest();
BigInteger bigInt = new BigInteger(1,digest);
String hashtext = bigInt.toString(16);
// Now we need to zero pad it if you actually want the full 32 chars.
while(hashtext.length() < 32 ){
    hashtext = "0"+hashtext;
}

It seems the only part I MAY be missing is the "BigInteger" part, but I'm not sure.

So, after all of this, I guess what I am asking is, how do you know to use the "BigInteger" part? I wrongly assumed that the "toString" method on my newBA object would convert it to a readable output, but I was, apparently, wrong. How is a person supposed to know which way to go in Java? I have a background in C so this Java thing seems pretty weird. Any advice on how I can get better without having to "cheat" by Googling how to do something all the time?

Thank you all for taking the time to read. :-)

Brian asked Jul 13 '10 15:07


1 Answer

The key in this particular case is to realize that bytes are not "human readable", but characters are. So you need to convert the bytes to characters in a certain format. For arbitrary bytes like hashes, hexadecimal is usually used as the "human readable" format. Every byte is converted to a 2-character hexadecimal string, and these strings are then concatenated together.
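
As a rough illustration of that per-byte conversion, here is a minimal sketch. It uses String.format with the %02x conversion, which is just one possible way to get exactly two hex characters per byte and probably not the fastest one. The class name HexSketch and the method name toHex are made up for this example.

```java
final class HexSketch {
    // Hypothetical helper: renders each byte as exactly two lowercase hex characters.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // %02x means: hexadecimal, zero-padded to a width of 2.
            // For a byte argument, negative values are formatted as unsigned.
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {0x00, 0x0f, 0x10, (byte) 0xff};
        System.out.println(toHex(sample)); // prints 000f10ff
    }
}
```

Four input bytes give eight output characters, which is why a 16-byte MD5 digest ends up as a 32-character string.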

This is unrelated to the language you use. You just have to understand how it works "under the hood" in a language-agnostic way. You have to understand what you have (a byte array) and what you want (a hex string). The programming language is just a tool to achieve the desired result. You just google the "functional requirement" along with the programming language you'd like to use to achieve it, e.g. "convert byte array to hex string in java".


That said, the code example you found is wrong. You should actually examine each byte inside a loop, test whether it is less than 0x10, and if so pad it with a zero, instead of only padding zeros depending on the length of the resulting string (a short string is not necessarily caused by the first byte being less than 0x10!).

StringBuilder hex = new StringBuilder(bytes.length * 2);
for (byte b : bytes) {
    if ((b & 0xff) < 0x10) hex.append("0");
    hex.append(Integer.toHexString(b & 0xff));
}
String hexString = hex.toString();
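
Put together with the code from the question, the loop could be used roughly like this. This is only a sketch: the class name Md5HexSketch and the helpers toHex and md5Hex are invented for illustration, and the explicit UTF-8 charset is my assumption (the question calls getBytes() without a charset, which uses the platform default).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5HexSketch {
    // Hypothetical helper wrapping the loop shown above.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Pad values below 0x10 so every byte yields two characters.
            if ((b & 0xff) < 0x10) hex.append("0");
            hex.append(Integer.toHexString(b & 0xff));
        }
        return hex.toString();
    }

    // Hypothetical helper: MD5 of the text's UTF-8 bytes, as a hex string.
    static String md5Hex(String text) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        // digest(byte[]) updates with the input and then completes the hash.
        return toHex(md.digest(text.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println("The Answer Is: " + md5Hex("Hello, World!"));
    }
}
```

A quick sanity check is the RFC 1321 test suite: md5Hex("abc") should give 900150983cd24fb0d6963f7d28e17f72.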

Update: as per the comments on @extraneon's answer, using new BigInteger(byte[]) is also the wrong solution, because it doesn't unsign the bytes. Bytes in Java, like short, int and long, are signed, so they have a negative range. A byte in Java ranges from -128 to 127, while you want a range of 0 to 255 to get a proper hex string. You basically just need to remove the sign to make them unsigned. The & 0xff in the above example does exactly that.
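
To see what the masking does, a small sketch may help. The class name SignedByteSketch and the helper name unsigned are hypothetical; the masking itself is the same & 0xff as in the loop above.

```java
final class SignedByteSketch {
    // Hypothetical helper: widens the byte to int and keeps only the low 8 bits.
    static int unsigned(byte b) {
        return b & 0xff;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xff;                                    // bit pattern 1111 1111
        System.out.println(b);                                   // -1, because byte is signed
        System.out.println(unsigned(b));                         // 255
        System.out.println(Integer.toHexString(b));              // ffffffff (sign-extended to int)
        System.out.println(Integer.toHexString(unsigned(b)));    // ff
    }
}
```

Without the mask, the byte is sign-extended when it is widened to int, which is where the extra ffffff comes from.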

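The hex string obtained from new BigInteger(bytes).toString(16) is NOT compatible with the result of all the other hex-string-producing MD5 generators the world is aware of. This constructor reads the whole array as one two's-complement number, so the output differs whenever the first byte of the MD5 digest is negative (the result then starts with a minus sign), and leading zeros are dropped as well.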
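
The difference can be seen with a two-byte array standing in for a digest. This is a sketch only: the class name BigIntegerHexSketch and the helper names are hypothetical, and the BigInteger(1, bytes) variant is the one from the example quoted in the question.

```java
import java.math.BigInteger;

final class BigIntegerHexSketch {
    // Sign-aware constructor: the array is read as a two's-complement number.
    static String signedHex(byte[] bytes) {
        return new BigInteger(bytes).toString(16);
    }

    // Signum 1 treats the array as a non-negative magnitude;
    // leading zeros are still dropped by toString(16).
    static String magnitudeHex(byte[] bytes) {
        return new BigInteger(1, bytes).toString(16);
    }

    public static void main(String[] args) {
        byte[] negativeFirst = {(byte) 0x90, 0x01};
        System.out.println(signedHex(negativeFirst));    // -6fff
        System.out.println(magnitudeHex(negativeFirst)); // 9001

        byte[] leadingZero = {0x00, 0x0a};
        System.out.println(magnitudeHex(leadingZero));   // a
    }
}
```

The MD5 digest of "abc" starts with the byte 0x90, so it is one real input for which the one-argument constructor would produce a string with a minus sign.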

BalusC answered Oct 11 '22 16:10