I'm a bit of a newbie in Java and I'm trying to perform a MAC calculation on a file. Since the size of the file is not known in advance, I can't just load the whole file into memory, so I wrote the code to read it in chunks (4 KB in this case). To check the result, I also tried loading the entire file into memory to see whether both methods produce the same hash. However, they seem to produce different hashes.
Here's the chunk-by-chunk code:
FileInputStream fis = new FileInputStream("sbs.dat");
byte[] file = new byte[4096];
m = Mac.getInstance("HmacSHA1");
int i=fis.read(file);
m.init(key);
while (i != -1)
{
m.update(file);
i=fis.read(file);
}
mac = m.doFinal();
And here's the all at once approach:
File f = new File("sbs.dat");
long size = f.length();
byte[] file = new byte[(int) size];
fis.read(file);
m = Mac.getInstance("HmacSHA1");
m.init(key);
m.update(file);
mac = m.doFinal();
Shouldn't they both produce the same hash?
The question, however, is more general. Is the first snippet the correct way to read a file into memory in pieces and do whatever we want inside the while loop (send it over a socket, encrypt it, etc.)? I ask because every tutorial I've seen just loads everything at once.
Update: Working :-D. Will this approach also work properly for sending a file in pieces through a socket?
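No. You have no guarantee that fis.read(file) will read file.length bytes. This is why read() returns an int: it tells you how many bytes were actually read. In your loop, m.update(file) always feeds the whole 4096-byte buffer to the MAC, so whenever a read fills only part of the buffer (typically the last one), stale bytes left over from the previous read get hashed as well.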
You should instead do this:
m.init(key);
int i=fis.read(file);
while (i != -1)
{
m.update(file, 0, i);
i=fis.read(file);
}
This takes advantage of the Mac.update(byte[] input, int offset, int len) method, which lets you specify how much of the byte[] array actually holds data.
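To check this yourself, here is a minimal, self-contained sketch that computes the MAC both ways and compares the results. The class name ChunkedMac, the method names hmacChunked and hmacWhole, the temp file, and the demo key are all made up for illustration; they are not from the question. It uses Files.readAllBytes for the all-at-once version, since a single fis.read(file) call is not guaranteed to fill the array there either.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class ChunkedMac {

    // Streams the file through the MAC in 4 KiB chunks, passing only
    // the bytes that each read() call actually returned.
    public static byte[] hmacChunked(Path path, byte[] keyBytes)
            throws IOException, GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(keyBytes, "HmacSHA1"));
        byte[] buffer = new byte[4096];
        try (InputStream in = Files.newInputStream(path)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                mac.update(buffer, 0, n); // n may be less than buffer.length
            }
        }
        return mac.doFinal();
    }

    // Reference version: loads the whole file, then computes the MAC once.
    public static byte[] hmacWhole(Path path, byte[] keyBytes)
            throws IOException, GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(keyBytes, "HmacSHA1"));
        return mac.doFinal(Files.readAllBytes(path));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical test data: 10,000 bytes, deliberately not a multiple
        // of 4096, so the last chunk only partly fills the buffer.
        Path tmp = Files.createTempFile("mac-demo", ".dat");
        try {
            byte[] data = new byte[10_000];
            Arrays.fill(data, (byte) 0x5A);
            Files.write(tmp, data);

            byte[] key = "demo-key".getBytes(StandardCharsets.UTF_8); // illustrative key only
            boolean same = Arrays.equals(hmacChunked(tmp, key), hmacWhole(tmp, key));
            System.out.println("MACs match: " + same);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The same loop shape carries over to the socket case from the update: replace mac.update(buffer, 0, n) with out.write(buffer, 0, n) on the socket's OutputStream, so that only the bytes actually read are sent.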