I'm developing a program that needs to load and save data in external files. After looking at the options, I have chosen to save the data in a binary file.
Since I don't want anyone to be able to edit the file easily, I thought about writing the file's MD5 sum on its first line. That way, if any data in the file is changed, the sum will no longer match the one on the first line.
The problem is that if I calculate the MD5 and then write it into the file, the file's contents change and so does its sum, so the two can never match. How can I solve this?
If you can suggest a better option than a checksum, that would be equally welcome.
Thanks in advance.
Hash Generator is one tool for getting the MD5 hash of a file. Another tool, which generates several types of checksums or hashes, is MD5 & SHA Checksum Utility.
On Windows 10, an MD5 checksum can be computed natively in PowerShell with the Get-FileHash cmdlet. Open PowerShell and run "Get-FileHash <filename> -Algorithm MD5" to get the corresponding hash.
If the downloaded file comes with an MD5 file, you can open it on Windows with any text editor. Double-click the file and choose an app from the suggested list (Notepad, for example). Inside, you'll find the MD5 hash and the file name.
Even though MD5 is no longer considered safe for security purposes (it is a hash function, not encryption, and collisions can be produced deliberately), it is still a quick way to check whether a file transfer succeeded. The idea is to compute the MD5 fingerprint of the file before and after the transfer. If the two values are the same, the transfer is OK; if not, the file is corrupted.
What is your threat model?
If you just want to protect against casual fiddling, compute the MD5 of the main data, then append the sum to the end of the file. To validate, strip off the trailing sum, hash only the remaining original data, and compare the result with the stored sum.
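A minimal Java sketch of that append-and-strip scheme, working on byte arrays so the file I/O can be added however you like. The class and method names (TrailerChecksum, seal, open) are made up for illustration, and this only detects casual edits: anyone who knows the layout can recompute the trailer.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class TrailerChecksum {
    static final int MD5_LENGTH = 16; // an MD5 digest is always 16 bytes

    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns the payload followed by the 16-byte MD5 of the payload.
    static byte[] seal(byte[] payload) {
        byte[] out = Arrays.copyOf(payload, payload.length + MD5_LENGTH);
        System.arraycopy(md5(payload), 0, out, payload.length, MD5_LENGTH);
        return out;
    }

    // Strips the trailer and returns the payload if the sums match, else null.
    static byte[] open(byte[] fileBytes) {
        if (fileBytes.length < MD5_LENGTH) {
            return null; // too short to even contain a trailer
        }
        int payloadLength = fileBytes.length - MD5_LENGTH;
        byte[] payload = Arrays.copyOfRange(fileBytes, 0, payloadLength);
        byte[] stored = Arrays.copyOfRange(fileBytes, payloadLength, fileBytes.length);
        return MessageDigest.isEqual(stored, md5(payload)) ? payload : null;
    }

    public static void main(String[] args) {
        byte[] sealed = seal("score=42".getBytes(StandardCharsets.UTF_8));
        System.out.println(open(sealed) != null); // prints true: file is intact
        sealed[0] ^= 1;                           // simulate someone editing a byte
        System.out.println(open(sealed) != null); // prints false: sum no longer matches
    }
}
```

Because the sum covers only the bytes before the trailer, writing it never changes the data being hashed, which is what avoids the chicken-and-egg problem from the question.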
If you want to protect against malicious and skilled cracking, you're out of luck; any validation algorithm you use can be replicated, particularly if they have access to the program itself. Even a cryptographic signature could fail if the attacker extracts the key from the program binary.
If it's a big deal, a Unix solution is to run as setuid or setgid to a different user and write to a directory which users cannot modify. I'm not sure what a good general Java solution is, but the point remains: users shouldn't be able to modify your data because they were prevented from doing so, not because they were detected trying to.
While it is theoretically possible to make a self-referencing MD5 file (and I recall some have been found), it's a waste of resources. It is generally necessary to store the hash somewhere outside the hashed file (traditionally named md5sums or sha1sums, respectively).
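As a rough illustration of keeping the hash outside the hashed file, here is a Java sketch that builds and checks one line in the "<hex digest>  <file name>" layout that the md5sum tool prints. The class name SidecarChecksum, its methods, and the ".md5" sidecar suffix are assumptions made for this example, not an established convention of any library.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class SidecarChecksum {
    static String md5Hex(byte[] data) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("MD5").digest(data)) {
                hex.append(String.format("%02x", b)); // two lowercase hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Builds one line: hex digest, two spaces, file name.
    static String checksumLine(String fileName, byte[] content) {
        return md5Hex(content) + "  " + fileName;
    }

    // True if the digest at the start of the line matches the given content.
    static boolean matches(String checksumLine, byte[] content) {
        String stored = checksumLine.trim().split("\\s+", 2)[0];
        return stored.equalsIgnoreCase(md5Hex(content));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical data file plus a sidecar file next to it.
        Path data = Files.createTempFile("save", ".bin");
        Path sidecar = data.resolveSibling(data.getFileName() + ".md5");
        byte[] content = "level=3".getBytes(StandardCharsets.UTF_8);
        Files.write(data, content);
        Files.write(sidecar,
                checksumLine(data.getFileName().toString(), content).getBytes(StandardCharsets.UTF_8));

        String line = new String(Files.readAllBytes(sidecar), StandardCharsets.UTF_8);
        System.out.println(matches(line, Files.readAllBytes(data))); // prints true
    }
}
```

Swapping "MD5" for "SHA-1" in MessageDigest.getInstance gives the sha1sums variant with no other changes.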
This said, I'd recommend going for SHA-1 in addition to MD5.
Bill: Ted, while I agree that, in time, our band will be most triumphant. The truth is, Wyld Stallyns will never be a super band until we have Eddie Van Halen on guitar.
Ted: Yes, Bill. But, I do not believe we will get Eddie Van Halen until we have a triumphant video.
Bill: Ted, it's pointless to have a triumphant video before we even have decent instruments.
Ted: Well, how can we have decent instruments when we don't really even know how to play?
Bill: That is why we NEED Eddie Van Halen!
Ted: And THAT is why we need a triumphant video.
Bill, Ted: EXCELLENT!
Seriously, you can't calculate the MD5 sum (or some other hash) with the calculated hash embedded, so you have to store the hash somewhere else.
If you just don't want people to easily mess with the file, maybe obfuscating it via ROT13 or XOR "encryption" is an option?
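For what it's worth, XOR obfuscation is only a few lines in Java. This is a sketch with a made-up class name (XorObfuscator) and an arbitrary example key; it stops people from reading or editing the file in a text editor, but it is not real encryption and is trivially reversible by anyone who finds the key in the program.

```java
import java.nio.charset.StandardCharsets;

class XorObfuscator {
    // XORs every byte with a repeating key. Applying it twice with the
    // same key restores the original data, so one method does both jobs.
    static byte[] xor(byte[] data, byte[] key) {
        if (key.length == 0) {
            throw new IllegalArgumentException("key must not be empty");
        }
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key[i % key.length]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] key = "not-a-secret".getBytes(StandardCharsets.UTF_8); // example key only
        byte[] plain = "lives=3;level=7".getBytes(StandardCharsets.UTF_8);

        byte[] scrambled = xor(plain, key);   // what would be written to disk
        byte[] restored = xor(scrambled, key); // what the program reads back

        System.out.println(new String(restored, StandardCharsets.UTF_8)); // prints lives=3;level=7
    }
}
```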