MD5 and SHA-1 hashes have weaknesses against collision attacks. SHA256 does not but it outputs 256 bits. Can I safely take the first or last 128 bits and use that as the hash? I know it will be weaker (because it has less bits) but otherwise will it work?
Basically I want to use this to uniquely identify files in a file system that might one day contain a trillion files. I'm aware of the birthday problem and a 128 bit hash should yield about a 1 in a trillion chance on a trillion files that there would be two different files with the same hash. I can live with those odds.
What I can't live with is if somebody could easily, deliberately, insert a new file with the same hash and the same beginning characters of the file. I believe in MD5 and SHA1 this is possible.
Your application of a truncated SHA256, however, is not safe.
As far as truncating a hash goes, that's fine. It's explicitly endorsed by the NIST, and there are hash functions in the SHA-2 family that are simple truncated variants of their full brethren: SHA-256/224, SHA-512/224, SHA-512/256, and SHA-512/384, where SHA-x/y denotes a full-length SHA-x truncated to y bits.
SHA-256 generates an almost-unique 256-bit (32-byte) signature for a text. See below for the source code. A hash is not 'encryption' – it cannot be decrypted back to the original text (it is a 'one-way' cryptographic function, and is a fixed size for any size of source text).
As of 2021 technology, the chance of solving a hash with SHA256 algorithm, that is, converting it to the main input, is very very low possibility. Because a lot of mathematical (combination) processing, CPU-GPU power and electrical energy are required to solve a hash.
Yeah that will work. Theoretically it's better to XOR the two halves together but even truncated SHA256 is stronger than MD5. You should still consider the result a 128 bit hash rather than a 256 bit hash though.
My particular recommendation in this particular case is to store and reference using HASH + uniquifier where uniquifier is the count of how many distinct files you've seen with this hash before. This way you don't absolutely fall down flat if somebody tries to store future discovered collision vectors for SHA256.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With