I'm doing a bit of reading on hashing for passwords. I've seen that SHA-256 > MD5. This got me thinking about how an app may deal with changing from one hashing function to another. What happens if someone implements an app that hashes its passwords using MD5, and then decides that SHA-256 is the way to go? The password hashes already stored in the database are, of course, MD5.
What is the process for migrating the data in the database from one hashing function to another?
Hashing is a one-way function - it is specifically designed NOT to be reversible! You CANNOT un-hash a hashed value.
Hashes are lossy: regardless of how many megabytes you feed in, the output will always be of the same size. So there cannot be a mathematically exact way to get back the original value; infinitely many possible inputs result in the same hash.
Hashed data maps the original string of characters to data of a fixed length.
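To make the fixed-length point concrete, here is a minimal sketch using Python's standard `hashlib` module. The helper name `digest_sizes` is made up for this illustration; it only reports how many bytes each function produces for a given input.

```python
import hashlib


def digest_sizes(message: bytes) -> tuple[int, int]:
    """Return (MD5 digest length, SHA-256 digest length) in bytes."""
    md5_digest = hashlib.md5(message).digest()
    sha256_digest = hashlib.sha256(message).digest()
    return len(md5_digest), len(sha256_digest)


# A one-byte input and a much larger input give digests of the same size.
short_input = b"a"
long_input = b"a much longer input " * 1000

print(digest_sizes(short_input))
print(digest_sizes(long_input))
```

Both calls report the same pair of sizes, which is why the digest alone cannot encode which of the many possible inputs produced it.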
Consistent hashing is also used for partitioning in Amazon's Dynamo storage system, by the Riak key-value database, and as part of the Akamai Content Delivery Network.
It is not possible to "unhash" passwords (at least not in a general, efficient and reliable way -- you can guess some passwords, that's what attackers do, and you want to migrate from MD5 precisely because attackers may have some success at it). So the migration will be spread over time: some passwords will be hashed with MD5, others with SHA-256. When a password is to be verified: if a SHA-256 hash is stored for that account, the password is checked against it; otherwise it is checked against the MD5 hash and, if it matches, the password that was just entered is hashed with SHA-256 and that hash replaces the MD5 hash in the database.
Thus, passwords are migrated dynamically; to get rid of MD5 entirely, you have to wait a long time and/or destroy accounts which have not been accessed for a long time. You need to be able to distinguish an MD5 hash from a SHA-256 hash, which is easy since they have distinct sizes (16 bytes for MD5, 32 bytes for SHA-256). You could also add a flag or any other similar gimmick.
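The migrate-on-login idea can be sketched roughly as follows. This is only an illustration under assumptions: `store` is a plain dict standing in for the database, `verify_and_migrate` is a hypothetical name, and the hashes are raw and unsalted purely to mirror the MD5 to SHA-256 scenario in the question, which is not a recommended way to store passwords.

```python
import hashlib
import hmac

MD5_SIZE = 16     # bytes in an MD5 digest
SHA256_SIZE = 32  # bytes in a SHA-256 digest


def verify_and_migrate(store: dict, user: str, password: str) -> bool:
    """Verify a password; upgrade a legacy MD5 record on a successful login."""
    stored = store[user]
    candidate = password.encode("utf-8")

    if len(stored) == SHA256_SIZE:
        # Already migrated: check against SHA-256 only.
        return hmac.compare_digest(hashlib.sha256(candidate).digest(), stored)

    if len(stored) == MD5_SIZE:
        # Legacy record: check against MD5 first.
        if hmac.compare_digest(hashlib.md5(candidate).digest(), stored):
            # The plaintext is available right now, so re-hash and replace.
            store[user] = hashlib.sha256(candidate).digest()
            return True
        return False

    raise ValueError("unrecognized hash length")


# Hypothetical in-memory "database" with one legacy MD5 record.
store = {"alice": hashlib.md5(b"hunter2").digest()}
print(verify_and_migrate(store, "alice", "hunter2"), len(store["alice"]))
```

The record is only rewritten after a successful check, because that is the one moment the application actually holds the plaintext password. Here the digest length tells the two formats apart; an explicit flag column would work just as well.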
Please note that hashing passwords with a raw single application of a hash function is a pretty lousy way of doing it, security-wise, and replacing MD5 with SHA-256 will not really improve things. You hash passwords so that an attacker who gains read access to the database will not learn the passwords themselves. To really prevent the attacker from guessing the passwords, you also need "salts" (per-password random data, stored alongside the hashed password) and a suitably slow hash function (i.e. thousands, possibly millions, of nested hash function invocations). See this answer for details. The short answer: since you are envisioning migration, do the smart thing and migrate to bcrypt, not SHA-256 (see that answer on security.stackexchange).
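As a rough illustration of "salt plus a deliberately slow hash", here is a sketch using PBKDF2 from Python's standard library. Note the substitution: this is not bcrypt, which needs a third-party package, but `hashlib.pbkdf2_hmac` shows the same two ingredients. The function names and the iteration count are assumptions for the example, not recommendations from the answer.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str, iterations: int = ITERATIONS):
    """Return (salt, iterations, digest) for storage alongside the account."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, iterations, digest


def check_password(password: str, salt: bytes, iterations: int,
                   expected: bytes) -> bool:
    """Recompute the slow hash with the stored salt and compare."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return hmac.compare_digest(digest, expected)


salt, rounds, digest = hash_password("correct horse")
print(check_password("correct horse", salt, rounds, digest))
```

Storing the iteration count next to the salt and digest means the work factor can be raised later, using the same verify-then-rehash-on-login approach described above.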