I'm writing a password salt/hash procedure for my .NET application, largely following the guide of this article: http://www.aspheute.com/english/20040105.asp
Basically the code for computing the salted hash is this:
public string ComputeSaltedHash(string password, byte[] salt) {
    // Get password ASCII text as bytes:
    byte[] passwordBytes = System.Text.Encoding.ASCII.GetBytes(password);
    // Append the two arrays
    byte[] toHash = new byte[passwordBytes.Length + salt.Length];
    Array.Copy(passwordBytes, 0, toHash, 0, passwordBytes.Length);
    Array.Copy(salt, 0, toHash, passwordBytes.Length, salt.Length);
    byte[] computedHash = SHA1.Create().ComputeHash(toHash);
    // Return as an ASCII string
    return System.Text.Encoding.ASCII.GetString(computedHash);
}
However, I want to allow users to use Unicode chars in their password, if they like. (This seems like a good idea; can anyone think of a reason it's not?)
However, I don't know a ton about how Unicode works, and I'm worried if I just change both references of System.Text.Encoding.ASCII to System.Text.Encoding.Unicode, the hash algorithm might produce some byte combinations that don't form valid Unicode chars and the GetString call will freak out.
Is this a valid concern, or will it be OK?
Passwords that include Unicode characters force an attacker to search a significantly larger character set, which should increase the cost of cracking their hashes by a staggering amount. Well, that’s what you would hope…
Unicode – the future of passwords? Possibly… Or maybe: Unicode: How to make correcthorsebatterystaple into an amazingly strong password.
Years ago, we failed miserably whilst trying to crack a local admin password with a large rainbow table.
To integrate hashing in the password storage workflow, when the user is created, instead of storing the password in cleartext, we hash the password and store the username and hash pair in the database table. When the user logs in, we hash the password sent and compare it to the hash connected with the provided username.
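The store-then-compare workflow described above might look roughly like the sketch below. Everything named here is hypothetical: UserStore is an in-memory dictionary standing in for the database table, the 16-byte salt size is my choice, and SHA-256 is used only to keep the example short (a dedicated password-hashing function such as PBKDF2 is the more usual choice for real systems).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical in-memory stand-in for the users table.
public static class UserStore
{
    // username -> (salt, hash). The cleartext password is never stored.
    private static readonly Dictionary<string, Tuple<byte[], byte[]>> Users =
        new Dictionary<string, Tuple<byte[], byte[]>>();

    private static byte[] Hash(string password, byte[] salt)
    {
        // UTF-8 can encode any Unicode password.
        byte[] passwordBytes = Encoding.UTF8.GetBytes(password);
        byte[] toHash = passwordBytes.Concat(salt).ToArray();
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(toHash);
        }
    }

    // On user creation: generate a random salt, store salt and hash.
    public static void Register(string username, string password)
    {
        byte[] salt = new byte[16]; // assumed salt size
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        Users[username] = Tuple.Create(salt, Hash(password, salt));
    }

    // On login: re-hash the submitted password with the stored salt and compare.
    public static bool Login(string username, string password)
    {
        Tuple<byte[], byte[]> record;
        if (!Users.TryGetValue(username, out record))
        {
            return false;
        }
        return Hash(password, record.Item1).SequenceEqual(record.Item2);
    }
}
```

After UserStore.Register("alice", "pässwörd"), a call to UserStore.Login("alice", "pässwörd") succeeds, while a different password or an unknown username is rejected.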
Looking at this backwards, it was possible to determine that, in a brute-force attack, each Unicode character adds about as much complexity as 3 or more ASCII characters. So the complexity of a 3-character Unicode password is comparable to that of a 9-character ASCII password.
You shouldn't be using any normal encoding to convert from arbitrary binary data back to a string. It's not encoded text - it's just a sequence of bytes. Don't try to interpret it as if it were "normal" text. Whether the original password contains any non-ASCII characters is irrelevant to this - your current code is broken. (I would treat the linked article with a large dose of suspicion simply on that basis.)
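To make the problem concrete, here is a small sketch (the class and method names are mine, not from the article) comparing the two ways of turning raw bytes into text. Encoding.ASCII.GetString replaces every byte above 0x7F with '?', so distinct hashes can collapse into the same string, whereas Base64 keeps them distinct and reversible.

```csharp
using System;
using System.Text;

// Hypothetical helper for comparing the two byte-to-text conversions.
public static class LossyDecodingDemo
{
    // Lossy: any byte >= 0x80 is not valid ASCII and is decoded as '?'.
    public static string AsAscii(byte[] data)
    {
        return Encoding.ASCII.GetString(data);
    }

    // Lossless: every byte sequence maps to a unique ASCII-only string.
    public static string AsBase64(byte[] data)
    {
        return Convert.ToBase64String(data);
    }
}
```

For example, the byte arrays { 0x41, 0x80 } and { 0x41, 0xFF } both come out of AsAscii as the same two-character string, but AsBase64 gives two different strings that Convert.FromBase64String can turn back into the original bytes.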
I would suggest:
- Use Encoding.UTF8 to get the bytes from the password. That will allow the password to contain any Unicode character. Encoding.Unicode would be fine here too.
- Use Convert.ToBase64String to convert from the computed hash back to text. Base64 is specifically designed to represent opaque binary data in text within the ASCII character set.

It's enough to change the first reference to Unicode or UTF-8. You may want to normalize the input, however, to account for various ways of entering accents and the like.
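Putting those suggestions together, the method from the question could be reworked along these lines. This is only a sketch: the class name PasswordHasher is made up, SHA-1 is kept purely because the question uses it, and the choice of normalization form C is my assumption (the answers don't say which form to use).

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical wrapper class; the method body follows the question's code.
public static class PasswordHasher
{
    public static string ComputeSaltedHash(string password, byte[] salt)
    {
        // Assumption: normalize to NFC so that an accented letter hashes the same
        // whether it was typed as one code point or as letter + combining accent.
        string normalized = password.Normalize(NormalizationForm.FormC);

        // UTF-8 can encode any Unicode character.
        byte[] passwordBytes = Encoding.UTF8.GetBytes(normalized);

        // Append the salt to the password bytes, as in the original code.
        byte[] toHash = new byte[passwordBytes.Length + salt.Length];
        Array.Copy(passwordBytes, 0, toHash, 0, passwordBytes.Length);
        Array.Copy(salt, 0, toHash, passwordBytes.Length, salt.Length);

        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] computedHash = sha1.ComputeHash(toHash);
            // Base64 turns the opaque hash bytes into storable ASCII text.
            return Convert.ToBase64String(computedHash);
        }
    }
}
```

With the same salt, "caf\u00E9" and "cafe\u0301" (the two ways of writing "café") give the same result, and the returned string can go straight into a text column.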