This is a bit of a soft question, feel free to let me know if there's a better place for this.
I'm developing some code that accepts a password that requires international characters, so I'll need to compare an input Unicode string with a stored Unicode string. Easy enough.
My question is this: do users of international character sets generally expect normalization in such a case? My Google searches turn up conflicting opinions, from 'always do it' (http://unicode.org/faq/normalization.html) to 'don't bother'. Are there any pros or cons to not normalizing (e.g., making a password less likely to be guessed)?
I would recommend that if your password field accepts Unicode input (presumably encoded as UTF-8 or UTF-16), you normalize it before hashing and comparing. If you don't normalize, and people access the system from different environments (different operating systems, different browsers if it's a web app, or different locales), the same password may arrive in different normalization forms. For example, "é" can be sent as the single code point U+00E9 or as "e" followed by the combining acute accent U+0301. The user would then type the correct password but have it rejected; it would not be obvious why, and they would have no way to fix it.
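To make the recommendation concrete, here is a minimal Python sketch of normalizing before hashing. The choice of NFC, the PBKDF2 parameters, and the helper names (`normalize_password`, `hash_password`, `verify_password`) are my own assumptions for illustration, not something the question specifies; NFKC is another form you might consider, and whichever you pick has to be applied both when storing and when checking.

```python
import hashlib
import hmac
import os
import unicodedata


def normalize_password(password: str) -> bytes:
    # Assumption: NFC is the chosen form. The same form must be used
    # every time a password is stored or checked.
    return unicodedata.normalize("NFC", password).encode("utf-8")


def hash_password(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    # PBKDF2-HMAC-SHA256 from the standard library; the iteration count
    # here is illustrative only.
    return hashlib.pbkdf2_hmac(
        "sha256", normalize_password(password), salt, iterations
    )


def verify_password(candidate: str, salt: bytes, stored_hash: bytes) -> bool:
    # Constant-time comparison of the derived hashes.
    return hmac.compare_digest(hash_password(candidate, salt), stored_hash)


salt = os.urandom(16)
composed = "caf\u00e9"     # "é" as one code point (U+00E9)
decomposed = "cafe\u0301"  # "e" followed by combining acute accent (U+0301)

# Store the hash of one representation, then check the other.
stored = hash_password(composed, salt)
matches = verify_password(decomposed, salt, stored)
```

The two strings differ code point by code point, but after normalization they produce the same bytes and therefore the same hash, so `matches` is true. Without the `unicodedata.normalize` call, the second representation would be rejected even though the user typed the same visible characters.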