I was reading a post in Troy Hunt's blog (https://www.troyhunt.com/ive-just-launched-pwned-passwords-version-2/), about a feature called "Pwned Passwords" that checks if your password is in a database with more than 1 billion leaked passwords.
To do this check without passing your password, the client code hash it and pass just the first five chars of this hash, the backend returns all the sha1 hashes of the passwords that starts with the prefix that you passed. Then, to check if the hash of your password is in the database or not, the comparison is made on client code.
And he put some info about the data of these hashed passwords...
- Every hash prefix from 00000 to FFFFF is populated with data (16^5 combinations)
- The average number of hashes returned is 478
- The smallest is 381 (hash prefixes "E0812" and "E613D")
- The largest is 584 (hash prefixes "00000" and "4A4E8")
In the comments, people was wondering if the presence of this "00000" is a coincidence or is math...
Could someone that understands the SHA1 algorithm explain it to us?
In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecimal digits.
Using Basic Password Hashing Password hashing add a layer of security. Hashing allows passwords to be stored in a format that can't be reversed at any reasonable amount of time or cost for a hacker. Hashing algorithms turn the plaintext password into an output of characters of a fixed length.
Password hashing is defined as putting a password through a hashing algorithm (bcrypt, SHA, etc) to turn plaintext into an unintelligible series of numbers and letters. This is important for basic security hygiene because, in the event of a security breach, any compromised passwords are unintelligible to the bad actor.
Well, since the passwords originally come from data breaches, my best guess is that the password table in one of the breached systems was sorted or clustered by the (unsalted -- those are the kind of folks who get their passwords stolen) SHA1 hash of the password. When the system was breached, the attackers started with the "00000" hashes and just didn't make it all the way through...
Or maybe the list that Troy used includes the first part of an SHA1 rainbow table (https://en.wikipedia.org/wiki/Rainbow_table)...
Or something like that. The basic idea is that the SHA1 hash of the passwords was part of the password selection process.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With