Sorry for this curiosity that I have.
SHA-1 uses the characters [a-f0-9]
in its output. May I know why it doesn't use all the possible characters, [a-z0-9]?
By using all available characters it could greatly increase the number of possible different hashes, thus lowering the probability of a collision.
If you don't think this is a real question, just leave a comment and I will instantly delete it.
===
As stated in the answer, SHA-1 does NOT use only 16 chars. The correct fact is: SHA-1 is 160 bits of binary data (cit.). I have added this to prevent confusion.
In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecimal digits.
The length of the output or hash depends on the hashing algorithm you use. Hash values can be 160 bits for SHA-1 hashes, or 256 bits, 384 bits, or 512 bits for the SHA-2 family of hashes. They're typically displayed in hexadecimal characters.
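Those digest sizes are easy to confirm with Ruby's standard Digest library; a quick sketch (the input string here is arbitrary):

```ruby
require "digest"

# Hash one input with SHA-1 and two SHA-2 variants and compare
# digest sizes: 160, 256, and 512 bits, i.e. 40, 64, and 128 hex digits.
input = "hello"

puts Digest::SHA1.hexdigest(input).length     # 40  (160 bits / 4 bits per hex digit)
puts Digest::SHA256.hexdigest(input).length   # 64  (256 bits)
puts Digest::SHA512.hexdigest(input).length   # 128 (512 bits)
```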
SHA-1 is a legacy cryptographic hashing algorithm that is no longer deemed secure. Using the SHA-1 hashing algorithm in digital certificates could allow an attacker to spoof content, perform phishing attacks, or perform man-in-the-middle attacks.
You're confusing representation with content.
sha1 is 160 bits of binary data. You can just as easily represent it with:
hex: 0xf1d2d2f924e986ac86fdf7b36c94bcdf32beec15
decimal: 1380568310619656533693587816107765069100751973397
binary: 1111000111010010110100101111100100100100111010011000011010101100100001101111110111110111101100110110110010010100101111001101111100110010101111101110110000010101
base 62: yvgL3rk2cAhEsMB0YO0dMw1kAYd
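The hex, decimal, and binary renderings can be reproduced directly with Ruby's built-in Integer#to_s, which handles bases 2 through 36 (base 62 needs a hand-rolled converter):

```ruby
# One 160-bit value, three built-in renderings.
digest = 0xf1d2d2f924e986ac86fdf7b36c94bcdf32beec15

puts digest.to_s(16)   # the usual 40 hex digits
puts digest.to_s(10)   # decimal
puts digest.to_s(2)    # 160 binary digits
```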
There's nothing magical about hexadecimal. It's just a very common mechanism for showing content that breaks easily along 4-bit boundaries.
The base 62 output is generated with this little bit of Ruby:
#!/usr/bin/ruby

# Print an integer in base 62, most significant digit first,
# using 0-9, a-z, and A-Z as the 62 digits.
def chars_from_hex(s)
  c = s % 62
  s = s / 62
  chars_from_hex(s) if s > 0   # recurse first so high-order digits print first
  if c < 10
    print c
  elsif c < 36
    print "abcdefghijklmnopqrstuvwxyz"[c - 10].chr
  elsif c < 62
    print "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[c - 36].chr
  else
    puts "error c #{c}"
  end
end

chars_from_hex(0xf1d2d2f924e986ac86fdf7b36c94bcdf32beec15)
It uses the standard idiom for converting from one base to another, treating 0-9 as 0-9, a-z as 10-35, and A-Z as 36-61. It could be trivially extended to support more digits by including e.g. !@#$%^&*()-_=+\|[]{},.<>/?;:'"~` if one so desired. (Or any of the vast array of Unicode codepoints.)
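The same change-of-base idiom can be written iteratively over an arbitrary digit string; `to_base` here is a hypothetical helper for illustration, not part of the original script:

```ruby
# The standard change-of-base idiom, iterative instead of recursive,
# parameterized over an arbitrary digit alphabet (here 0-9, a-z, A-Z).
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base(n, digits = BASE62)
  base = digits.length
  out = ""
  loop do
    out = digits[n % base] + out   # peel off the lowest-order digit
    n /= base
    break if n.zero?
  end
  out
end

puts to_base(0xf1d2d2f924e986ac86fdf7b36c94bcdf32beec15)
```

Swapping in a longer digit string immediately gives a bigger base, which is all the extra punctuation characters above would buy you.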
@yes123 asked about the ascii representation of the hash specifically, so here is the result of interpreting the 160-bit hash directly as ascii:
ñÒÒù$é¬ý÷³l¼ß2¾ì
It doesn't look like much because most of those 20 byte values don't land on printable ASCII characters.
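That garbled string is just the 20 digest bytes pushed straight at a terminal. A sketch of how to get at those raw bytes, using Ruby's Array#pack:

```ruby
# Convert the 40 hex digits into 20 raw bytes ("H*" = high-nibble-first
# hex string), then show an escaped view of the unprintable result.
hex = "f1d2d2f924e986ac86fdf7b36c94bcdf32beec15"
raw = [hex].pack("H*")   # 20-byte binary string
puts raw.bytesize        # 20
puts raw.inspect         # escapes the unprintable bytes
```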
This base conversion can be practically useful, too; the Base64 encoding method uses 64 (instead of my 62) characters to represent 6 bits at a time; it needs two more characters for 'digits' and a character for padding. UUEncoding chose a different set of 'digits'. And a fellow stacker had a problem that was easily solved by changing the base of input numbers to output numbers.
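For instance, Base64's 6 bits per character means the 160-bit digest needs ceil(160/6) = 27 characters, padded out to a multiple of 4; a sketch with Ruby's standard Base64 module:

```ruby
require "base64"

# 20 bytes -> 28 Base64 characters (27 data characters plus one '=' pad).
raw = ["f1d2d2f924e986ac86fdf7b36c94bcdf32beec15"].pack("H*")
encoded = Base64.strict_encode64(raw)
puts encoded
puts encoded.length   # 28
```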