Every TOTP implementation I find (even Red Hat's FreeOTP) uses Base32 encoding/decoding for its generated secret. Why is Base64 not used, given that Base32 takes roughly 20 % more space and its main advantage is being more human-readable? The secret is not shown to the user at generation time anyway.
While every comment in these implementations says they follow RFC 6238 / RFC 4226, I cannot find anything about Base32 in those RFC documents.
It obviously makes sense to convert the secret to either Base32 or Base64 so it can be transported safely, but why not just use Base64 then?
A lower base means more digits to represent the same number. In this case, a Base64 string is 5/6 the size of the equivalent Base32 string, because Base32 gets you log2(32) = 5 bits per character, while Base64 gets you log2(64) = 6 bits per character.
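You can see the size difference directly with Python's standard `base64` module, using a 20-byte secret (the size RFC 4226 recommends):

```python
import base64
import os

secret = os.urandom(20)  # a 160-bit secret, as recommended by RFC 4226

# 20 bytes = 160 bits: Base32 needs 160/5 = 32 characters (no padding,
# since 20 is a multiple of 5), Base64 needs 28 characters (27 data
# characters rounded up to a multiple of 4 with one '=' pad).
print(len(base64.b32encode(secret)))  # 32
print(len(base64.b64encode(secret)))  # 28
```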
Base32 is the base-32 numeral system. It uses a set of 32 digits, each of which can be represented by 5 bits (2^5 = 32). One way to represent Base32 numbers in a human-readable way is with a standard 32-character set, such as the digits 0–9 plus the twenty-two upper-case letters A–V (the "base32hex" alphabet of RFC 4648).
Base32 is an encoding method that uses printable ASCII characters. In Base32, data is split into 5-bit groups, each of which is converted to an alphanumeric character (A–Z, 2–7). Every 5 input bytes produce 8 output characters; if the final block yields fewer than 8 characters, the output is padded with equals signs (=) up to a multiple of 8.
The reason Base32 is used is to avoid human error. It has nothing to do with space. The reason Base32 is not mentioned in RFC 4226 is that it has nothing to do with the private key, the HMAC, or token generation. Base32 is only used to deliver the private key to a human in a readable form.
More details, if you're interested.
The private key in TOTP should be a 20-byte (160-bit) secret. The private key is used with HMAC-SHA1 to hash the epoch time counter, and a token is extracted from the generated 160-bit HMAC.
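A minimal sketch of that generation process, using only Python's standard library (the function names are my own, and the raw key bytes are what the Base32 string decodes to):

```python
import base64
import hashlib
import hmac
import struct
import time

def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over the counter, then dynamic truncation."""
    # The counter is an 8-byte big-endian integer.
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks an offset,
    # and 31 bits starting there become the code.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % (10 ** digits)).zfill(digits)

def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with the time-step count as the moving factor."""
    key = base64.b32decode(secret_b32, casefold=True)
    return hotp(key, int(time.time()) // period, digits)
```

Note that Base32 only appears at the very edge, when decoding the user-facing secret string; the HMAC itself works on raw bytes, which is why the RFCs never mention it.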
BUT entering this secret into a tool like Google Authenticator is not easy. Yes, there is the option of a QR code that delivers the private key from a website, but that feature isn't always available.
So when the user has to enter the private key manually, you share it in Base32 format, i.e. the key is encoded so as to produce a Base32 string.
So why is Base32 better than Base64 in this case?
One important and simple reason, and why Base32 even exists, is that it uses only the uppercase letters A–Z (no lowercase) and the digits 2–7. No 0, 1, 8 or 9: 26 letters + 6 digits = 32 characters.
With no lowercase letters and no digits 0, 1, 8 or 9, the characters "i", "l", "I" and "1" cannot be confused: there is only I. Confusion between B and 8, and between 0 and O, is also eliminated.
If a 0 is entered, it can be treated as an O, a 1 as an I, and so on.
Whether the tool attempts to auto-correct, or (my preference) simply tells the user the entry is invalid, is a matter of taste. But what is clear is that ambiguous interpretations of the string, and therefore human error, are reduced significantly.
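If you do choose to auto-correct, a minimal sketch might look like this (the helper name and the exact mappings are my own choices, not from any standard; they just fold the confusable characters described above onto the Base32 alphabet):

```python
def normalize_base32(user_input: str) -> str:
    # Hypothetical helper: uppercase the input, drop spaces, and map the
    # easily-confused characters (0->O, 1->I, 8->B) into A-Z / 2-7.
    s = user_input.strip().replace(" ", "").upper()
    return s.translate(str.maketrans({"0": "O", "1": "I", "8": "B"}))

print(normalize_base32("jbsw y3dp 0l1"))  # JBSWY3DPOLI
```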
This is not the case with Base64: all of the above confusions between uppercase letters, lowercase letters and digits apply to it.
I believe it's just historical. Someone at the beginning chose Base32, a tool became popular, and descendants use the same encoding for compatibility.
I have also seen a lot of implementations using hex format, and the examples in RFC 6238 itself use hex as well.