I'm trying to encode a string using the Crockford Base32 Algorithm.
Unfortunately, my current code only accepts numeric values as input. I thought of converting the ASCII characters to Decimal or Octal, but then the concatenation of 010 and 100 results in 10100, which makes it impossible to decode this. Is there some way to do this I am not aware of?
I believe this should be a more efficient implementation of Crockford Base32 encoding:
function crockford_encode( $base10 ) {
    // base_convert() yields digits 0-9a-v; map the letters onto Crockford's alphabet.
    return strtr( base_convert( $base10, 10, 32 ),
                  "abcdefghijklmnopqrstuv",
                  "ABCDEFGHJKMNPQRSTVWXYZ" );
}
function crockford_decode( $base32 ) {
    // Accept lower case, and treat I and L as 1 and O as 0.
    $base32 = strtr( strtoupper( $base32 ),
                     "ABCDEFGHJKMNPQRSTVWXYZILO",
                     "abcdefghijklmnopqrstuv110" );
    return base_convert( $base32, 32, 10 );
}
(demo on codepad.org)
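As a quick sanity check, the two functions above can be exercised as in the sketch below. The functions are repeated only so that the snippet runs on its own, and the sample value 1234 is an arbitrary choice, not something taken from the question.

```php
<?php
function crockford_encode( $base10 ) {
    return strtr( base_convert( $base10, 10, 32 ),
                  "abcdefghijklmnopqrstuv",
                  "ABCDEFGHJKMNPQRSTVWXYZ" );
}
function crockford_decode( $base32 ) {
    $base32 = strtr( strtoupper( $base32 ),
                     "ABCDEFGHJKMNPQRSTVWXYZILO",
                     "abcdefghijklmnopqrstuv110" );
    return base_convert( $base32, 32, 10 );
}

// 1234 = 1*32^2 + 6*32 + 18, and digit 18 is "J" in Crockford's alphabet.
echo crockford_encode( 1234 ), "\n";   // 16J
echo crockford_decode( '16J' ), "\n";  // 1234

// Decoding ignores case and maps I and L to 1 and O to 0.
echo crockford_decode( 'i6j' ), "\n";  // 1234
echo crockford_decode( '1O' ), "\n";   // 32
```

Note that base_convert() returns a string, so the decoded value comes back as "1234" rather than as an integer.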
Note that, due to known limitations (or, arguably, bugs) in PHP's base_convert() function, these functions will only return correct results for values that can be accurately represented by PHP's internal numeric type (probably double). We can hope that this will be fixed in some future PHP version, but in the meantime, you could always use this drop-in replacement for base_convert().
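For values beyond that limit, one possible workaround is to do the base conversion with arbitrary-precision arithmetic. The following is a minimal sketch, assuming the BCMath extension is available; it is not the drop-in replacement mentioned above, and the names crockford_encode_big(), crockford_decode_big() and string_to_decimal() are made up for illustration. The last helper shows one way a string could be turned into a number first, by treating its bytes as base-256 digits.

```php
<?php
// Hypothetical helpers, not part of the original answer. Requires BCMath.
const CROCKFORD_ALPHABET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

// Encode a non-negative decimal string of any length.
function crockford_encode_big( $base10 ) {
    $base10 = (string) $base10;
    if ( $base10 === '' || !ctype_digit( $base10 ) ) {
        return null; // only non-negative integers are handled
    }
    $out = '';
    do {
        // Peel off the lowest base-32 digit and prepend its symbol.
        $digit  = (int) bcmod( $base10, '32' );
        $out    = substr( CROCKFORD_ALPHABET, $digit, 1 ) . $out;
        $base10 = bcdiv( $base10, '32', 0 );
    } while ( bccomp( $base10, '0' ) > 0 );
    return $out;
}

// Decode back to a decimal string; returns null on invalid input.
function crockford_decode_big( $base32 ) {
    $base32 = strtr( strtoupper( $base32 ), 'ILO', '110' );
    if ( $base32 === '' ) {
        return null;
    }
    $base10 = '0';
    foreach ( str_split( $base32 ) as $char ) {
        $digit = strpos( CROCKFORD_ALPHABET, $char );
        if ( $digit === false ) {
            return null; // not a Crockford Base32 symbol
        }
        $base10 = bcadd( bcmul( $base10, '32', 0 ), (string) $digit, 0 );
    }
    return $base10;
}

// Interpret the bytes of a string as one big base-256 number.
// Caveat: leading NUL bytes are lost, so this is not a full round-trip format.
function string_to_decimal( $str ) {
    $base10 = '0';
    for ( $i = 0; $i < strlen( $str ); $i++ ) {
        $base10 = bcadd( bcmul( $base10, '256', 0 ), (string) ord( $str[$i] ), 0 );
    }
    return $base10;
}

echo crockford_encode_big( '18446744073709551616' ), "\n"; // G000000000000 (2^64)
echo crockford_decode_big( 'G000000000000' ), "\n";        // 18446744073709551616
echo crockford_encode_big( string_to_decimal( 'Hi' ) ), "\n"; // J39
```

Because every byte occupies exactly one base-256 digit, this avoids the ambiguity that arises when variable-length decimal or octal codes are concatenated.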
Edit: The easiest way to compute the optional check digit is probably simply like this:
function crockford_check( $base10 ) {
    return substr( "0123456789ABCDEFGHJKMNPQRSTVWXYZ*~$=U", $base10 % 37, 1 );
}
or, for large numbers:
function crockford_check( $base10 ) {
    return substr( "0123456789ABCDEFGHJKMNPQRSTVWXYZ*~$=U", bcmod( $base10, 37 ), 1 );
}
We can then use it like this:
function crockford_encode_check( $base10 ) {
    return crockford_encode( $base10 ) . crockford_check( $base10 );
}
function crockford_decode_check( $base32 ) {
    $base10 = crockford_decode( substr( $base32, 0, -1 ) );
    if ( strtoupper( substr( $base32, -1 ) ) != crockford_check( $base10 ) ) {
        return null; // wrong checksum
    }
    return $base10;
}
(demo on codepad.org)
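To illustrate how the check symbol behaves, here is a small self-contained sketch. It repeats the functions from above (with the alphabet in single quotes, which is purely a stylistic choice of this sketch) and uses arbitrarily chosen sample values.

```php
<?php
function crockford_encode( $base10 ) {
    return strtr( base_convert( $base10, 10, 32 ),
                  "abcdefghijklmnopqrstuv",
                  "ABCDEFGHJKMNPQRSTVWXYZ" );
}
function crockford_decode( $base32 ) {
    $base32 = strtr( strtoupper( $base32 ),
                     "ABCDEFGHJKMNPQRSTVWXYZILO",
                     "abcdefghijklmnopqrstuv110" );
    return base_convert( $base32, 32, 10 );
}
function crockford_check( $base10 ) {
    // 37 check symbols: the 32 digits plus * ~ $ = U
    return substr( '0123456789ABCDEFGHJKMNPQRSTVWXYZ*~$=U', $base10 % 37, 1 );
}
function crockford_encode_check( $base10 ) {
    return crockford_encode( $base10 ) . crockford_check( $base10 );
}
function crockford_decode_check( $base32 ) {
    $base10 = crockford_decode( substr( $base32, 0, -1 ) );
    if ( strtoupper( substr( $base32, -1 ) ) != crockford_check( $base10 ) ) {
        return null; // wrong checksum
    }
    return $base10;
}

// 1234 % 37 = 13, and symbol 13 is "D".
echo crockford_encode_check( 1234 ), "\n";     // 16JD
echo crockford_decode_check( '16JD' ), "\n";   // 1234

// 32 % 37 = 32, which selects the first of the five extra symbols.
echo crockford_encode_check( 32 ), "\n";       // 10*

// A mistyped symbol no longer matches the checksum.
var_dump( crockford_decode_check( '16KD' ) );  // NULL
```

The modulus 37 is prime, which is why the check symbol needs five characters beyond the 32 encoding digits.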
Note: (July 18, 2014) The original version of the code above had a bug in the Crockford alphabet strings, such that they read ...WZYZ instead of ...WXYZ, causing some numbers to be encoded and decoded incorrectly. This bug has now been fixed, and the codepad.org versions now include a basic self-test routine to verify this. Thanks to James Firth for spotting the bug and fixing it.