
Python: base64.b64decode() vs .decode?

The Code Furies have turned their baleful glares upon me, and it's fallen to me to implement "Secure Transport" as defined by The Direct Project. Whether or not we internally use DNS rather than LDAP for sharing certificates, I'm obviously going to need to set up the former to test against, and that's what's got me stuck. Apparently, an X509 cert needs some massaging to be used in a CERT record, and I'm trying to work out how that's done.

The clearest thing I've found is a script on Videntity's blog, but not being versed in python, I'm hitting a stumbling block. Specifically, this line crashes:

decoded_clean_pk = clean_pk.decode('base64', strict)

since it doesn't seem to like (or rather, to know) whatever 'strict' is supposed to represent. I'm making the semi-educated guess that the line is supposed to decode the base64 data, but I learned from the Debian OpenSSL debacle some years back that blindly diddling with crypto-related code is a Bad Thing(TM).

So I turn to the illustrious Python wonks on SO to ask whether that line might be replaced by this one (with the appropriate import added):

decoded_clean_pk = base64.b64decode(clean_pk)

The script runs after that change, and produces correct-looking output, but I've got enough instinct to know that I can't necessarily trust my instincts here. :)

asked Nov 21 '13 14:11

GeminiDomino


1 Answer

That line would have worked if you had called it like this:

decoded_clean_pk = clean_pk.decode('base64', 'strict')

Notice that strict has to be a string. Otherwise the Python interpreter looks for a variable named strict and raises a NameError if it doesn't find one. In general the errors argument accepts values such as 'strict', 'ignore', and 'replace', although the base64 codec only supports 'strict' (see the source further down).
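To make the quoting point concrete, here is a minimal sketch. It assumes Python 3, where str.decode('base64') no longer exists, so it goes through codecs.decode() instead of the method call from the question; the sample value and the helper name are made up for illustration.

```python
import codecs

# Hypothetical sample input: base64 for b"hello world".
encoded = b"aGVsbG8gd29ybGQ="


def decode_with_unquoted_name():
    # `strict` is a bare name here, not a string. No such variable exists,
    # so Python raises NameError before the codec is ever called.
    return codecs.decode(encoded, "base64", strict)  # noqa: F821


try:
    decode_with_unquoted_name()
    outcome = "no error"
except NameError:
    outcome = "NameError"

# Quoting the value passes it as the `errors` argument, which works.
decoded = codecs.decode(encoded, "base64", "strict")

print(outcome)
print(decoded)
```

The failure has nothing to do with base64 itself: the name lookup fails before any decoding starts.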

Take a look at this code:

>>> import base64
>>> b = base64.b64encode('hello world')
>>> b.decode('base64')
'hello world'

>>> base64.b64decode(b)
'hello world'

Both decode and b64decode work the same when .decode is passed the 'base64' argument string (this is Python 2; in Python 3 the .decode('base64') form is gone).

The difference is that str.decode takes a byte string and decodes it according to the encoding you pass as the first argument; for text encodings such as 'utf-8' that yields a Unicode string. In this case you're telling it to treat the input as base64, so it decodes it correctly.

To answer your question: both work the same, although b64decode/b64encode are meant to work only with base64, while str.decode can handle as many encodings as the library is aware of.
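If you want to check the equivalence yourself, something like the following should do. This is only a sketch assuming Python 3, so the codec route is spelled codecs.decode(data, 'base64') rather than data.decode('base64'); the line-wrapped sample stands in for a PEM-style body and is not taken from the script in the question.

```python
import base64
import codecs

encoded = base64.b64encode(b"hello world")

# Route 1: the dedicated base64 module.
via_module = base64.b64decode(encoded)

# Route 2: the generic codec machinery with the 'base64' codec.
via_codec = codecs.decode(encoded, "base64")

# Certificate bodies are usually wrapped across lines; with default
# settings both routes skip the embedded newline.
wrapped = encoded[:8] + b"\n" + encoded[8:]
via_module_wrapped = base64.b64decode(wrapped)
via_codec_wrapped = codecs.decode(wrapped, "base64")

print(via_module == via_codec)
print(via_module_wrapped == via_codec_wrapped)
```

Since the results match, the replacement line from the question should be a safe swap for the original for well-formed input like this.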

For further information, read the documentation for both: decode and b64decode.

UPDATE: Actually, and this is the most important example I guess :) take a look at the source code of encodings/base64_codec.py, which is what decode() uses:

def base64_decode(input,errors='strict'):

    """ Decodes the object input and returns a tuple (output
        object, length consumed).

        input must be an object which provides the bf_getreadbuf
        buffer slot. Python strings, buffer objects and memory
        mapped files are examples of objects providing this slot.

        errors defines the error handling to apply. It defaults to
        'strict' handling which is the only currently supported
        error handling for this codec.

    """
    assert errors == 'strict'
    output = base64.decodestring(input)
    return (output, len(input))

As you can see, it actually uses the base64 module to do the work :)
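You can also poke at that codec function directly. A small sketch, assuming Python 3, where the module is still encodings/base64_codec.py and the function still returns an (output, length consumed) tuple; the sample input is made up.

```python
import base64
from encodings import base64_codec

encoded = base64.b64encode(b"hello world")

# Call the codec's decode function directly, skipping the codecs lookup.
output, consumed = base64_codec.base64_decode(encoded)

# The same bytes come back from the base64 module itself.
direct = base64.b64decode(encoded)

print(output, consumed)
print(output == direct)
```

The second element of the tuple is just the length of the input that was consumed, which matches the len(input) in the source above.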

Hope this clarifies your question in some way.

answered Sep 24 '22 03:09

Paulo Bu