
Encoding and decoding in Python with MD5( )

I am running this code on Ubuntu 10.10 with Python 3.1.1 and getting the following error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xd3 in position 0: invalid continuation byte

The byte and position reported in the error change from run to run of the following code (these are not the real key or secret):

import hashlib
import time

sandboxAPIKey = "wed23hf5yxkbmvr9jsw323lkv5g"
sandboxSharedSecret = "98HsIjh39z"

def buildAuthParams():
    authHash = hashlib.md5()

    #encoding because the update on md5() needs a binary rep of the string
    temp = str.encode(sandboxAPIKey + sandboxSharedSecret + repr(int(time.time())))
    print(temp)

    authHash.update(temp)

    #look at the string representation of the binary digest
    print(authHash.digest())

    #now I want to look at the string representation of the digest
    print(bytes.decode(authHash.digest()))

Here is the output of one run (with the signature and key information changed from the real output):

b'sdwe5yxkwewvr9j343434385gkbH4343h4343dz129443643474'
b'\x945EM3\xf5\xa6\xf6\x92\xd1\r\xa5K\xa3IO'

print(bytes.decode(authHash.digest()))
UnicodeDecodeError: 'utf8' codec can't decode byte 0x94 in position 0: invalid start byte

I assume I am doing something wrong in my call to decode, but I cannot figure out what it is. The printed output of authHash.digest() looks valid to me.

I would really appreciate any ideas on how to get this to work.

TheSteve0 asked May 08 '26 02:05



1 Answer

When you decode a bytes object into a string, Python reads the bytes sequentially and tries to match them to valid characters of an encoding (UTF-8 by default). The exception is raised because an MD5 digest is 16 arbitrary bytes, not encoded text, so it usually contains byte sequences that do not correspond to any valid UTF-8 character.

The same happens if you try to decode it as ASCII: any byte value greater than 127 is not a valid ASCII character.
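
As a small illustration, the digest bytes printed in the question can be fed to both codecs. This is only a sketch; the names `digest` and `failures` are made up for the example.

```python
# Digest bytes copied from the question's output: 16 raw bytes, not encoded text.
digest = b'\x945EM3\xf5\xa6\xf6\x92\xd1\r\xa5K\xa3IO'

failures = {}
for codec in ("utf-8", "ascii"):
    try:
        digest.decode(codec)
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte the codec could not decode
        failures[codec] = exc.start

# Both codecs reject the very first byte, 0x94
print(failures)
```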

So, if you want a printable version of the MD5 hash, use its hexadecimal digest. This is the standard way of printing any kind of hash: each byte is represented by two hexadecimal digits.

To do this, use:

authHash.hexdigest()
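
A minimal sketch of how the hex form relates to the raw digest. The input bytes and the names `auth_hash` and `hex_sig` are placeholders, not values from the question.

```python
import hashlib

# Placeholder input; in the question this would be key + secret + timestamp, encoded to bytes.
auth_hash = hashlib.md5(b"example-key" + b"example-secret")

# hexdigest() returns a str of 32 hexadecimal characters (2 per digest byte)
hex_sig = auth_hash.hexdigest()
print(hex_sig)

# The hex string is just another representation of the same 16 raw bytes
assert bytes.fromhex(hex_sig) == auth_hash.digest()
```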

If you need to use it in a URL, you probably need to encode the digest bytes as base64:

base64.b64encode(authHash.digest())
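
A short sketch of the base64 route, again with placeholder input and made-up variable names. Note that `b64encode` returns bytes, and that the standard base64 alphabet includes `+` and `/`, which have special meaning in URLs; `base64.urlsafe_b64encode` may be the safer choice, depending on what the receiving API expects (the question does not say).

```python
import base64
import hashlib

# Placeholder input standing in for the real key, secret, and timestamp
digest = hashlib.md5(b"example-key" + b"example-secret").digest()

# b64encode returns bytes; the result is pure ASCII, so decoding it is safe
b64_sig = base64.b64encode(digest).decode("ascii")

# URL-safe variant: uses '-' and '_' in place of '+' and '/'
url_sig = base64.urlsafe_b64encode(digest).decode("ascii")

print(b64_sig)
print(url_sig)
```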
Santiago Alessandri answered May 09 '26 15:05
