Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python converting from base64 to binary

I have a problem about converting a base64 encoded string into binary. I am collecting the Fingerprint2D in the following link,

url = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/108770/property/Fingerprint2D/xml"

Fingerprint2D=AAADccB6OAAAAAAAAAAAAAAAAAAAAAAAAAA8WIEAAAAAAACxAAAAHgAACAAADAzBmAQwzoMABgCI AiTSSACCCAAhIAAAiAEMTMgMJibMsZuGeijn4BnI+YeQ0OMOKAACAgAKAABQAAQEABQAAAAAAAAA AA==

The descriptiong in the Pubchem says that this is 115 byte string, and it should be 920 bits when converted into binary. I try to convert it to the binary with the following,

    response = requests.get(url)
    tree = ET.fromstring(response.text)

    for el in tree[0]:
        if "Fingerprint2D" in el.tag:
            fpp = bin(int(el.text, 16))
            print(len(fpp))

If I use the code above, I'm getting the following error, "Value error: invalid literal for int() with base16:

And if I use the code below, length of fpp (binary) is equal to 1278 which is not what I expected.

    response = requests.get(url)
    tree = ET.fromstring(response.text)

    for el in tree[0]:
        if "Fingerprint2D" in el.tag:
            fpp = bin(int(hexlify(el.text), 16))
            print(len(fpp))

Thanks a lot already!!

like image 750
patti_jane Avatar asked Apr 04 '17 12:04

patti_jane


People also ask

How do I decode base64 in Python?

To decode an image using Python, we simply use the base64. b64decode(s) function. Python mentions the following regarding this function: Decode the Base64 encoded bytes-like object or ASCII string s and return the decoded bytes.

What is b64encode Python?

b64encode (s, altchars=None) Encode the bytes-like object s using Base64 and return the encoded bytes . Optional altchars must be a bytes-like object of at least length 2 (additional characters are ignored) which specifies an alternative alphabet for the + and / characters.

Can base64 be decoded?

Base64 Decode is very unique tool to decode base64 data to plain text. This tool saves your time and helps to decode base64 data. This tool allows loading the Base64 data URL, which loads base64 encoded text and decodes to human readable text.


1 Answers

To decode base64 format you need to pass a bytes object to the base64.decodebytes function:

import base64

t = "AAADccB6OAAAAAAAAAAAAAAAAAAAAAAAAAA8WIEAAAAAAACxAAAAHgAACAAADAzBmAQwzoMABgCI AiTSSACCCAAhIAAAiAEMTMgMJibMsZuGeijn4BnI+YeQ0OMOKAACAgAKAABQAAQEABQAAAAAAAAA AA==".encode("ascii")

decoded = base64.decodebytes(t)

print(decoded)
print(len(decoded)*8)

I get the following:

b'\x00\x00\x03q\xc0z8\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00<X\x81\x00\x00\x00\x00\x00\x00\xb1\x00\x00\x00\x1e\x00\x00\x08\x00\x00\x0c\x0c\xc1\x98\x040\xce\x83\x00\x06\x00\x88\x02$\xd2H\x00\x82\x08\x00! \x00\x00\x88\x01\x0cL\xc8\x0c&&\xcc\xb1\x9b\x86z(\xe7\xe0\x19\xc8\xf9\x87\x90\xd0\xe3\x0e(\x00\x02\x02\x00\n\x00\x00P\x00\x04\x04\x00\x14\x00\x00\x00\x00\x00\x00\x00\x00'
920

So 920 bits as expected.

To get data as binary just iterate on the bytes and convert to binary using format and zero-padding to 8 digits (bin adds a 0b header so it's not suitable), and join the strings together:

print("".join(["{:08b}".format(x) for x in decoded]))

results in:

00000000000000000000001101110001110000000111101000111000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011110001011000100000010000000000000000000000000000000000000000000000001011000100000000000000000000000000011110000000000000000000001000000000000000000000001100000011001100000110011000000001000011000011001110100000110000000000000110000000001000100000000010001001001101001001001000000000001000001000001000000000000010000100100000000000000000000010001000000000010000110001001100110010000000110000100110001001101100110010110001100110111000011001111010001010001110011111100000000110011100100011111001100001111001000011010000111000110000111000101000000000000000001000000010000000000000101000000000000000000101000000000000000001000000010000000000000101000000000000000000000000000000000000000000000000000000000000000000

(which is 920 chars, as expected)

like image 137
Jean-François Fabre Avatar answered Sep 29 '22 13:09

Jean-François Fabre