Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to encode a long in Base64 in Python?

Tags:

python

base64

In Java, I can encode a BigInteger as:

java.math.BigInteger bi = new java.math.BigInteger("65537L");
String encoded = Base64.encodeBytes(bi.toByteArray(), Base64.ENCODE|Base64.DONT_GUNZIP);

// result: 65537L encodes as "AQAB" in Base64

byte[] decoded = Base64.decode(encoded, Base64.DECODE|Base64.DONT_GUNZIP);
java.math.BigInteger back = new java.math.BigInteger(decoded);

In C#:

System.Numerics.BigInteger bi = new System.Numerics.BigInteger("65537L");
string encoded = Convert.ToBase64(bi);
byte[] decoded = Convert.FromBase64String(encoded);
System.Numerics.BigInteger back = new System.Numerics.BigInteger(decoded);

How can I encode long integers in Python as Base64-encoded strings? What I've tried so far produces results different from implementations in other languages (so far I've tried in Java and C#), particularly it produces longer-length Base64-encoded strings.

import struct
encoded = struct.pack('I', (1<<16)+1).encode('base64')[:-1]
# produces a longer string, 'AQABAA==' instead of the expected 'AQAB'

When using this Python code to produce a Base64-encoded string, the resulting decoded integer in Java (for example) produces instead 16777472 in place of the expected 65537. Firstly, what am I missing?

Secondly, I have to figure out by hand what is the length format to use in struct.pack; and if I'm trying to encode a long number (greater than (1<<64)-1) the 'Q' format specification is too short to hold the representation. Does that mean that I have to do the representation by hand, or is there an undocumented format specifier for the struct.pack function? (I'm not compelled to use struct, but at first glance it seemed to do what I needed.)

like image 411
jbatista Avatar asked Feb 08 '13 01:02

jbatista


3 Answers

Check out this page on converting integer to base64.

import base64
import struct

def encode(n):
    data = struct.pack('<Q', n).rstrip('\x00')
    if len(data)==0:
        data = '\x00'
    s = base64.urlsafe_b64encode(data).rstrip('=')
    return s

def decode(s):
    data = base64.urlsafe_b64decode(s + '==')
    n = struct.unpack('<Q', data + '\x00'* (8-len(data)) )
    return n[0]
like image 190
msc Avatar answered Oct 22 '22 10:10

msc


The struct module:

… performs conversions between Python values and C structs represented as Python strings.

Because C doesn't have infinite-length integers, there's no functionality for packing them.

But it's very easy to write yourself. For example:

def pack_bigint(i):
    b = bytearray()
    while i:
        b.append(i & 0xFF)
        i >>= 8
    return b

Or:

def pack_bigint(i):
    bl = (i.bit_length() + 7) // 8
    fmt = '<{}B'.format(bl)
    # ...

And so on.

And of course you'll want an unpack function, like jbatista's from the comments:

def unpack_bigint(b):
    b = bytearray(b) # in case you're passing in a bytes/str
    return sum((1 << (bi*8)) * bb for (bi, bb) in enumerate(b))
like image 26
abarnert Avatar answered Oct 22 '22 10:10

abarnert


This is a bit late, but I figured I'd throw my hat in the ring:

def inttob64(n):                                                              
    """                                                                       
    Given an integer returns the base64 encoded version of it (no trailing ==)
    """
    parts = []                                                                
    while n:                                                                  
        parts.insert(0,n & limit)                                             
        n >>= 32                                                              
    data = struct.pack('>' + 'L'*len(parts),*parts)                           
    s = base64.urlsafe_b64encode(data).rstrip('=')                            
    return s                                                                  

def b64toint(s):                                                              
    """                                                                       
    Given a string with a base64 encoded value, return the integer representation
    of it                                                                     
    """                                                                       
    data = base64.urlsafe_b64decode(s + '==')                                 
    n = 0                                                                     
    while data:                                                               
        n <<= 32                                                              
        (toor,) = struct.unpack('>L',data[:4])                                
        n |= toor & 0xffffffff                                                
        data = data[4:]                                                       
    return n

These functions turn an arbitrary-sized long number to/from a big-endian base64 representation.

like image 1
Mediocre Gopher Avatar answered Oct 22 '22 10:10

Mediocre Gopher