Base64 Incorrect padding error using Python

Tags: python, hex, base64

I am trying to decode about 200 Base64 values into hex, and I am getting the following error. It decodes 60 of them and then stops.

ABHvPdSaxrhjAWA=
0011ef3dd49ac6b8630160
ABHPdSaxrhjAWA=
Traceback (most recent call last):
  File "tt.py", line 36, in <module>
    csvlines[0] = csvlines[0].decode("base64").encode("hex")
  File "C:\Python27\lib\encodings\base64_codec.py", line 43, in base64_decode
    output = base64.decodestring(input)
  File "C:\Python27\lib\base64.py", line 325, in decodestring
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding

Some of the original Base64 source from the CSV:

ABHPdSaxrhjAWA=
ABDPdSaxrhjAWA=
ABDPdSaxrhjAWA=
ABDPdSaxrhjAWA=
ABDPdSaxrhjAWA=
ABDPdSaxrhjAWA=
ABDPdS4xriiAVQ=
ABDPdSqxrizAU4=
ABDPdSrxrjPAUo=

asked Nov 21 '16 20:11

James

1 Answer

You have at least one string in your CSV file that is either not a Base64 string, is a corrupted (damaged) Base64 string, or is a string that is missing the required = padding. Your example value, ABHPdSaxrhjAWA=, is short one = or is missing another data character.

Base64 strings, properly padded, have a length that is a multiple of 4, so you can easily re-add the padding:

value = csvlines[0]
if len(value) % 4:
    # not a multiple of 4, add padding:
    value += '=' * (4 - len(value) % 4) 
csvlines[0] = value.decode("base64").encode("hex")

If the value still fails to decode after that, your input was corrupted or was not valid Base64 to begin with.
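
The snippet above relies on the Python 2 "base64" and "hex" string codecs, which str objects in Python 3 do not offer. As a minimal sketch of the same padding idea for Python 3: the helper name b64_to_hex is hypothetical, and passing validate=True (so that characters outside the Base64 alphabet raise an error instead of being skipped) is an added assumption, not part of the original answer.

```python
import base64
import binascii


def b64_to_hex(value):
    """Decode a Base64 string to hex, re-adding any missing '=' padding.

    Returns None when the value cannot be decoded even after padding.
    """
    value = value.strip()
    # -len(value) % 4 is 0 when the length is already a multiple of 4,
    # otherwise it is the number of '=' characters needed to reach one.
    value += "=" * (-len(value) % 4)
    try:
        return base64.b64decode(value, validate=True).hex()
    except binascii.Error:
        # Still undecodable: corrupted input or not Base64 at all.
        return None


print(b64_to_hex("ABHvPdSaxrhjAWA="))  # 0011ef3dd49ac6b8630160
print(b64_to_hex("ABHPdSaxrhjAWA="))   # 0011cf7526b1ae18c058
```

Returning None rather than raising lets a loop over all CSV rows continue and collect the failures for later inspection; whether that is preferable to stopping at the first bad row depends on your data.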

For the example error, ABHPdSaxrhjAWA=, the above adds one = to make it decodable:

>>> value = 'ABHPdSaxrhjAWA='
>>> if len(value) % 4:
...     # not a multiple of 4, add padding:
...     value += '=' * (4 - len(value) % 4)
...
>>> value
'ABHPdSaxrhjAWA=='
>>> value.decode('base64')
'\x00\x11\xcfu&\xb1\xae\x18\xc0X'
>>> value.decode('base64').encode('hex')
'0011cf7526b1ae18c058'

I need to emphasise that your data may simply be corrupted. Your console output includes one value that worked, and one that failed. The one that worked is one character longer, and that's the only difference:

ABHvPdSaxrhjAWA=
ABHPdSaxrhjAWA=

Note the v in the 4th position; it is missing from the second value. This could indicate that something happened to your CSV data that caused that character to be dropped. In that case, adding padding would make the second value decodable again, but the result would be wrong. We can't tell you which of the two (missing padding or a dropped character) is the cause here.
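
One way to look into this is to list every value that needed padding along with its decoded size, so that suspicious rows can be checked against the original data rather than silently repaired. This is only a sketch for Python 3: the helper names decoded_length and report_padded_values are hypothetical, and it assumes every value decodes once padded.

```python
import base64


def decoded_length(value):
    """Return the number of bytes a value decodes to after padding."""
    padded = value + "=" * (-len(value) % 4)
    return len(base64.b64decode(padded))


def report_padded_values(values):
    """Return (index, value, byte_count) for values that needed padding."""
    return [
        (index, value, decoded_length(value))
        for index, value in enumerate(values)
        if len(value) % 4
    ]


rows = ["ABHvPdSaxrhjAWA=", "ABHPdSaxrhjAWA="]
print(decoded_length(rows[0]))  # 11
for index, value, size in report_padded_values(rows):
    print(index, value, size)   # 1 ABHPdSaxrhjAWA= 10
```

Here the value that decoded without help yields 11 bytes, while the padded one yields only 10. If your records are supposed to have a fixed size, a difference like this points to a dropped character rather than missing padding.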

answered Oct 02 '22 13:10

Martijn Pieters
