Why does base64.b64encode() return a bytes object?

Tags: python, python-3.x, encoding, unicode, base64

The purpose of base64.b64encode() is to convert binary data into ASCII-safe "text". However, the method returns an object of type bytes:

>>> import base64
>>> base64.b64encode(b'abc')
b'YWJj'

It's easy to take that output and decode() it, but my question is: what is the significance of base64.b64encode() returning bytes rather than a str?

788

asked Mar 13 '17 21:03

gardarh

2 Answers

"The purpose of the base64.b64encode() function is to convert binary data into ASCII-safe 'text'"

Python disagrees with that - base64 has been intentionally classified as a binary transform.

It was a design decision in Python 3 to force the separation of bytes and text and prohibit implicit transformations. Python is now so strict about this that bytes.encode doesn't even exist, and so b'abc'.encode('base64') would raise an AttributeError.
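As a minimal sketch of that strictness (assuming Python 3.4 or later, where the "base64" codec alias is available through the codecs module), the snippet below shows that bytes has no encode method, while the bytes-to-bytes transform is still reachable via codecs.encode():

```python
import base64
import codecs

data = b"abc"

# bytes objects have no .encode() method in Python 3.
has_encode = hasattr(data, "encode")

error_name = None
try:
    data.encode("base64")
except AttributeError as exc:
    error_name = type(exc).__name__

# The binary transform is still available, but only as bytes -> bytes.
# The codec version appends a trailing newline; b64encode() does not.
via_codecs = codecs.encode(data, "base64")
via_base64 = base64.b64encode(data)

print(has_encode, error_name, via_codecs, via_base64)
```

Both routes return bytes, which is consistent with base64 being treated as a binary transform rather than a text encoding.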

The position the language takes is that a bytes object is already encoded. A codec that encodes bytes into text does not fit this paradigm, because going from the bytes domain to the text domain is a decode. Note that rot13 was also dropped from the codecs accepted by str.encode() for a related reason: as a text-to-text transform, it didn't fit properly into the Python 3 paradigm either.

There is also a performance argument to be made. Suppose Python automatically decoded the base64 output (an ASCII-encoded binary representation produced by C code in the binascii module) into a Python object in the text domain. If you actually wanted the bytes, you would have to undo that decoding by encoding to ASCII again. It would be a wasteful round trip, an unnecessary double negation. Better to opt in to the decode-to-text step.
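The round trip described above can be sketched as follows. Note that b64encode_text is a hypothetical helper written only for illustration; it is not part of the standard library.

```python
import base64


def b64encode_text(data: bytes) -> str:
    """Hypothetical str-returning variant (an assumption, not a stdlib API)."""
    return base64.b64encode(data).decode("ascii")


payload = b"abc"

# What the standard library does today: bytes in, bytes out.
encoded_bytes = base64.b64encode(payload)

# With a str-returning API, a caller who needs bytes has to encode again,
# undoing the decode that was just performed inside the helper.
round_tripped = b64encode_text(payload).encode("ascii")

print(encoded_bytes == round_tripped)
```

The extra decode and encode produce exactly the bytes that b64encode() had already returned, which is the wasted work the argument refers to.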

answered Oct 13 '22 01:10

wim

It's impossible for b64encode() to know what you want to do with its output.

While in many cases you may want to treat the encoded value as text, in many others – for example, sending it over a network – you may instead want to treat it as bytes.

Since b64encode() can't know, it refuses to guess. And since the input is bytes, the output remains the same type, rather than being implicitly coerced to str.

As you point out, decoding the output to str is straightforward:

base64.b64encode(b'abc').decode('ascii')

... as well as being explicit about the result.
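To illustrate the two kinds of consumer, here is a small sketch. The io.BytesIO object merely stands in for a socket or binary file, and the "DATA" framing and the "blob" key are made up for this example.

```python
import base64
import io
import json

raw = b"\x00\x01\xfe\xff"
encoded = base64.b64encode(raw)  # bytes

# Bytes consumer: write straight to a binary stream, no conversion needed.
wire = io.BytesIO()  # stand-in for a socket or a file opened in 'wb' mode
wire.write(b"DATA " + encoded + b"\r\n")

# Text consumer: JSON strings must be str, so decode explicitly.
document = json.dumps({"blob": encoded.decode("ascii")})

# Either way, the original bytes can be recovered.
recovered = base64.b64decode(json.loads(document)["blob"])
print(recovered == raw)
```

In the first case a str result would have needed re-encoding before the write; in the second, the explicit decode('ascii') documents the intent at the call site.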

As an aside, it's worth noting that although base64.b64decode() (note: decode, not encode) has accepted str since version 3.3, the change was somewhat controversial.
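A quick way to see that asymmetry, assuming Python 3.3 or later:

```python
import base64

# b64decode() accepts both bytes and ASCII-only str input.
from_bytes = base64.b64decode(b"YWJj")
from_str = base64.b64decode("YWJj")

# b64encode() has no such leniency: str input is rejected.
encode_error = None
try:
    base64.b64encode("abc")
except TypeError as exc:
    encode_error = type(exc).__name__

print(from_bytes, from_str, encode_error)
```

Both decode calls return the same bytes object, while encoding still requires a bytes-like input.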

answered Oct 13 '22 01:10

Zero Piraeus

