How does Python convert bytes into float?

I have the following code snippet:

#!/usr/bin/env python3

print(float(b'5'))

Which prints 5.0 with no error (on Linux with utf-8 encoding). I'm very surprised that it doesn't give an error since Python is not supposed to know what encoding is used for the bytes object.

Any insight?

525

asked May 18 '18 10:05

static_rtti

1 Answer

When passed a bytes object, float() treats the contents of the object as ASCII bytes. That's sufficient here, because the conversion from string to float only accepts ASCII digits and letters, plus a sign, . and _ anyway. For str input, the only non-ASCII codepoints permitted are whitespace and Unicode decimal digits, and those are mapped to ASCII before parsing. This is analogous to the way int() treats bytes input.
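A minimal sketch of that behaviour, assuming CPython 3.6 or later (needed for the underscore separator). The variable names are only illustrative. ASCII-only bytes parse just like the equivalent str, while the UTF-8 bytes of a non-ASCII digit are never decoded and so are rejected:

```python
# ASCII-only bytes parse exactly like the equivalent str would.
plain = float(b'5')
decorated = float(b' -1_000.5 ')   # surrounding ASCII whitespace is stripped
as_int = int(b'42')                # int() treats bytes the same way

# UTF-8 bytes of a fullwidth digit five: no decoding happens, so this fails.
fullwidth_five = '\uff15'.encode('utf-8')
try:
    float(fullwidth_five)
    rejected = False
except ValueError:
    rejected = True

print(plain, decorated, as_int, rejected)
```

So no encoding has to be guessed: anything outside ASCII in a bytes object simply fails to parse.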

Under the hood, the implementation does this:

Because the input is not a str object, PyNumber_Float() is called on the object (for str objects the code jumps straight to PyFloat_FromString()).
PyNumber_Float() checks for a __float__ method; if that's not available, it calls PyFloat_FromString().
PyFloat_FromString() accepts not only str objects, but any object implementing the buffer protocol. The String name is a Python 2 holdover; the Python 3 str type is called Unicode in the C implementation.
bytes objects implement the buffer protocol, and the PyBytes_AS_STRING macro is used to access the internal C buffer holding the bytes.
A combination of two internal functions named _Py_string_to_number_with_underscores() and float_from_string_inner() is then used to parse ASCII bytes into a floating point value.
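The lookup order described in these steps can be observed from Python itself. This is a rough sketch rather than a view into the C code; TaggedBytes is a hypothetical class invented for the illustration, and the behaviour shown is what I would expect from CPython:

```python
class TaggedBytes(bytes):
    """Hypothetical bytes subclass, used only for this illustration."""

    def __float__(self):
        # If __float__ exists, it is used and the byte contents are ignored.
        return 42.0


from_float_method = float(TaggedBytes(b'5'))

# Without __float__, the contents are parsed as ASCII text. This works for
# bytes and for other objects that expose their data via the buffer protocol.
from_bytes = float(b'2.5')
from_bytearray = float(bytearray(b'2.5'))
from_memoryview = float(memoryview(b'2.5'))

print(from_float_method, from_bytes, from_bytearray, from_memoryview)
```

The subclass returning 42.0 for b'5' shows that __float__ is consulted before the contents are parsed.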

For actual str strings, the CPython implementation first converts any non-ASCII string into a sequence of ASCII bytes: ASCII codepoints are kept as they are, Unicode decimal digits are mapped to their ASCII equivalents, and any non-ASCII whitespace character is converted to an ASCII 0x20 space. Any other non-ASCII codepoint makes the conversion fail. The result is then parsed with the same _Py_string_to_number_with_underscores() / float_from_string_inner() combo.
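A short sketch of the difference between str and bytes input, assuming CPython. The str path normalises non-ASCII whitespace (and, as far as I know, Unicode decimal digits as well) before parsing, whereas the same text encoded to UTF-8 bytes gets no such treatment:

```python
em_space = '\u2003'  # EM SPACE, a non-ASCII whitespace character

# str input: the non-ASCII whitespace is treated like an ordinary space.
padded = float(em_space + '1.5' + em_space)

# str input: Arabic-Indic digits three, one, four around an ASCII dot.
arabic = float('\u0663.\u0661\u0664')

# bytes input: the same padded text as UTF-8 is not normalised and fails.
try:
    float((em_space + '1.5').encode('utf-8'))
    bytes_rejected = False
except ValueError:
    bytes_rejected = True

print(padded, arabic, bytes_rejected)
```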

I see this as a bug in the documentation and have filed an issue with the Python project to have it updated.

119

answered Nov 10 '22 01:11

Martijn Pieters

Tags:

python

python-3.x

character-encoding
