How exactly is a Decimal object encoded in python?

Tags:

decimal

I'm currently writing code using decimal.Decimal in Python (v3.8.5).

I was wondering if anyone knows how the Decimal object is actually encoded.

I can't understand why the memory size stays the same even when I change getcontext().prec, which should change the number of coefficient digits and the exponent of the decimal floating-point value, as follows:

from decimal import Decimal, getcontext
from sys import getsizeof

## coefficient digits = 3
getcontext().prec = 3

temp = Decimal('1')/Decimal('3')

print(temp.as_tuple())  # DecimalTuple(sign=0, digits=(3, 3, 3), exponent=-3)
print(getsizeof(temp))  # 104

## coefficient digits = 30
getcontext().prec = 30

temp = Decimal('1')/Decimal('3')

print(temp.as_tuple())  # DecimalTuple(sign=0, digits=(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3), exponent=-30)
print(getsizeof(temp))  # 104

To understand this behavior, I read the source code of the Decimal class and the documents it references:

https://github.com/python/cpython/blob/main/Lib/_pydecimal.py
http://speleotrove.com/decimal/decarith.html
http://speleotrove.com/decimal/decbits.pdf

According to these documents, Python's Decimal is implemented on the basis of IEEE 754-2008, and the decimal digits of the coefficient continuation are converted into binary using DPD (densely packed decimal) encoding.

Therefore, following the DPD algorithm, we can calculate how many bits the coefficient continuation occupies once its decimal digits are encoded in binary.

And since the sign, exponent continuation, and combination field are simply expressed in binary, their encoded sizes are easy to calculate.

So, the number of bits in an encoded Decimal object can be calculated with the following formula: bits = (sign) + (exp) + (comb) + (compressed coeff)

Here, the sign and combination fields are fixed at 1 bit and 5 bits, respectively (according to the definition in IEEE 754-2008; see https://en.wikipedia.org/wiki/Decimal_floating_point).
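As a rough sketch of that formula, the field widths of the fixed-size IEEE 754-2008 decimal interchange formats (decimal32, decimal64, decimal128) can be tabulated as below. The helper name interchange_field_widths is made up for illustration, and this describes the interchange formats only; it makes no claim about how Python lays out a Decimal in memory.

```python
def interchange_field_widths(k):
    """Field widths (in bits) of a k-bit IEEE 754-2008 decimal interchange format.

    Hypothetical helper for illustration; k must be a multiple of 32.
    """
    if k <= 0 or k % 32 != 0:
        raise ValueError("k must be a positive multiple of 32")
    digits = 9 * k // 32 - 2          # precision in decimal digits
    return {
        "sign": 1,
        "combination": 5,
        "exponent_continuation": k // 16 + 4,
        # DPD packs 3 decimal digits into 10 bits; the leading digit
        # lives in the combination field, hence (digits - 1).
        "coefficient_continuation": 10 * (digits - 1) // 3,
        "precision_digits": digits,
    }


for k in (32, 64, 128):
    w = interchange_field_widths(k)
    bits = (w["sign"] + w["exponent_continuation"]
            + w["combination"] + w["coefficient_continuation"])
    print(k, w, "total bits:", bits)
```

For each of the three formats the four fields add up to exactly k bits, which is why these formats have a fixed size regardless of the value stored.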

So, I wrote the above code to check the list of {sign, exponent, coefficient} using as_tuple() of the Decimal object, and calculate the actual number of bits in memory.

However, as mentioned above, the memory size of the Decimal object did not change at all, even though the number of digits in the coefficient should have changed. (I understand that a Decimal object is not only a decimal encoding but also a list and other objects.)

The following two questions arise.

(1) Am I wrong in my understanding of the encoding algorithm of the Decimal object in Python? (Does Python 3.8.5 use a more efficient encoding algorithm than IEEE 754-2008?)

(2) Assuming that my understanding of the algorithm is correct, why does the memory size of the Decimal object remain the same even though the coefficient has changed? (According to the definition in IEEE 754-2008, when the coefficient continuation changes, the exponent continuation also changes, so the total number of bits should change.)

I myself am a student who usually studies in the field of mechanical engineering, and I am a complete beginner in informatics. If there is any part of my original understanding that is wrong or if there is any strange logical development, please let me know.

I appreciate your help.

351

asked Aug 05 '21 04:08

TMo

1 Answer

For sys.getsizeof:

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.
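A minimal sketch of what that sentence means, using plain lists rather than Decimal (the variable names are just for illustration): the container reports the same size no matter how large the object it refers to is.

```python
from sys import getsizeof

short = ["x"]             # list holding a reference to a 1-character string
long_ = ["x" * 10_000]    # list holding a reference to a 10,000-character string

# The lists themselves each hold one reference, so they report the same size.
print(getsizeof(short) == getsizeof(long_))

# The referred-to strings are where the sizes actually differ.
print(getsizeof(short[0]) < getsizeof(long_[0]))
```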

Since Decimal is a Python class with references to several other objects (EDIT: see below), you just get the total size of the references, which is constant; the sizes of the referred values, which vary, are not included.

import sys
from _pydecimal import Decimal, getcontext  # pure-Python Decimal; the C version has no _int

getcontext().prec = 3
temp = Decimal(1) / Decimal(3)
print(sys.getsizeof(temp))
print(sys.getsizeof(temp._int))

getcontext().prec = 300
temp = Decimal(1) / Decimal(3)
print(sys.getsizeof(temp))         # same
print(sys.getsizeof(temp._int))    # not same

(Note that the _int slot I used in the example is an internal implementation detail of CPython's Decimal, as hinted by the leading underscore; this code is not guaranteed to work in other Python implementations, or even in other versions of CPython.)

EDIT: Oops, my first answer was on an old Python, where Decimal is implemented in Python. The version you asked about has it implemented in C.

The C version actually stores everything inside the object itself, but your difference in precision was not sufficient to detect the difference (as memory is allocated in discrete chunks). Try it with getcontext().prec = 300 instead.
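A quick way to try this, as a sketch that assumes CPython with the C-accelerated decimal module (with the pure-Python fallback the reported size would stay constant); size_at_precision is a made-up helper name:

```python
from decimal import Decimal, localcontext
from sys import getsizeof


def size_at_precision(prec):
    """Return getsizeof() of 1/3 computed with the given context precision."""
    with localcontext() as ctx:   # temporary context, leaves the global one untouched
        ctx.prec = prec
        return getsizeof(Decimal(1) / Decimal(3))


sizes = {prec: size_at_precision(prec) for prec in (3, 30, 300, 3000)}
for prec, size in sizes.items():
    print(f"prec={prec:5d}  getsizeof={size}")
```

The exact numbers depend on the platform and Python version, but the sizes for the small precisions should match each other, and the larger precisions should report more bytes.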

116

answered Oct 28 '22 08:10

Amadan
