 

How exactly is a Decimal object encoded in python?

Tags:

python

decimal

I'm currently writing code using decimal.Decimal in Python (v3.8.5).

I was wondering if anyone knows how the Decimal object is actually encoded.

I can't understand why the memory size stays the same even when I change getcontext().prec, which amounts to changing the coefficient and exponent of the decimal floating-point number, as follows:

from decimal import Decimal, getcontext
from sys import getsizeof

# precision: 3 significant digits in the coefficient
getcontext().prec = 3

temp = Decimal('1') / Decimal('3')

print(temp.as_tuple())  # DecimalTuple(sign=0, digits=(3, 3, 3), exponent=-3)
print(getsizeof(temp))  # 104

# precision: 30 significant digits in the coefficient
getcontext().prec = 30

temp = Decimal('1') / Decimal('3')

print(temp.as_tuple())  # DecimalTuple(sign=0, digits=(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3), exponent=-30)
print(getsizeof(temp))  # 104

To understand the above behavior, I read the source code of the Decimal class and the following documents.

  • https://github.com/python/cpython/blob/main/Lib/_pydecimal.py
  • http://speleotrove.com/decimal/decarith.html
  • http://speleotrove.com/decimal/decbits.pdf

According to the document, Python's Decimal object is implemented based on IEEE 754-2008, and the decimal digits of coefficient continuation are converted into binary digits using DPD (Densely packed decimal) encoding.

Therefore, according to the DPD algorithm, we can calculate the number of bits when the decimal digits of the coefficient continuation are encoded into binary digits.

And since the sign, exponent continuation, and combination field are simply expressed in binary, the number of bits when encoded can be easily calculated.

So, we can calculate the number of bits when a Decimal object is encoded by the following formula:

bits = (sign) + (exp) + (comb) + (compressed coeff)

Here, sign and combination are fixed at 1 bit and 5 bits, respectively (according to the definition of IEEE 754-2008; see https://en.wikipedia.org/wiki/Decimal_floating_point).
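For reference, the formula above can be checked against the three IEEE 754-2008 decimal interchange formats. This is only a sketch: the names FORMATS and encoded_bits are made up for illustration, and it describes the standard's interchange encoding, which is not necessarily how CPython lays out a Decimal in memory.

```python
# Field widths of the IEEE 754-2008 decimal interchange formats (DPD encoding).
# name: (precision in decimal digits, exponent continuation bits)
FORMATS = {
    "decimal32": (7, 6),
    "decimal64": (16, 8),
    "decimal128": (34, 12),
}

SIGN_BITS = 1
COMBINATION_BITS = 5


def encoded_bits(precision, exp_continuation_bits):
    """Total bits = sign + combination + exponent continuation + coefficient continuation."""
    # The leading digit is carried by the combination field; the remaining
    # digits are packed three at a time into 10-bit declets.
    coefficient_continuation_bits = (precision - 1) // 3 * 10
    return (SIGN_BITS + COMBINATION_BITS
            + exp_continuation_bits + coefficient_continuation_bits)


for name, (precision, exp_bits) in FORMATS.items():
    print(name, encoded_bits(precision, exp_bits))
```

Each format has a fixed width, so under this encoding the precision is a property of the format rather than of an individual value.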

So, I wrote the above code to check the list of {sign, exponent, coefficient} using as_tuple() of the Decimal object, and calculate the actual number of bits in memory.

However, as mentioned above, the memory size of the Decimal object did not change at all, even though the number of digits in the coefficient should have changed. (I understand that a Decimal object is not only a decimal encoding but also a list and other objects.)

The following two questions arise.

(1) Am I wrong in my understanding of the encoding algorithm of the Decimal object in Python? (Does Python 3.8.5 use a more efficient encoding algorithm than IEEE 754-2008?)

(2) Assuming that my understanding of the algorithm is correct, why does the memory size of the Decimal object remain the same even though the coefficient has changed? (According to the definition of IEEE 754-2008, when the coefficient continuation changes, the exponent continuation also changes, so the total number of bits should change.)

I myself am a student who usually studies in the field of mechanical engineering, and I am a complete beginner in informatics. If there is any part of my original understanding that is wrong or if there is any strange logical development, please let me know.

I appreciate your help.

asked Aug 05 '21 04:08

TMo



1 Answer

From the documentation for sys.getsizeof:

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

Since Decimal is a Python class with references to several other objects (EDIT: see below), you just get the total size of the references, which is constant — not including the referred values, which are not.

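import sys
# Pure-Python implementation; the C implementation has no _int attribute.
from _pydecimal import Decimal, getcontext

getcontext().prec = 3
temp = Decimal(1) / Decimal(3)
print(sys.getsizeof(temp))
print(sys.getsizeof(temp._int))

getcontext().prec = 300
temp = Decimal(1) / Decimal(3)
print(sys.getsizeof(temp))         # same
print(sys.getsizeof(temp._int))    # not same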

(Note that _int slot I used in the example is an internal implementation detail of CPython's Decimal, as hinted by the leading underscore; this code is not guaranteed to work in other Python implementations, or even in other versions.)
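The same shallow-size behavior can be seen with built-in containers. A minimal sketch (the variable names small and large are just for illustration): two one-element lists report the same size no matter how large the string they refer to is.

```python
import sys

# Two lists with one element each; only the referenced string differs in length.
small = ["x"]
large = ["x" * 10_000]

# The list itself only holds a reference, so both lists report the same size.
print(sys.getsizeof(small), sys.getsizeof(large))

# The referenced strings are measured separately and do differ.
print(sys.getsizeof(small[0]), sys.getsizeof(large[0]))
```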


EDIT: Oops, my first answer was on an old Python, where Decimal is implemented in Python. The version you asked about has it implemented in C.

The C version includes the coefficient's storage in the size it reports for the object itself, but the change in precision you tried was too small to show a difference (memory is allocated in discrete chunks). Try it with getcontext().prec = 300 instead.
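A quick way to try this, assuming CPython with the C decimal implementation (the helper name size_of_third is made up for this sketch, and the exact byte counts depend on the platform and version):

```python
from decimal import Decimal, localcontext
from sys import getsizeof


def size_of_third(prec):
    """Return getsizeof() of 1/3 computed at the given precision."""
    # localcontext() keeps the precision change from leaking into other code.
    with localcontext() as ctx:
        ctx.prec = prec
        return getsizeof(Decimal(1) / Decimal(3))


for prec in (3, 30, 300, 3000):
    print(prec, size_of_third(prec))
```

With the C implementation, the sizes for 3 and 30 digits should match, while the larger precisions should report more bytes.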

answered Oct 28 '22 08:10

Amadan