
Polars DataFrame - Decimal scale doubles on mul with Integer

I have a Polars (v1.5.0) DataFrame with 4 columns, as shown in the example below. When I multiply the decimal columns by an integer column, the scale of the resulting decimal columns doubles.

from decimal import Decimal
import polars as pl

df = pl.DataFrame({
    "a": [1, 2],
    "b": [Decimal('3.45'), Decimal('4.73')],
    "c": [Decimal('2.113'), Decimal('4.213')],
    "d": [Decimal('1.10'), Decimal('3.01')]
})

shape: (2, 4)
┌─────┬──────────────┬──────────────┬──────────────┐
│ a   ┆ b            ┆ c            ┆ d            │
│ --- ┆ ---          ┆ ---          ┆ ---          │
│ i64 ┆ decimal[*,2] ┆ decimal[*,3] ┆ decimal[*,2] │
╞═════╪══════════════╪══════════════╪══════════════╡
│ 1   ┆ 3.45         ┆ 2.113        ┆ 1.10         │
│ 2   ┆ 4.73         ┆ 4.213        ┆ 3.01         │
└─────┴──────────────┴──────────────┴──────────────┘

df.with_columns(pl.col("c", "d").mul(pl.col("a")))

shape: (2, 4)
┌─────┬──────────────┬──────────────┬──────────────┐
│ a   ┆ b            ┆ c            ┆ d            │
│ --- ┆ ---          ┆ ---          ┆ ---          │
│ i64 ┆ decimal[*,2] ┆ decimal[*,6] ┆ decimal[*,4] │
╞═════╪══════════════╪══════════════╪══════════════╡
│ 1   ┆ 3.45         ┆ 2.113000     ┆ 1.1000       │
│ 2   ┆ 4.73         ┆ 8.426000     ┆ 6.0200       │
└─────┴──────────────┴──────────────┴──────────────┘

I don't know why the scale doubles when I am only multiplying a decimal by an integer. How can I keep the scale unchanged?

fishfin asked Sep 14 '25 06:09


1 Answer

The scale indeed seems to double. You could cast back to the original dtype:

cols = ['c', 'd', 'e']
df.with_columns(pl.col(c).mul(pl.col('a')).cast(df[c].dtype) for c in cols)

Note that there currently doesn't seem to be a way to access a column's dtype from within an Expr, although this feature has been discussed.

Example:

┌─────┬─────┬──────────────┬──────────────┬──────────────┐
│ a   ┆ b   ┆ c            ┆ d            ┆ e            │
│ --- ┆ --- ┆ ---          ┆ ---          ┆ ---          │
│ i64 ┆ i64 ┆ decimal[*,2] ┆ decimal[*,3] ┆ decimal[*,4] │
╞═════╪═════╪══════════════╪══════════════╪══════════════╡
│ 1   ┆ 3   ┆ 2.11         ┆ 1.100        ┆ 1.1001       │
│ 2   ┆ 4   ┆ 8.42         ┆ 6.022        ┆ 6.0004       │
└─────┴─────┴──────────────┴──────────────┴──────────────┘

Used input:

from decimal import Decimal
import polars as pl

df = pl.DataFrame({
    "a": [1, 2],
    "b": [3, 4],
    "c": [Decimal('2.11'), Decimal('4.21')],
    "d": [Decimal('1.10'), Decimal('3.011')],
    "e": [Decimal('1.1001'), Decimal('3.0002')],
})
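
As for why the scale might double: one possible reading, which neither the question nor this answer confirms for Polars itself, is that the integer is first converted to a decimal with the same scale, and that the scales of the two operands add under multiplication. The sketch below only reproduces that arithmetic with Python's standard decimal module; it is an analogy, not a description of Polars internals.

```python
from decimal import Decimal

c = Decimal("4.213")             # scale 3, as in column "c"
a_as_decimal = Decimal("2.000")  # the integer 2 written with scale 3 (assumption)

# In Python's decimal module, the exponents of the operands add under
# multiplication, so the product carries scale 3 + 3 = 6.
product = c * a_as_decimal
scale = -product.as_tuple().exponent

# Quantizing plays the role of the cast back to the original scale.
restored = product.quantize(Decimal("0.001"))

print(product, scale, restored)
```

Under that assumption, the doubled scale is just the sum of two equal scales, and casting back to the original dtype drops the extra zeros.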
mozway answered Sep 15 '25 21:09