
Set decimal precision of a pandas dataframe column with a datatype of Decimal

I have a pandas dataframe with two columns, col 1 with text in it and col 2 with decimal values.

Key Value
A 1.2089
B 5.6718
B 7.3084

I use the '.apply' function to convert the Value column to Decimal (from Python's decimal module). Once I do this, the Value column goes from 4 decimal places to 43 decimal places. I have tried setting getcontext().prec = 4, to no avail.

The data frame is constructed from reading a CSV file with the same format as the table above. All the decimal numbers in the value column are only given to 4 decimal places.

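import pandas as pd
from decimal import Decimal, getcontext

def get_df(table_filepath):
    df = pd.read_csv(table_filepath)  # Value is parsed as float64 here
    getcontext().prec = 4             # attempt to limit precision; has no effect on the result
    df['Value'] = df['Value'].apply(Decimal)
    return df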

The code above is what I have tried, but it still produces output in which the Value column has 43 decimal places rather than the 4 decimal places each value has in the CSV file.

The result I get when I print the dataframe is:

Key Value
A 1.20890000000003046807250939309597015380859375
B 5.67180000000000318323145620524883270263671875
B 7.30838399999999969077180139720439910888671875

I only want 4 decimal places because these values will be used in calculations later on, and I want to work with the exact values I provided.

Ankur asked Apr 06 '21 12:04




2 Answers

The output can be changed through the print options for floats. Note that this changes how every float is printed:

pd.set_option('display.float_format', '{:.10f}'.format)

Keep in mind that this only changes how values are printed. The dataframe still stores each value with all of its digits.
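As a minimal sketch of that point (the two-row frame and the '{:.2f}' format are made up for illustration, not taken from the question), the display option changes the rendered text while the stored float stays the same. pd.option_context is used here instead of pd.set_option so that the setting is restored afterwards:

```python
import pandas as pd

# Hypothetical sample data, similar in shape to the question's table
df = pd.DataFrame({"Key": ["A", "B"], "Value": [1.2089, 5.6718]})

# Change only how floats are rendered; the option is restored when the block exits
with pd.option_context("display.float_format", "{:.2f}".format):
    shown = df.to_string()

# The stored value is untouched by the display option
stored = df["Value"].iloc[0]

print(shown)
print(stored)
```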

On the other hand, you can restrict decimals by:

df.Value = df.Value.round(4)

But this rounds according to the fifth decimal. A last option is np.ceil or np.floor. Since these do not take a number of decimals, an approach with multiplication and division is required:

import numpy as np

precision = 4
# Scale up, round up or down to a whole number, then scale back
df['Value_ceil'] = np.ceil(df.Value * 10**precision) / (10**precision)
df['Value_floor'] = np.floor(df.Value * 10**precision) / (10**precision)
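If the column already holds Decimal objects rather than floats, a different technique may fit better than the multiply-and-divide approach: Decimal.quantize with an explicit rounding mode. This is only a sketch under that assumption; the five-decimal sample values and the name step are hypothetical.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR

import pandas as pd

# Hypothetical Decimal values with one digit more than wanted
df = pd.DataFrame({"Key": ["A", "B"],
                   "Value": [Decimal("1.20891"), Decimal("5.67185")]})

# The exponent of `step` sets the number of decimal places to keep (4 here)
step = Decimal("0.0001")

# Round each Decimal up or down to 4 places, without going through float
df["Value_ceil"] = df["Value"].apply(lambda d: d.quantize(step, rounding=ROUND_CEILING))
df["Value_floor"] = df["Value"].apply(lambda d: d.quantize(step, rounding=ROUND_FLOOR))

print(df)
```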
jcaliz answered Sep 18 '22 17:09


Fixed the issue. It seems to be related to how Decimal converts from float: Decimal(float) reproduces the exact binary value of the float, which is where the extra digits come from. Converting the Value column to strings first and then to Decimal gave me the result I wanted.

def get_df(table_filepath):
    df = pd.read_csv(table_filepath)
    df['Value'] = df['Value'].apply(str)      # float -> short string, e.g. '1.2089'
    df['Value'] = df['Value'].apply(Decimal)  # string -> exact Decimal
    return df

Key Value
A 1.2089
B 5.6718
B 7.3084
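A short sketch of why this works, and why setting the context precision did not help: the context precision applies to the results of arithmetic, not to constructing a Decimal, and it counts significant digits rather than decimal places. The sketch also shows one possible alternative that skips the float step entirely, passing Decimal as a converter to read_csv. The inline CSV text stands in for the real file and is an assumption.

```python
import io
from decimal import Decimal, localcontext

import pandas as pd

from_float = Decimal(1.2089)    # exact binary value of the nearest float: many digits
from_str = Decimal("1.2089")    # exactly the four decimals that were written

with localcontext() as ctx:
    ctx.prec = 4
    unchanged = Decimal(1.2089)   # construction ignores the context precision
    rounded = +Decimal(1.2089)    # unary plus applies it: 4 significant digits

# Stand-in for the CSV file from the question
csv_text = "Key,Value\nA,1.2089\nB,5.6718\nB,7.3084\n"

# The converter receives each cell as a string, so no float is ever created
df = pd.read_csv(io.StringIO(csv_text), converters={"Value": Decimal})

print(from_float)
print(from_str)
print(rounded)
print(df)
```

Passing dtype={'Value': str} to read_csv and then applying Decimal should give the same result as the converter.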
Ankur answered Sep 22 '22 17:09