I do not understand why pandas is rounding the values in a column that I create by dividing the values of two other columns. I want the numbers in the new column to show two decimals, but they appear rounded to whole numbers. I have checked the dtypes of the columns and both are "float64".
import os
import re

import pandas as pd

# parent of the current working directory
cd = os.path.dirname(os.getcwd())

# concatenate csv files
dfList = []
for root, dirs, files in os.walk(cd):
    for fname in files:
        # escape the dot so it only matches a literal "."
        if re.match(r"output_contigs_SCMgenes\.csv", fname):
            frame = pd.read_csv(os.path.join(root, fname))
            dfList.append(frame)
df = pd.concat(dfList)

# replace nan in SCM column with 0 (assign back instead of chained inplace)
df['SCM'] = df['SCM'].fillna(0)

# add column with genes/SCM
df['genes/SCM'] = df['genes'] / df['SCM']
The output is as follows:
genome contig genes SCM genes/SCM
0 20900 48 1 0 inf
1 20900 37 130 103 1
2 20900 35 1 1 1
3 20900 1 79 66 1
4 20900 66 5 3 2
But I want the last column to show values with at least 2 decimals instead of rounded values.
I could reproduce this behaviour by setting pd.options.display.precision to 0:
In [4]: df['genes/SCM'] = df['genes']/df['SCM']
In [5]: df
Out[5]:
genome contig genes SCM genes/SCM
0 20900 48 1 0 inf
1 20900 37 130 103 1.262136
2 20900 35 1 1 1.000000
3 20900 1 79 66 1.196970
4 20900 66 5 3 1.666667
In [6]: pd.options.display.precision = 0
In [7]: df
Out[7]:
genome contig genes SCM genes/SCM
0 20900 48 1 0 inf
1 20900 37 130 103 1
2 20900 35 1 1 1
3 20900 1 79 66 1
4 20900 66 5 3 2
Check your Pandas & Numpy options
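As a minimal sketch of that check (the small frame below is made up to mirror the rows in the question, and the variable names are only illustrative), you can read the current display precision, confirm that the stored floats are not affected by it, and reset the option to its default:

```python
import pandas as pd

# Hypothetical data mirroring a few rows from the question.
df = pd.DataFrame({"genes": [130, 1, 79, 5], "SCM": [103, 1, 66, 3]})
df["genes/SCM"] = df["genes"] / df["SCM"]

# Inspect the current display precision (the pandas default is 6).
print(pd.get_option("display.precision"))

# With precision 0 the printed frame looks rounded,
# but only the text representation changes.
with pd.option_context("display.precision", 0):
    rounded_looking = repr(df)

# The underlying value still carries full float64 precision.
stored_value = df.loc[0, "genes/SCM"]

# Restore the default in case it was changed earlier in the session.
pd.reset_option("display.precision")
```

If the option was set to 0 somewhere earlier (for example in a startup script or a previous notebook cell), resetting it should bring the decimals back without touching the data.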
To round to a desired number of digits after the decimal point, e.g. 2 digits as asked in the question:
df.round({'genes/SCM': 2})
For multiple columns:
df.round({'col1_name': 1, 'col2_name': 2})
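Also, check that the display precision is not set to 0. pd.set_option('display.precision', 5) can be used to set it appropriately; here 5 is an example of the number of digits shown after the decimal point. (The short name 'precision' may be rejected as ambiguous in newer pandas versions, so the full option name is safer.)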
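The difference between the two approaches can be sketched as follows, assuming a small made-up frame in place of the real data: round changes the values and returns a new DataFrame, while the display option only changes how the frame is printed.

```python
import pandas as pd

# Hypothetical data standing in for the frame from the question.
df = pd.DataFrame({"genes": [130, 79, 5], "SCM": [103, 66, 3]})
df["genes/SCM"] = df["genes"] / df["SCM"]

# round() returns a new DataFrame; assign the result to keep it.
rounded = df.round({"genes/SCM": 2})

# Alternatively, leave the data alone and only change the printed form.
pd.set_option("display.precision", 2)
print(df)
pd.reset_option("display.precision")
```

Rounding is the better fit when the two-decimal values are needed downstream (for example when writing a CSV); the display option is enough when only the printed output matters.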