
Why is python pandas dataframe rounding my values?

I do not understand why pandas is rounding the values in a column that I compute by dividing one column by another. I want the numbers in the new column to have two decimals, but they appear rounded to whole numbers. I have checked the dtypes of the columns and both are "float64".

import os
import re

import pandas as pd
import numpy as np


# parent of the current working directory
cd = os.path.dirname(os.getcwd())

# concatenate csv files
dfList = []

for root, dirs, files in os.walk(cd):
    for fname in files:
        # escape the dot so it matches a literal "." in the file name
        if re.match(r"output_contigs_SCMgenes\.csv", fname):
            frame = pd.read_csv(os.path.join(root, fname))
            dfList.append(frame)

df = pd.concat(dfList)

# replace NaN in the SCM column with 0 (assign back instead of a chained inplace call)
df['SCM'] = df['SCM'].fillna(0)

#add column with genes/SCM
df['genes/SCM'] = df['genes']/df['SCM']

The output is as follows:

    genome  contig  genes  SCM  genes/SCM
0    20900      48      1    0        inf
1    20900      37    130  103          1
2    20900      35      1    1          1
3    20900       1     79   66          1
4    20900      66      5    3          2

I do not want the last column to contain rounded values; I want values with at least two decimals.

Gravel, asked Apr 05 '17 09:04


2 Answers

I could reproduce this behaviour by setting pd.options.display.precision to 0:

In [4]: df['genes/SCM'] = df['genes']/df['SCM']

In [5]: df
Out[5]:
   genome  contig  genes  SCM  genes/SCM
0   20900      48      1    0        inf
1   20900      37    130  103   1.262136
2   20900      35      1    1   1.000000
3   20900       1     79   66   1.196970
4   20900      66      5    3   1.666667

In [6]: pd.options.display.precision = 0

In [7]: df
Out[7]:
   genome  contig  genes  SCM  genes/SCM
0   20900      48      1    0        inf
1   20900      37    130  103          1
2   20900      35      1    1          1
3   20900       1     79   66          1
4   20900      66      5    3          2

So the stored values are not rounded; only their printed representation is. Check your pandas and NumPy display options.
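As a rough sketch of how to check this, the snippet below reads the current display precision, shows that a precision of 0 changes only the printed text, and resets the option to its default. The small DataFrame is made-up sample data that mirrors the question's columns; it is not the asker's actual file.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's columns.
df = pd.DataFrame({"genes": [130, 1, 79, 5], "SCM": [103, 1, 66, 3]})
df["genes/SCM"] = df["genes"] / df["SCM"]

# Inspect the display precision currently in effect.
print(pd.get_option("display.precision"))

# With precision 0 the printed frame looks rounded...
with pd.option_context("display.precision", 0):
    rounded_view = df.to_string()

# ...but the stored value still carries its full precision.
full_value = df.loc[0, "genes/SCM"]

# Restore the default if it was changed elsewhere in the session.
pd.reset_option("display.precision")
default_view = df.to_string()
print(default_view)
```

If the first print shows 0, something earlier in the session or a startup script changed the option, and resetting it should bring the decimals back.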

MaxU - stop WAR against UA, answered Oct 04 '22 01:10


To round to a chosen number of digits after the decimal point, e.g. the 2 digits asked for in the question, use DataFrame.round. It returns a new DataFrame, so assign the result:

df = df.round({'genes/SCM': 2})

For multiple columns:

df = df.round({'col1_name': 1, 'col2_name': 2})

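Also check that the display precision is not set to 0. pd.set_option('display.precision', 5) sets it explicitly; here 5 is just an example of the number of digits shown after the decimal point. (The short form 'precision' may be rejected as ambiguous in newer pandas versions, so the full option name is safer.)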
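The two approaches differ in what they change: round() alters the data, while the display option alters only how it prints. A minimal sketch of the difference, again using made-up sample values rather than the asker's data:

```python
import pandas as pd

# Hypothetical sample data mirroring the question's columns.
df = pd.DataFrame({"genes": [130, 79, 5], "SCM": [103, 66, 3]})
df["genes/SCM"] = df["genes"] / df["SCM"]

# round() returns a new DataFrame; the original keeps full precision.
rounded = df.round({"genes/SCM": 2})

# The display option affects printing only; option_context limits the
# change to this block instead of the whole session.
with pd.option_context("display.precision", 2):
    shown = df.to_string()

print(rounded)
print(shown)
```

If the full-precision values are needed for later calculations, changing the display option is probably the safer choice, since rounding discards digits.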

Nafeez Quraishi, answered Oct 03 '22 23:10