Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Divide One Pandas Dataframe by Another - Ignore index but respect columns

I have 2 dataframes. I would like to broadcast a divide operation

df1= pd.DataFrame([[1.,2.,3.,4.], [5.,6.,7.,8.], [9.,10.,11.,12.]],
                  columns=['A','B','C','D'], index=['x','y','z'])

df2= pd.DataFrame([[0.,1.,2.,3.]], columns=['A','B','D','C'], index=['q'])

Notice that the columns are aligned slightly differently in df2.

I would like to divide df1 by df2 where the row is broadcast but the column labels are respected.

   A   B   C   D
x  1   2   3   4
y  5   6   7   8
z  9  10  11  12


   A  B  D  C
q  0  1  2  3

This would be wrong.

df1.values/df2.values

[[         inf   2.           1.5          1.33333333]
 [         inf   6.           3.5          2.66666667]
 [         inf  10.           5.5          4.        ]]

Answer I desire is:

   A    B   C      D
x  inf  2   1      2
y  inf  6   2.33   4
z  inf  10  3.66   6
like image 503
Dickster Avatar asked May 19 '15 13:05

Dickster


People also ask

How do you divide one DataFrame by another panda?

div() method divides element-wise division of one pandas DataFrame by another. DataFrame elements can be divided by a pandas series or by a Python sequence as well. Calling div() on a DataFrame instance is equivalent to invoking the division operator (/).

How do I divide values in two columns in pandas DataFrame?

The simple division (/) operator is the first way to divide two columns. You will split the First Column with the other columns here. This is the simplest method of dividing two columns in Pandas. We will import Pandas and take at least two columns while declaring the variables.

How do I divide one column by another column in a DataFrame?

To divide each column by a particular column, we can use division sign (/). For example, if we have a data frame called df that contains three columns say x, y, and z then we can divide all the columns by column z using the command df/df[,3].

Does pandas STD ignore NaN?

std() function to calculate the standard deviation of values in a pandas DataFrame. Note that the std() function will automatically ignore any NaN values in the DataFrame when calculating the standard deviation.


2 Answers

If you divide by a Series (by selecting that one row of the second dataframe), pandas will align this series on the columns of the first dataframe, giving the desired result:

In [75]: df1 / df2.loc['q']
Out[75]:
     A   B         C  D
x  inf   2  1.000000  2
y  inf   6  2.333333  4
z  inf  10  3.666667  6

If you don't know/want to use the name of that one row, you can use squeeze to convert the one-column dataframe to a series: df1 / df2.squeeze() (see answer of @EdChum).

like image 172
joris Avatar answered Sep 20 '22 03:09

joris


May be, you could order your df2 columns same of df1 and then divide on values

In [53]: df1.values/df2[df1.columns].values
Out[53]:
array([[         inf,   2.        ,   1.        ,   2.        ],
       [         inf,   6.        ,   2.33333333,   4.        ],
       [         inf,  10.        ,   3.66666667,   6.        ]])
like image 29
Zero Avatar answered Sep 22 '22 03:09

Zero