Python pandas DataFrame if/else without iterating through the DataFrame

Question

I want to add a column to a df. The values of this new column will depend on the values of the other columns, e.g.:

import pandas as pd

dc = {'A':[0,9,4,5],'B':[6,0,10,12],'C':[1,3,15,18]}
df = pd.DataFrame(dc)
   A   B   C
0  0   6   1
1  9   0   3
2  4  10  15
3  5  12  18

Now I want to add another column D whose values depend on the values of A, B, and C. For example, if I were iterating through the df, I would just do:

for idx, row in df.iterrows():
    if row['A'] != 0 and row['B'] != 0:
        df.loc[idx, 'D'] = (float(row['A']) / float(row['B'])) * row['C']
    elif row['C'] == 0 and row['A'] != 0 and row['B'] == 0:
        df.loc[idx, 'D'] = 250.0
    else:
        df.loc[idx, 'D'] = 20.0

Is there a way to do this without the for loop or using where() or apply() functions?

Thanks

TomAugspurger · Accepted Answer

apply should work well for you:

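In [20]: def func(row):
   ....:     # Same branches as the loop in the question, one row at a time.
   ....:     if row['A'] != 0 and row['B'] != 0:
   ....:         return (float(row['A']) / row['B']) * row['C']
   ....:     elif row['C'] == 0 and row['A'] != 0 and row['B'] == 0:
   ....:         return 250.0
   ....:     else:
   ....:         return 20.0
   ....: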


In [21]: df['D'] = df.apply(func, axis=1)

In [22]: df
Out[22]: 
   A   B   C     D
0  0   6   1  20.0
1  9   0   3  20.0
2  4  10  15   6.0
3  5  12  18   7.5

[4 rows x 4 columns]

fantabolous · Answer

.where can be much faster than .apply, so if all you're doing is if/elses then I'd aim for .where. As you're returning scalars in some cases, np.where will be easier to use than pandas' own .where.

import pandas as pd
import numpy as np
df['D'] = np.where((df.A!=0) & (df.B!=0), ((df.A/df.B)*df.C),
          np.where((df.C==0) & (df.A!=0) & (df.B==0), 250,
          20))

   A   B   C     D
0  0   6   1  20.0
1  9   0   3  20.0
2  4  10  15   6.0
3  5  12  18   7.5
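
If more branches get added, the nested np.where calls can become hard to follow. Here is a minimal sketch of the same logic with np.select, which takes a list of conditions and a matching list of choices. This is an alternative I'm suggesting, not something from the question; the names conditions and choices are just illustrative, and the frame is the sample from the question.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({'A': [0, 9, 4, 5], 'B': [6, 0, 10, 12], 'C': [1, 3, 15, 18]})

# Conditions are checked in order; for each row the first True one wins.
conditions = [
    (df.A != 0) & (df.B != 0),
    (df.C == 0) & (df.A != 0) & (df.B == 0),
]

# One choice per condition. The ratio is computed for every row,
# but it is only used where the first condition holds.
choices = [
    (df.A / df.B) * df.C,
    250.0,
]

# Rows matching no condition get the default.
df['D'] = np.select(conditions, choices, default=20.0)
print(df)
```

The trade-off is roughly the same as with np.where: every choice is evaluated for the whole column up front, so this is only a good fit when each branch can be written as a vectorized expression.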

For a tiny df like this, you wouldn't need to worry about speed. However, on a 10,000-row df of randn values, this is almost 2000 times faster than the .apply solution above: 3 ms vs 5850 ms. That said, if speed isn't a concern, .apply can often be easier to read.
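
If you want to check the gap on your own machine, here is a rough sketch of how such a comparison could be set up. The setup is my assumption about the benchmark (10,000 rows, three columns of standard-normal values), the row-wise func follows the logic in the question, and the helper names with_apply and with_np_where are made up for this example. Timings depend heavily on hardware and on the pandas/numpy versions, so expect different numbers from the ones quoted above.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed benchmark data: 10,000 rows of standard-normal values.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10_000, 3)), columns=['A', 'B', 'C'])


def func(row):
    # Row-wise version of the question's logic, for use with .apply.
    if row['A'] != 0 and row['B'] != 0:
        return (row['A'] / row['B']) * row['C']
    elif row['C'] == 0 and row['A'] != 0 and row['B'] == 0:
        return 250.0
    return 20.0


def with_apply():
    # Calls func once per row.
    return df.apply(func, axis=1)


def with_np_where():
    # Evaluates each condition on whole columns at once.
    return np.where(
        (df.A != 0) & (df.B != 0), (df.A / df.B) * df.C,
        np.where((df.C == 0) & (df.A != 0) & (df.B == 0), 250.0, 20.0),
    )


# Both approaches should agree before their timings are compared.
assert np.allclose(with_apply().to_numpy(), with_np_where())

t_apply = timeit.timeit(with_apply, number=1)
t_where = timeit.timeit(with_np_where, number=10) / 10
print(f".apply:   {t_apply * 1000:.1f} ms")
print(f"np.where: {t_where * 1000:.1f} ms")
```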

Tags:

python

pandas

dataframe

numpy

cryp
