
How to apply a function to every row of a dataframe?

I am new to Python and I am not sure how to solve the following problem.

I have a function:

    def EOQ(D,p,ck,ch):
        Q = math.sqrt((2*D*ck)/(ch*p))
        return Q

Say I have the dataframe

    df = pd.DataFrame({"D": [10,20,30], "p": [20, 30, 10]})

        D   p
    0  10  20
    1  20  30
    2  30  10

    ch=0.2
    ck=5

Here ch and ck are floats. I want to apply the formula to every row of the dataframe and store the result in an extra column 'Q'. An example (that does not work) would be:

df['Q']= map(lambda p, D: EOQ(D,p,ck,ch),df['p'], df['D'])  

(returns only 'map' types)

I will need this type of processing more in my project and I hope to find something that works.

asked Nov 04 '15 09:11 by Koen



1 Answer

The following should work:

    import math

    def EOQ(D,p,ck,ch):
        Q = math.sqrt((2*D*ck)/(ch*p))
        return Q

    ch=0.2
    ck=5
    # axis=1 passes each row to the lambda as a Series
    df['Q'] = df.apply(lambda row: EOQ(row['D'], row['p'], ck, ch), axis=1)
    df
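As for why the original attempt did not work: in Python 3, map() returns a lazy iterator rather than a list, which is most likely why the column ended up holding 'map' objects. A minimal sketch of two ways around that, assuming the same df, ch and ck as in the question (the column name Q2 is only there for comparison):

```python
import math

import pandas as pd


def EOQ(D, p, ck, ch):
    return math.sqrt((2 * D * ck) / (ch * p))


df = pd.DataFrame({"D": [10, 20, 30], "p": [20, 30, 10]})
ch = 0.2
ck = 5

# Materialise the lazy map iterator with list() before assigning it
df["Q"] = list(map(lambda p, D: EOQ(D, p, ck, ch), df["p"], df["D"]))

# Equivalent list comprehension over the two columns
# (Q2 is a hypothetical column name used only to compare the results)
df["Q2"] = [EOQ(D, p, ck, ch) for D, p in zip(df["D"], df["p"])]

print(df)
```

Both variants still call EOQ once per row in Python, so they are a fix for the map problem rather than a performance improvement.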

If all you're doing is calculating the square root of some result, then use np.sqrt instead. It is vectorised and will be significantly faster:

    In [80]: df['Q'] = np.sqrt((2*df['D']*ck)/(ch*df['p']))
             df
    Out[80]:
        D   p          Q
    0  10  20   5.000000
    1  20  30   5.773503
    2  30  10  12.247449
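If you want to convince yourself that the two approaches give the same numbers, something like the following should do. It is only a sketch on the three-row frame from the question; q_apply, q_vec and same are names made up for this check.

```python
import math

import numpy as np
import pandas as pd


def EOQ(D, p, ck, ch):
    return math.sqrt((2 * D * ck) / (ch * p))


df = pd.DataFrame({"D": [10, 20, 30], "p": [20, 30, 10]})
ch, ck = 0.2, 5

# Row-wise: EOQ is called once per row with scalar values
q_apply = df.apply(lambda row: EOQ(row["D"], row["p"], ck, ch), axis=1)

# Vectorised: the arithmetic runs on whole columns at once
q_vec = np.sqrt((2 * df["D"] * ck) / (ch * df["p"]))

# Compare with a tolerance rather than ==, since these are floats
same = np.allclose(q_apply, q_vec)
print(same)
```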

Timings

For a 30k row df:

    In [92]:
    import math
    ch=0.2
    ck=5
    def EOQ(D,p,ck,ch):
        Q = math.sqrt((2*D*ck)/(ch*p))
        return Q

    %timeit np.sqrt((2*df['D']*ck)/(ch*df['p']))
    %timeit df.apply(lambda row: EOQ(row['D'], row['p'], ck, ch), axis=1)

    1000 loops, best of 3: 622 µs per loop
    1 loops, best of 3: 1.19 s per loop

You can see that the np method is ~1900x faster (1.19 s vs 622 µs).
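%timeit only works inside IPython. If you want to repeat the comparison in a plain script, a rough sketch with the standard timeit module could look like this. The 30k-row frame here is filled with random integers as an assumption, since the data behind the timings above is not shown, and the numbers you get will depend on your machine and pandas version.

```python
import math
import timeit

import numpy as np
import pandas as pd


def EOQ(D, p, ck, ch):
    return math.sqrt((2 * D * ck) / (ch * p))


ch, ck = 0.2, 5

# Assumed test data: 30k rows of random positive integers (never 0,
# so the division is always defined)
rng = np.random.default_rng(0)
n_rows = 30_000
df = pd.DataFrame({
    "D": rng.integers(1, 100, size=n_rows),
    "p": rng.integers(1, 100, size=n_rows),
})


def vectorised():
    return np.sqrt((2 * df["D"] * ck) / (ch * df["p"]))


def row_wise():
    return df.apply(lambda row: EOQ(row["D"], row["p"], ck, ch), axis=1)


# Best of 3 repeats, reported as seconds per single call
t_vec = min(timeit.repeat(vectorised, number=10, repeat=3)) / 10
t_apply = min(timeit.repeat(row_wise, number=1, repeat=3))

print(f"vectorised: {t_vec:.6f} s per call")
print(f"apply:      {t_apply:.6f} s per call")
```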

answered Oct 06 '22 08:10 by EdChum