Pandas apply but only for rows where a condition is met

Tags: python, pandas, dataframe, apply

I would like to use pandas df.apply, but only for certain rows.

As an example, I want to do something like this, but my actual issue is a little more complicated:

    import pandas as pd
    import math

    z = pd.DataFrame({'a': [4.0, 5.0, 6.0, 7.0, 8.0], 'b': [6.0, 0, 5.0, 0, 1.0]})
    z.where(z['b'] != 0, z['a'] / z['b'].apply(lambda l: math.log(l)), 0)

What I want in this example is the value in 'a' divided by the log of the value in 'b' for each row, and for rows where 'b' is 0, I simply want to return 0.

439

asked Nov 18 '15 00:11

mgoldwasser

1 Answer

The other answers are excellent, but I thought I'd add one other approach that can be faster in some circumstances – using broadcasting and masking to achieve the same result:

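    import numpy as np

    # Select only the rows where 'b' is nonzero
    mask = (z['b'] != 0)
    z_valid = z[mask]

    # Default to 0.0 (a float, so the masked assignment keeps the column dtype)
    z['c'] = 0.0
    z.loc[mask, 'c'] = z_valid['a'] / np.log(z_valid['b'])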

Especially with very large dataframes, this approach will generally be faster than solutions based on apply().
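For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same masking idea, using the example frame from the question. The helper name `masked_ratio` is hypothetical and only there for illustration; it is not part of pandas.

```python
import numpy as np
import pandas as pd

# Example data taken from the question
z = pd.DataFrame({'a': [4.0, 5.0, 6.0, 7.0, 8.0],
                  'b': [6.0, 0, 5.0, 0, 1.0]})


def masked_ratio(df):
    """Return a / log(b) where b != 0, and 0.0 elsewhere (hypothetical helper)."""
    mask = df['b'] != 0
    # Start from a float Series of zeros aligned with the frame's index
    out = pd.Series(0.0, index=df.index)
    valid = df.loc[mask]
    # Vectorized computation on the selected rows only
    out.loc[mask] = valid['a'] / np.log(valid['b'])
    return out


z['c'] = masked_ratio(z)
print(z)
```

Note that the last row has b == 1, so log(b) is 0 and the result there is inf rather than an error. If that matters for your data, the mask could presumably be tightened, for example to `(df['b'] != 0) & (df['b'] != 1)`.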

141

answered Oct 03 '22 08:10

jakevdp
