I want to calculate column C based on the values of count, A and B.
sample df:
count | A | B | C |
---|---|---|---|
yes | 23 | 2 | nan |
nan | 23 | 1 | nan |
yes | 41 | 6 | nan |
The result I want:
count | A | B | C |
---|---|---|---|
yes | 23 | 2 | 46 |
nan | 23 | 1 | 0 |
yes | 41 | 6 | 246 |
I want to calculate C = A*B only when count is 'yes'; otherwise C should be 0, i.e. the nan values of count should be skipped.
Any help is appreciated.
I am trying this:

for ind, row in df.iterrows():
    if df['count'] == 'yes':
        df.loc[ind, 'C'] = row['A'] * row['B']
    else:
        df.loc[ind, 'C'] = 0
But it's giving this error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
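For reference, a minimal sketch (rebuilding the sample frame from the question) of why this error appears:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "count": ["yes", np.nan, "yes"],
    "A": [23, 23, 41],
    "B": [2, 1, 6],
})

# df['count'] == 'yes' compares the whole column at once, not the current row,
# so it returns a boolean Series; `if` cannot decide whether that Series is
# True or False, which is exactly what raises the ValueError.
print(df['count'] == 'yes')
```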
Another option:
df.C = df.A.mul(df.B).where(df['count'].eq('yes')).fillna(0)
df
#  count   A  B      C
#0   yes  23  2   46.0
#1   NaN  23  1    0.0
#2   yes  41  6  246.0
Or if you prefer operators: df.C = (df.A * df.B).where(df['count'] == 'yes').fillna(0)
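A roughly equivalent one-liner with NumPy, not from the answer above but a common alternative sketch, assuming numpy is available as np:

```python
import numpy as np

# np.where keeps A*B where count equals 'yes' and falls back to 0 elsewhere;
# nan values in count compare unequal to 'yes', so they get 0 as well.
df['C'] = np.where(df['count'] == 'yes', df['A'] * df['B'], 0)
```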
pandas overloads * for element-wise multiplication, so you don't need a loop for this operation, provided you correctly specify the indices you want to set:
mask = df["count"].notna()
df.loc[mask, "C"] = df["A"]*df["B"]
df.C.fillna(0, inplace=True)
or a slightly more concise version that would annoy your coworkers:
df["C"] = df["A"]*df["B"]*(df["count"].notna())
In the last, df["count"].notna()
returns a boolean column, which is converted to a numeric type when multiplied by numerical columns. Concise but as clear.
output for either:

  count   A  B      C
0   yes  23  2   46.0
1   NaN  23  1      0
2   yes  41  6  246.0
This will be more performant than .apply and much more performant than iterrows.
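As a quick illustration of the coercion mentioned above (a standalone sketch, not part of the original answer): multiplying a boolean Series by numbers treats True as 1 and False as 0.

```python
import pandas as pd

flags = pd.Series([True, False, True])   # e.g. what df["count"].notna() returns
print(flags * 10)
# 0    10
# 1     0
# 2    10
# dtype: int64
```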
Just use this:

df['C'] = df[df['count'] == 'yes']['C'].fillna(value=df['A'] * df['B'])
df['C'] = df['C'].fillna(0)
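For what it's worth, here is a minimal sketch of why this works, using the sample frame from the question: fillna with a Series fills by index, so only the 'yes' rows get A*B, assigning the result back leaves the remaining rows as NaN, and the second fillna turns those into 0.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "count": ["yes", np.nan, "yes"],
    "A": [23, 23, 41],
    "B": [2, 1, 6],
    "C": [np.nan, np.nan, np.nan],
})

filled = df[df['count'] == 'yes']['C'].fillna(value=df['A'] * df['B'])
print(filled)                # rows 0 and 2 only: 46.0 and 246.0
df['C'] = filled             # index alignment leaves row 1 as NaN
df['C'] = df['C'].fillna(0)
print(df)
```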
Try this:

for ind, row in df.iterrows():
    if row['count'] == 'yes':
        df.loc[ind, 'C'] = row['A'] * row['B']
    else:
        df.loc[ind, 'C'] = 0
You are getting the error because you wrote df['count'] == 'yes' instead of row['count'] == 'yes'.