
How to create sum of columns in Pandas based on a conditional of multiple columns?

I am trying to add a third column to a DataFrame whose value in each row is the sum of the positive elements of the other two columns. I tried the code below, but I just get a column of NaN values:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[-1, 2], [-2, 2], [1, -3], [1, -4], [-2, -2]]),
                  columns=['a', 'b'])

df['Sum of Positives'] = 0

df['Sum of Positives'] = df.loc[df.a > 0, 'a'] + df.loc[df.b > 0, 'b']


KaneM asked Nov 14 '20 14:11




2 Answers

You can use df.mask here to replace values less than 0 (the negatives) with 0, then take df.sum over axis 1.

df['sum of pos'] = df.mask(df<0, 0).sum(axis=1)

   a  b  sum of pos
0 -1  2           2
1 -2  2           2
2  1 -3           1
3  1 -4           1
4 -2 -2           0
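On why the attempt in the question produces NaN: each .loc selection keeps only the row labels where its condition holds, and pandas aligns Series by index label when adding them, so a label missing from either side gives NaN. The sketch below reproduces that on the question's data and shows one possible repair using Series.add with fill_value plus reindex. The names pos_a, pos_b, aligned, partial and fixed are illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[-1, 2], [-2, 2], [1, -3], [1, -4], [-2, -2]]),
                  columns=['a', 'b'])

pos_a = df.loc[df.a > 0, 'a']   # keeps row labels 2 and 3 only
pos_b = df.loc[df.b > 0, 'b']   # keeps row labels 0 and 1 only

# "+" aligns on the index. No label appears on both sides here,
# so every entry of the result is NaN.
aligned = pos_a + pos_b

# Series.add with fill_value=0 treats a label missing on one side as 0.
partial = pos_a.add(pos_b, fill_value=0)

# Row 4 is positive in neither column, so it is absent from both selections;
# reindex restores it with 0. Alignment produced floats, so cast back to int.
fixed = partial.reindex(df.index, fill_value=0).astype(int)

print(aligned)
print(fixed)
```

This is mainly useful for seeing the alignment behaviour; the mask approach above avoids the problem altogether because it never drops rows.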

A few NumPy alternatives are also useful here. Each one operates on every column of df, so apply it while df holds only a and b.

  • Using np.copyto

    t = np.copy(df.values)
    np.copyto(t, 0, where=df.values<0)
    df['sum of pos'] = t.sum(axis=1)
    
  • Using np.where

    df['sum of pos'] = np.where(df.values<0, 0, df.values).sum(axis=1)
    
  • Using np.clip

    df['sum of pos'] = np.clip(df.values, 0, None).sum(axis=1)
    
  • Using np.ma.array

    m = np.ma.array(df.values, mask=df.values<0, fill_value=0)
    df['sum of pos'] = m.filled().sum(axis=1)
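As a quick sanity check, the variants above can be run side by side on the question's data. This is a sketch, not part of the original answer: the results dictionary and the explicit cols selection are assumptions added here so that an already-created result column is never included in the sum.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[-1, 2], [-2, 2], [1, -3], [1, -4], [-2, -2]]),
                  columns=['a', 'b'])

cols = ['a', 'b']                 # sum only the original input columns
vals = df[cols].to_numpy()

# np.copyto needs a writable copy to overwrite the negatives in place.
t = vals.copy()
np.copyto(t, 0, where=vals < 0)

results = {
    'mask': df[cols].mask(df[cols] < 0, 0).sum(axis=1).to_numpy(),
    'copyto': t.sum(axis=1),
    'where': np.where(vals < 0, 0, vals).sum(axis=1),
    'clip': np.clip(vals, 0, None).sum(axis=1),
    'ma': np.ma.array(vals, mask=vals < 0, fill_value=0).filled().sum(axis=1),
}

for name, values in results.items():
    print(name, values.tolist())
```

Selecting cols first means the order in which the variants are tried does not matter.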
    
Ch3steR answered Oct 02 '22 12:10


You can use apply with axis=1 and keep only the positive values in each row:

df['Sum of Positives'] = df.apply(lambda x: sum(x[x > 0]), axis=1)

   a  b  Sum of Positives
0 -1  2                 2
1 -2  2                 2
2  1 -3                 1
3  1 -4                 1
4 -2 -2                 0
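One caveat with apply over the whole frame is that it sees every column, including a result column created earlier. A minimal sketch that restricts the row-wise apply to a and b, with DataFrame.clip as a vectorized equivalent for comparison; the names cols and vectorized are illustrative and not from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[-1, 2], [-2, 2], [1, -3], [1, -4], [-2, -2]]),
                  columns=['a', 'b'])

cols = ['a', 'b']   # restrict the sum to the input columns

# Row-wise: keep the positive entries of each row, then sum them.
# An empty selection sums to 0, which covers rows with no positives.
df['Sum of Positives'] = df[cols].apply(lambda row: row[row > 0].sum(), axis=1)

# Vectorized equivalent: clip negatives up to 0, then sum across columns.
vectorized = df[cols].clip(lower=0).sum(axis=1)

print(df)
print(vectorized.tolist())
```

Both give the same column here; the clipped version avoids calling a Python function once per row.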
StupidWolf answered Oct 02 '22 13:10