Sum of several columns from a pandas dataframe

So say I have the following table:

In [2]: df = pd.DataFrame({'a': [1,2,3], 'b':[2,4,6], 'c':[1,1,1]})

In [3]: df
Out[3]: 
   a  b  c
0  1  2  1
1  2  4  1
2  3  6  1

I can sum a and b this way:

In [4]: sum(df['a']) + sum(df['b'])
Out[4]: 18

However, this is not very convenient for a larger DataFrame, where you have to sum many columns together.

Is there a neater way to sum columns (similar to the below)? What if I want to sum the entire DataFrame without specifying the columns?

In [4]: sum(df[['a', 'b']])  # this does not work; the Out line is the result I want
Out[4]: 18
In [4]: sum(df)  # this does not work; the Out line is the result I want
Out[4]: 21
Pauline asked Oct 18 '16 19:10


2 Answers

I think you can use a double sum: the first, DataFrame.sum, creates a Series of column sums, and the second, Series.sum, returns the sum of that Series:

print (df[['a','b']].sum())
a     6
b    12
dtype: int64

print (df[['a','b']].sum().sum())
18

You can also sum across each row first with axis=1, then sum the resulting Series:

print (df[['a','b']].sum(axis=1))
0    3
1    6
2    9
dtype: int64

print (df[['a','b']].sum(axis=1).sum())
18

Thanks to pirSquared for another solution: convert the selected columns to a NumPy array with values, then sum:

print (df[['a','b']].values.sum())
18

To sum the entire DataFrame without specifying columns:

print (df.sum().sum())
21
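
As a quick check, the approaches above can be run side by side on the DataFrame from the question. This is only a minimal sketch: the variable names are made up for illustration, and to_numpy() is swapped in for values as an assumption (it needs pandas 0.24 or later) rather than being part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 4, 6], 'c': [1, 1, 1]})

# Column sums first, then the sum of the resulting Series
total_ab = df[['a', 'b']].sum().sum()

# Row sums first (axis=1), then the sum of the resulting Series
total_ab_rows = df[['a', 'b']].sum(axis=1).sum()

# NumPy route: to_numpy() used in place of .values (assumes pandas >= 0.24)
total_ab_np = df[['a', 'b']].to_numpy().sum()

# Whole DataFrame, no column names needed
total_all = df.sum().sum()

print(total_ab, total_ab_rows, total_ab_np, total_all)
```

All three variants for a and b should agree, so which one to use is mostly a matter of taste.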
jezrael answered Sep 21 '22 09:09


Maybe you are looking for something like this:

df["result"] = df.apply(lambda row: row['a':'c'].sum(), axis=1)
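
The same per-row sum can probably be had without apply. The following is a sketch under the assumption that the columns of interest are contiguous, so a label slice with loc works; row_totals is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 4, 6], 'c': [1, 1, 1]})

# Label slices are inclusive on both ends, so 'a':'c' selects a, b and c
row_totals = df.loc[:, 'a':'c'].sum(axis=1)

# Store the per-row sums in a new column, as in the apply version
df['result'] = row_totals
print(df)
```

Note that this gives one sum per row rather than a single total; calling row_totals.sum() would collapse it to one number.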
Fermin Pitol answered Sep 21 '22 09:09