So say I have the following table:
In [2]: df = pd.DataFrame({'a': [1,2,3], 'b':[2,4,6], 'c':[1,1,1]})
In [3]: df
Out[3]:
   a  b  c
0  1  2  1
1  2  4  1
2  3  6  1
I can sum a and b this way:
In [4]: sum(df['a']) + sum(df['b'])
Out[4]: 18
However, this is not very convenient for larger DataFrames, where you have to sum many columns together.
Is there a neater way to sum several columns (similar to the below)? And what if I want to sum the entire DataFrame without specifying the columns?
In [4]: sum(df[['a', 'b']]) #that will not work!
Out[4]: 18
In [4]: sum(df) #that will not work!
Out[4]: 21
Using the DataFrame.insert() method, we can add a new column at a specific position in the column sequence. Although insert() takes a single column name and value as input, we can call it repeatedly to add multiple columns to the DataFrame.
The sum() method returns the sum of the values along the requested axis. With the default index axis (axis=0), it adds up the values in each column, and it works the same way for every column. It returns a Series that contains one sum per column.
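As a rough sketch of the repeated insert() calls described above (the column names 'total' and 'a_plus_c' are made up for illustration, and the frame is the one from the question):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 4, 6], 'c': [1, 1, 1]})

# insert(loc, column, value) modifies df in place and returns None.
df.insert(0, 'total', df['a'] + df['b'])     # becomes the first column
df.insert(2, 'a_plus_c', df['a'] + df['c'])  # lands at position 2, right after 'a'

columns = list(df.columns)
print(columns)
```

Each call shifts the later columns one position to the right, so the positions passed to subsequent calls have to account for columns already inserted.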
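To make the axis behaviour concrete, here is a minimal sketch on the frame from the question. The variable names col_sums and row_sums are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 4, 6], 'c': [1, 1, 1]})

col_sums = df.sum()        # axis=0 (the default): one sum per column
row_sums = df.sum(axis=1)  # axis=1: one sum per row

print(col_sums)
print(row_sums)
```

Both results are Series: col_sums is indexed by the column names, row_sums by the original row index.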
I think you can use a double sum: the first, DataFrame.sum, creates a Series of sums, and the second, Series.sum, gets the sum of that Series:
print (df[['a','b']].sum())
a     6
b    12
dtype: int64
print (df[['a','b']].sum().sum())
18
You can also use:
print (df[['a','b']].sum(axis=1))
0    3
1    6
2    9
dtype: int64
print (df[['a','b']].sum(axis=1).sum())
18
Thank you pirSquared for another solution: convert df to a NumPy array with values and then sum:
print (df[['a','b']].values.sum())
18
print (df.sum().sum())
21
Maybe you are looking for something like this:
df["result"] = df.apply(lambda row: row['a':'c'].sum(), axis=1)
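The same per-row result can probably be had without apply, by slicing the columns by label and summing across axis=1. A sketch, assuming the columns 'a' through 'c' sit next to each other as in the question:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 4, 6], 'c': [1, 1, 1]})

# Label-based slice 'a':'c' includes both endpoints; sum across each row.
df['result'] = df.loc[:, 'a':'c'].sum(axis=1)
print(df)
```

This avoids calling a Python lambda once per row, which usually matters on larger frames.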