
Pandas dataframe total row

Tags: python, pandas

I have a dataframe, something like:

     foo  bar  qux
0    a    1    3.14
1    b    3    2.72
2    c    2    1.62
3    d    9    1.41
4    e    3    0.58

and I would like to add a 'total' row to the end of the dataframe:

     foo  bar  qux
0    a    1    3.14
1    b    3    2.72
2    c    2    1.62
3    d    9    1.41
4    e    3    0.58
5  total  18   9.47

I've tried to use the sum command, but I end up with a Series which, although I can convert it back to a DataFrame, doesn't maintain the data types:

tot_row = pd.DataFrame(df.sum()).T
tot_row['foo'] = 'tot'
tot_row.dtypes:
     foo    object
     bar    object
     qux    object

I would like to maintain the data types from the original data frame as I need to apply other operations to the total row, something like:

baz = 2*tot_row['qux'] + 3*tot_row['bar'] 
Daniel asked Feb 13 '14 11:02



2 Answers

Append a totals row with

df.append(df.sum(numeric_only=True), ignore_index=True) 

The conversion is necessary only if you have a column of strings or objects.
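Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the line above only runs on older versions. A minimal sketch of the same idea with `pd.concat`, assuming the sample frame from the question; the names `totals`, `total_row` and `with_total` are only illustrative, and casting back with `astype` is one possible way to keep the original dtypes, not necessarily the only one:

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    "foo": ["a", "b", "c", "d", "e"],
    "bar": [1, 3, 2, 9, 3],
    "qux": [3.14, 2.72, 1.62, 1.41, 0.58],
})

# Sum only the numeric columns; the result is a Series indexed by column name.
totals = df.sum(numeric_only=True)

# Turn the Series into a one-row frame and cast each column back to its
# original dtype (the summed Series is float, so 'bar' would otherwise be float).
total_row = totals.to_frame().T.astype(df[totals.index].dtypes.to_dict())
total_row["foo"] = "total"

# pd.concat takes the place of the removed DataFrame.append.
with_total = pd.concat([df, total_row], ignore_index=True)
print(with_total)
print(with_total.dtypes)
```

With the dtypes kept, something like `2*with_total['qux'] + 3*with_total['bar']` stays numeric for the total row as well.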

It's a bit of a fragile solution, though, so I'd recommend sticking to operations on the dataframe instead, e.g.:

baz = 2*df['qux'].sum() + 3*df['bar'].sum() 
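For reference, here is that line as a self-contained snippet on the question's sample data. The frame is rebuilt here only so the snippet runs on its own:

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    "foo": ["a", "b", "c", "d", "e"],
    "bar": [1, 3, 2, 9, 3],
    "qux": [3.14, 2.72, 1.62, 1.41, 0.58],
})

# Work on the column sums directly; no total row is ever added to df.
baz = 2 * df["qux"].sum() + 3 * df["bar"].sum()
print(baz)
```

The column sums are 9.47 and 18, so `baz` comes out at roughly 72.94 (up to floating-point rounding), and `df` keeps its original five rows.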
jmz answered Oct 04 '22 09:10


df.loc["Total"] = df.sum() 

works for me and I find it easier to remember. Am I missing something? Probably wasn't possible in earlier versions.

I'd actually like to add the total row only temporarily though. Adding it permanently is good for display but makes it a hassle in further calculations.

Just found

df.append(df.sum().rename('Total')) 

This prints what I want in a Jupyter notebook and appears to leave the df itself untouched.
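On pandas 2.0 and later `DataFrame.append` no longer exists, so a rough equivalent of the temporary total row could look like the sketch below. The helper name `with_total_row` is made up for illustration, and `numeric_only=True` is an assumption added here so that the string column is skipped rather than summed:

```python
import pandas as pd

def with_total_row(df, label="Total"):
    """Return a new frame with a totals row appended; df itself is not modified."""
    # Sum the numeric columns and name the Series so the name becomes the row label.
    totals = df.sum(numeric_only=True).rename(label)
    # to_frame().T turns the Series into a one-row DataFrame that concat can stack.
    return pd.concat([df, totals.to_frame().T])

# Sample data taken from the question.
df = pd.DataFrame({
    "foo": ["a", "b", "c", "d", "e"],
    "bar": [1, 3, 2, 9, 3],
    "qux": [3.14, 2.72, 1.62, 1.41, 0.58],
})

print(with_total_row(df))  # for display only
print(len(df))             # df still has its original rows
```

Since this returns a new frame, `df` stays clean for further calculations. The non-numeric `foo` cell of the total row is left as a missing value, and the integer column may be shown as float in the returned copy.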

Matthias Kauer answered Oct 04 '22 10:10