
What's the best way to sum all values in a Pandas dataframe?

I figured out these two methods. Is there a better one?

>>> import pandas as pd
>>> df = pd.DataFrame({'A': [5, 6, 7], 'B': [7, 8, 9]})
>>> print(df.sum().sum())
42
>>> print(df.values.sum())
42

Just want to make sure I'm not missing something more obvious.

Bill asked Aug 03 '16 02:08




1 Answer

Updated for Pandas 0.24+

df.to_numpy().sum() 
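As a minimal sketch, here is the one-liner applied to the frame from the question, next to the asker's df.sum().sum(). The second frame, df_nan, is a hypothetical example that is not in the original question; it is included only because the two approaches treat missing values differently, which may matter when choosing between them.

```python
import numpy as np
import pandas as pd

# Frame from the question
df = pd.DataFrame({'A': [5, 6, 7], 'B': [7, 8, 9]})

total_pandas = df.sum().sum()      # per-column sums, then the sum of those
total_numpy = df.to_numpy().sum()  # a single NumPy reduction over the 2-D array
print(total_pandas, total_numpy)   # 42 42

# Hypothetical frame with a missing value (not part of the question)
df_nan = pd.DataFrame({'A': [5.0, np.nan, 7.0], 'B': [7.0, 8.0, 9.0]})

pandas_nan_total = df_nan.sum().sum()           # pandas skips NaN by default
numpy_nan_total = df_nan.to_numpy().sum()       # NumPy propagates NaN
numpy_nansum = np.nansum(df_nan.to_numpy())     # NumPy reduction that ignores NaN
print(pandas_nan_total, numpy_nan_total, numpy_nansum)
```

If the data can contain NaN and the pandas skip-NaN behaviour is wanted, np.nansum on the array is one way to keep the single-reduction approach.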

Prior to Pandas 0.24

df.values 

is the underlying NumPy array, so

df.values.sum() 

calls NumPy's sum method directly and is faster than df.sum().sum().
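The speed claim can be checked with a rough timing sketch like the one below. The frame size, the random data, and the helper names via_pandas and via_numpy are assumptions made for illustration; actual timings depend on the pandas and NumPy versions, the dtypes, and the machine, so no particular result is implied.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark frame; shape and values are arbitrary.
rng = np.random.default_rng(0)
big = pd.DataFrame(rng.integers(0, 100, size=(10_000, 20)))

def via_pandas():
    # Sum each column, then sum the resulting Series
    return big.sum().sum()

def via_numpy():
    # One reduction over the whole array (use big.values.sum() before pandas 0.24)
    return big.to_numpy().sum()

# Both approaches should agree on integer data
assert via_pandas() == via_numpy()

t_pandas = timeit.timeit(via_pandas, number=200)
t_numpy = timeit.timeit(via_numpy, number=200)
print(f"df.sum().sum():      {t_pandas:.4f} s for 200 runs")
print(f"df.to_numpy().sum(): {t_numpy:.4f} s for 200 runs")
```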

piRSquared answered Sep 18 '22 14:09