
Pandas: Summing all elements in a dataframe? [duplicate]

Tags: python, pandas

Given a Pandas DataFrame df, we can sum each column like this:

[x for x in df.sum()]

and produce the sum of those sums like this:

sum([x for x in df.sum()])

Can this be done using only DataFrame operations, without resorting to Python's built-in sum()?

Mark Harrison asked Oct 06 '20 00:10


2 Answers

We can use stack to reshape the DataFrame into a single Series, then sum that:

df.stack().sum()
BENY answered Nov 02 '22 04:11
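
As a quick check that the stack approach matches the question's sum-of-sums, here is a minimal sketch on a small made-up frame (the column names and values are illustrative assumptions, not from the question). It also shows the chained df.sum().sum() form, which appears in the timings further down.

```python
import pandas as pd

# Hypothetical example data; any all-numeric DataFrame should behave the same way.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# The approach from the question: per-column sums, then Python's built-in sum().
total_builtin = sum([x for x in df.sum()])

# DataFrame-only alternatives.
total_stack = df.stack().sum()   # reshape to one long Series, then sum it
total_chain = df.sum().sum()     # sum each column, then sum the resulting Series

print(total_builtin, total_stack, total_chain)
```

All three expressions should agree on a numeric frame like this one; they differ mainly in speed and in how much intermediate data they build.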


Use np.sum:

np.sum(df.to_numpy())

or as @jakub points out:

df.to_numpy().sum()
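
One caveat worth checking before switching to the NumPy route: pandas' sum() skips NaN by default, while a plain NumPy sum propagates it. The sketch below uses made-up data to show the difference, and uses np.nansum as one way to get the pandas-like result.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value.
df = pd.DataFrame({"a": [1.0, np.nan], "b": [3.0, 4.0]})

# pandas skips NaN by default (skipna=True), so the missing cell is ignored.
pandas_total = df.sum().sum()

# A plain NumPy sum propagates NaN, so the result is NaN.
numpy_total = df.to_numpy().sum()

# np.nansum treats NaN as zero, matching the pandas result here.
nan_safe_total = np.nansum(df.to_numpy())

print(pandas_total, numpy_total, nan_safe_total)
```

If the frame is known to contain no missing values, the two approaches should give the same total.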

Timings:

Using a 100 x 100 integer DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10000).reshape(100,-1))

%timeit df.to_numpy().sum()
# 12.1 µs ± 357 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.sum(df.to_numpy())
# 14 µs ± 263 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit df.stack().sum()
# 469 µs ± 30.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit df.sum().sum()
# 381 µs ± 21.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Scott Boston answered Nov 02 '22 05:11
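
The %timeit lines above are IPython magics. Outside IPython, a rough equivalent can be sketched with the standard-library timeit module. The repeat and loop counts below are arbitrary assumptions, and absolute numbers will vary by machine and by pandas/NumPy version, so treat the output as a relative comparison only.

```python
import timeit

import numpy as np
import pandas as pd

# Same test frame as in the timings above.
df = pd.DataFrame(np.arange(10000).reshape(100, -1))

candidates = {
    "df.to_numpy().sum()": lambda: df.to_numpy().sum(),
    "np.sum(df.to_numpy())": lambda: np.sum(df.to_numpy()),
    "df.stack().sum()": lambda: df.stack().sum(),
    "df.sum().sum()": lambda: df.sum().sum(),
}

# Sanity check: every candidate should produce the same total.
totals = {label: fn() for label, fn in candidates.items()}

# Best-of-5 timing, 200 calls per run (arbitrary choices for this sketch).
loops = 200
results = {}
for label, fn in candidates.items():
    best = min(timeit.repeat(fn, number=loops, repeat=5)) / loops
    results[label] = best
    print(f"{label:24s} {best * 1e6:10.1f} µs per call")
```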