I figured out these two methods. Is there a better one?
>>> import pandas as pd >>> df = pd.DataFrame({'A': [5, 6, 7], 'B': [7, 8, 9]}) >>> print df.sum().sum() 42 >>> print df.values.sum() 42
Just want to make sure I'm not missing something more obvious.
To sum all the rows of a DataFrame, use the sum() function and set the axis value as 1. The value axis 1 will add the row values.
sum() method is used to get the sum of the values for the requested axis. level[int or level name, default None] : If the axis is a MultiIndex (hierarchical), count along a particular level, collapsing into a scalar.
sum() function is used to return the sum of the values for the requested axis by the user. If the input value is an index axis, then it will add all the values in a column and works same for all the columns. It returns a series that contains the sum of all the values in each column.
pandas provides the read_csv() function to read data stored as a csv file into a pandas DataFrame . pandas supports many different file formats or data sources out of the box (csv, excel, sql, json, parquet, …), each of them with the prefix read_* .
df.to_numpy().sum()
df.values
Is the underlying numpy array
df.values.sum()
Is the numpy sum method and is faster
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With