I am trying to create a program that will delete a column in a Panda's dataFrame if the column's sum is less than 10.
I currently have the following solution, but I was curious if there is a more pythonic way to do this.
df = pandas.DataFrame(AllData)
sum = df.sum(axis=1)
badCols = list()
for index in range(len(sum)):
if sum[index] < 10:
badCols.append(index)
df = df.drop(df.columns[badCols], axis=1)
In my approach, I create a list of column indexes that have sums less than 10, then I delete this list. Is there a better approach for doing this?
Basically, the SUMIFS function is designed to add up a column of numbers, but, include only those rows that meet one or more conditions. Each condition is defined by a pair of arguments. Here is the basic syntax: =SUMIFS(sum_range, criteria_range1, criteria1, ...)
To find the sum value in a column that matches a given condition, we will use pandas. DataFrame. loc property and sum() method, first, we will check the condition if the value of 1st column matches a specific condition, then we will collect these values and apply the sum() method.
Use pandas. DataFrame. drop() method to delete/remove rows with condition(s).
You can call sum
to generate a Series
that gives the sum of each column, then use this to generate a boolean mask against your column array and use this to filter the df. DF generation code borrowed from @Alexander:
In [2]:
df = pd.DataFrame({'a': [1, 10], 'b': [1, 1], 'c': [20, 30]})
df
Out[2]:
a b c
0 1 1 20
1 10 1 30
In [3]:
df.sum()
Out[3]:
a 11
b 2
c 50
dtype: int64
In [6]:
df[df.columns[df.sum()>10]]
Out[6]:
a c
0 1 20
1 10 30
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With