pandas: How do I select rows based on the sum of all columns?

Tags: python, pandas

How do I select rows based on the sum of their columns in pandas? Say I want to select all rows where the sum across the columns is greater than 0.

daemonk asked Dec 26 '22 08:12


1 Answer

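Use sum with axis=1 so that the values are added across the columns of each row, then use the resulting boolean condition to filter the DataFrame: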

In [59]:

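import numpy as np
import pandas as pd

# Three columns of random normal values; your numbers will differ
df = pd.DataFrame({'a': np.random.randn(10), 'b': np.random.randn(10), 'c': np.random.randn(10)})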
df
Out[59]:
          a         b         c
0 -0.196883 -0.749798  0.321718
1 -0.472434  1.465179 -0.264934
2  0.131524  1.071453  1.575231
3 -2.940246 -1.532570 -0.635035
4  1.037159 -0.466863  0.535814
5 -1.924729  1.421148 -0.193244
6 -0.443746 -0.019479  1.192575
7 -0.963762  0.575936  1.699024
8 -0.244891 -0.348923 -0.198269
9  0.190444 -1.505409 -1.166708

[10 rows x 3 columns]
In [62]:

df[df.sum(axis=1) > 1]

Out[62]:
          a         b         c
2  0.131524  1.071453  1.575231
4  1.037159 -0.466863  0.535814
7 -0.963762  0.575936  1.699024

[3 rows x 3 columns]

In the sample above I used a threshold of 1, but you can simply substitute 0, so in your case it becomes df[df.sum(axis=1) > 0].
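Because the session above uses random data, here is a small deterministic sketch of the same idea with a threshold of 0. The values in this frame are made up purely so the row sums are easy to check by hand, and the names row_sums and positive are just illustrative.

```python
import pandas as pd

# Hypothetical data, chosen so each row sum is easy to verify by hand
df = pd.DataFrame({
    "a": [1, -2, 0, 3],
    "b": [2, -1, 0, -5],
    "c": [-1, 4, 0, 1],
})

# axis=1 adds across the columns, giving one total per row: 2, 1, 0, -1
row_sums = df.sum(axis=1)

# The boolean Series keeps only the rows whose total is strictly greater than 0
positive = df[row_sums > 0]
print(positive)
```

Only rows 0 and 1 are kept here; row 2 sums to exactly 0 and is dropped because the comparison is strict. Note that sum skips NaN values by default, so a row with missing entries is judged on the values it does have.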

EdChum answered Jan 18 '23 23:01