get dataframe row count based on conditions

Tags: python, pandas

I want to get the count of dataframe rows based on conditional selection. I tried the following code.

print(df[(df.IP == head.idxmax()) & (df.Method == 'HEAD') & (df.Referrer == '"-"')].count())

output:

IP          57
Time        57
Method      57
Resource    57
Status      57
Bytes       57
Referrer    57
Agent       57
dtype: int64

The output shows the count for each and every column in the dataframe. Instead, I need a single count of the rows where all of the above conditions are satisfied. How can I do this? If you need more explanation about my dataframe, please let me know.

Nilani Algiriyage asked Jun 26 '13 13:06



1 Answer

You are asking for the rows where all of the conditions are true, so the len of the filtered frame is the answer, unless I misunderstand what you are asking.

In [17]: df = DataFrame(randn(20,4),columns=list('ABCD'))

In [18]: df[(df['A']>0) & (df['B']>0) & (df['C']>0)]
Out[18]:
           A         B         C         D
12  0.491683  0.137766  0.859753 -1.041487
13  0.376200  0.575667  1.534179  1.247358
14  0.428739  1.539973  1.057848 -1.254489

In [19]: df[(df['A']>0) & (df['B']>0) & (df['C']>0)].count()
Out[19]:
A    3
B    3
C    3
D    3
dtype: int64

In [20]: len(df[(df['A']>0) & (df['B']>0) & (df['C']>0)])
Out[20]: 3
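For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same idea. It is not part of the original session: the imports, the fixed seed, and the names `rng`, `mask`, `n_len`, `n_shape` and `n_sum` are assumptions made for illustration. It also shows two equivalent ways of getting the single number, `shape[0]` and summing the boolean mask.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed, used only so the example is reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((20, 4)), columns=list("ABCD"))

# Boolean mask: True for the rows where all three conditions hold.
mask = (df["A"] > 0) & (df["B"] > 0) & (df["C"] > 0)

n_len = len(df[mask])        # number of rows in the filtered frame
n_shape = df[mask].shape[0]  # same value, read from the frame's shape
n_sum = int(mask.sum())      # counts the True values without building the filtered frame

print(n_len, n_shape, n_sum)
```

All three give the same single row count. By contrast, `count()` reports the number of non-null values in each column, which is why it returns one number per column.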
Jeff answered Sep 18 '22 09:09