Pandas calculate number of values between each range

Tags: python, pandas

I want to find counts of my data between certain custom ranges.

Say I have some data:

import random
import pandas as pd

my_randoms = random.sample(range(100), 10)  # range replaces Python 2's xrange
test = pd.DataFrame(my_randoms, columns=["x"])

How can I produce a data frame that shows the number of values between different ranges? For example, say I want to see how many values occur between 0-19, 20-39, 40-59, 60-79, 80-100. The output dataframe will have one column with those ranges, another with the counts.

I can think of some ugly approaches that use .apply to build a new column saying which range each value falls in (followed by a groupby), but I suspect pandas has a cleaner way lurking about.

AZhao asked Jan 27 '16 20:01


2 Answers

Per Jarad's link to that other question:

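# right=False gives bins [0, 20), [20, 40), ..., [80, 101), so both 0 and 100 are counted
test.groupby(pd.cut(test['x'], [0, 20, 40, 60, 80, 101], right=False)).count()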
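
To get the two-column frame the question asks for (ranges in one column, counts in another), something along these lines may work. This is only a sketch: the sample values, the bin edges, and the label strings are assumptions chosen for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's random column "x".
test = pd.DataFrame({"x": [0, 5, 19, 20, 39, 47, 60, 79, 80, 100]})

# Assumed bin edges and labels, picked to match the ranges named in the question.
edges = [0, 20, 40, 60, 80, 101]
labels = ["0-19", "20-39", "40-59", "60-79", "80-100"]

# right=False makes each bin closed on the left: [0, 20), [20, 40), ..., [80, 101).
binned = pd.cut(test["x"], bins=edges, right=False, labels=labels)

# Count values per bin, restore bin order, and turn the result into two columns.
counts = (
    binned.value_counts()
    .sort_index()
    .rename_axis("range")
    .reset_index(name="count")
)
print(counts)
```

Because pd.cut returns a categorical, ranges with no values should still appear in the result with a count of 0.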
AZhao answered Nov 09 '22 21:11


There's probably a better way, and I'm new to pandas myself, but how about this for the moment:

# DataFrame.query expects a string expression, so use a boolean mask; .sum() counts the values in 0-19
test.x.isin(range(20)).sum()
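
The same idea can be repeated for every range to build the full table. The sketch below swaps isin for Series.between, which is inclusive on both ends by default and also handles non-integer values. The sample data and the list of (low, high) pairs are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the question generates its own random values.
test = pd.DataFrame({"x": [3, 18, 25, 41, 42, 77, 99]})

# Assumed (low, high) pairs, both ends inclusive, matching the question's ranges.
ranges = [(0, 19), (20, 39), (40, 59), (60, 79), (80, 100)]

rows = []
for low, high in ranges:
    # The boolean mask is True where low <= x <= high; summing it counts the matches.
    n = int(test["x"].between(low, high).sum())
    rows.append({"range": f"{low}-{high}", "count": n})

summary = pd.DataFrame(rows)
print(summary)
```

This makes one pass over the column per range, so it is likely slower than a pd.cut-based approach on large data, but it is easy to read and to adapt to irregular ranges.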
Gregory Kuhn answered Nov 09 '22 20:11