
pandas binning a list based on qcut of another list

Say I have a list:

a = [3, 5, 1, 1, 3, 2, 4, 1, 6, 4, 8]

and a sublist of a:

b = [5, 2, 6, 8]

I'd like to obtain the bins from pd.qcut(a, 2) and then count the number of values of list b that fall into each bin. That is:

In[84]: pd.qcut(a,2)
Out[84]: 
Categorical: 
[[1, 3], (3, 8], [1, 3], [1, 3], [1, 3], [1, 3], (3, 8], [1, 3], (3, 8], (3, 8], (3, 8]]
Levels (2): Index(['[1, 3]', '(3, 8]'], dtype=object)

Now I know the bins are [1, 3] and (3, 8], and I'd like to know how many values of list "b" fall into each bin. I can do this by hand when the number of bins is small, but what's the best approach when the number of bins is large?

user2921752 asked Jan 02 '14 22:01



1 Answer

You can use the retbins parameter to get the bin edges back from qcut:

>>> import pandas as pd
>>> q, bins = pd.qcut(a, 2, retbins=True)

Then use pd.cut to get the bin index of each value of b with respect to those bins:

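>>> import numpy as np
>>> b = np.array(b)
>>> # .codes (formerly .labels) holds the bin index of each value; copy it so it can be modified
>>> hist = pd.cut(b, bins, right=True).codes.copy()
>>> hist[b == bins[0]] = 0  # values equal to the lowest edge come back as -1
>>> hist
array([1, 0, 1, 1], dtype=int8)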

Note that the corner case b == bins[0] has to be handled separately: with right=True the bins are open on the left, so pd.cut does not place bins[0] in the leftmost bin and returns -1 for it.
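To get from the per-value bin indices to the per-bin counts the question asks for, something along these lines might work. It is only a sketch: it swaps the manual bins[0] fix for pd.cut's include_lowest=True option, and it assumes every value of b lies within the range of a (anything outside gets code -1 and is skipped here). The names binned_b, codes and counts are just illustrative.

```python
import numpy as np
import pandas as pd

a = [3, 5, 1, 1, 3, 2, 4, 1, 6, 4, 8]
b = [5, 2, 6, 8]

# Bin edges taken from the quantiles of a
_, bins = pd.qcut(a, 2, retbins=True)

# include_lowest=True closes the leftmost bin on the left,
# so values equal to bins[0] are not left out
binned_b = pd.cut(b, bins, include_lowest=True)

# codes[i] is the bin index of b[i]; -1 marks values outside all bins
codes = binned_b.codes

# Count how many values of b fall into each bin (out-of-range values skipped)
counts = np.bincount(codes[codes >= 0], minlength=len(bins) - 1)
print(counts)
```

Here counts[i] is the number of values of b in the i-th bin, in the same order as the bin edges, so this scales to any number of bins.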

alko answered Sep 24 '22 00:09
