Say I have a list:
a = [3, 5, 1, 1, 3, 2, 4, 1, 6, 4, 8]
and a sublist of a:
b = [5, 2, 6, 8]
I'd like to obtain bins with pd.qcut(a, 2) and then count how many values of list b fall into each bin. That is:
In[84]: pd.qcut(a,2)
Out[84]:
Categorical:
[[1, 3], (3, 8], [1, 3], [1, 3], [1, 3], [1, 3], (3, 8], [1, 3], (3, 8], (3, 8], (3, 8]]
Levels (2): Index(['[1, 3]', '(3, 8]'], dtype=object)
Now I know the bins are [1, 3] and (3, 8], and I'd like to know how many values of list b fall into each bin. I can do this by hand when the number of bins is small, but what's the best approach when the number of bins is large?
You can use the retbins parameter to get the bin edges back from qcut:
>>> q, bins = pd.qcut(a, 2, retbins=True)
Then use pd.cut to get the indices of b with respect to bins:
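>>> b = np.array(b)
>>> # .codes replaces the old .labels attribute; copy it because codes is read-only
>>> hist = pd.cut(b, bins, right=True).codes.copy()
>>> hist[b == bins[0]] = 0
>>> hist
array([1, 0, 1, 1], dtype=int8)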
Note that you have to treat the corner case b == bins[0] separately, because by default cut does not include the lowest edge in the leftmost bin.
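To go from per-value bin indices to the per-bin counts the question asks for, something like the following should work. It is only a sketch, assuming a reasonably recent pandas: it uses include_lowest=True on pd.cut as an alternative to patching bins[0] by hand, and np.bincount to tally the indices. The names binned, codes and counts are just illustrative.

```python
import numpy as np
import pandas as pd

a = [3, 5, 1, 1, 3, 2, 4, 1, 6, 4, 8]
b = [5, 2, 6, 8]

# Bin edges come from the quantiles of the full list a.
_, bins = pd.qcut(a, 2, retbins=True)

# include_lowest=True closes the first interval on the left, so a value
# equal to bins[0] lands in bin 0 instead of being left unassigned.
binned = pd.cut(b, bins, right=True, include_lowest=True)

# codes holds each value's bin index; -1 marks values outside every bin.
codes = binned.codes

# Tally the indices; minlength keeps empty bins in the result as zeros.
counts = np.bincount(codes[codes >= 0], minlength=len(bins) - 1)
print(counts)
```

For the lists above this gives one value of b in the first bin and three in the second, and the same code scales to any number of bins by changing the second argument of qcut.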