
Hide Labels from pandas.cut customised Interval Index

Tags:

pandas

I applied pandas.cut to a Series. When I don't use a customised IntervalIndex, labels=False works as expected (it returns only the integer indicators of the bins). However, when I pass a customised IntervalIndex, it returns the interval of each bin even though I set labels=False. I guess this is because I passed an IntervalIndex instead of a number of bins.

Is there any way to use a customised IntervalIndex and still get back only the integer indicators of the bins?

import pandas as pd

bins = pd.interval_range(start=0, end=10, periods=5, closed='left')
pd.cut([1, 3, 5, 7, 9], bins, labels=False)
Osca asked Dec 27 '25 16:12

1 Answer

Use the codes attribute:

pd.cut([1, 3, 5, 7, 9], bins, labels=False).codes

array([0, 1, 2, 3, 4], dtype=int8)

For list-like input, pd.cut returns a Categorical object, and what gets displayed is the category (here, the interval) of each element. When bins is an IntervalIndex, the labels argument is ignored, which is why labels=False has no effect. A Categorical has two attributes, codes and categories. categories is what you'd think: an index of the unique categories in their proper order. codes is an integer array giving, for each element of the Categorical, the position of its category within categories.
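To make the two attributes concrete, here is a small sketch that looks at them side by side. The extra value 12 is my own addition (it is not in the question); it is there only to show that a value falling outside every bin gets the code -1, which is how a Categorical marks a missing entry.

```python
import pandas as pd

# Same bins as in the question: [0, 2), [2, 4), [4, 6), [6, 8), [8, 10)
bins = pd.interval_range(start=0, end=10, periods=5, closed='left')

# 12 is an illustrative extra value that lies outside all of the bins
result = pd.cut([1, 3, 5, 7, 9, 12], bins)

categories = result.categories  # the IntervalIndex of bins, in order
codes = result.codes            # integer position of each element's bin

print(categories)
print(codes)
```

If you need to tell "not in any bin" apart from a real bin number, check for codes == -1 before using the values.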

You can reproduce the per-element interval values by indexing the categories with the codes array, like so:

mycut = pd.cut([1, 3, 5, 7, 9], bins, labels=False)

mycut.categories[mycut.codes]

IntervalIndex([[0, 2), [2, 4), [4, 6), [6, 8), [8, 10)],
              closed='left',
              dtype='interval[int64]')

However, the codes are exactly the integer bin indicators you were looking for, so use them directly.
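Since the question mentions applying pandas.cut to a Series, note that the result is then a Series of category dtype rather than a bare Categorical, so the codes sit behind the .cat accessor. A minimal sketch, assuming a made-up Series s with a string index (both are mine, for illustration only):

```python
import pandas as pd

bins = pd.interval_range(start=0, end=10, periods=5, closed='left')

# Hypothetical input Series; the index labels are illustrative
s = pd.Series([1, 3, 5, 7, 9], index=list("abcde"))

binned = pd.cut(s, bins)        # Series with category dtype
codes = binned.cat.codes        # integer bin indicators, original index kept

# Alternative that skips the Categorical: ask the IntervalIndex directly
# which bin each value falls into (-1 means no bin matched)
positions = bins.get_indexer(s.to_numpy())

print(codes)
print(positions)
```

The .cat.codes route keeps the original index, which is handy if you want to assign the result back as a column; get_indexer returns a plain NumPy array.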

piRSquared answered Dec 31 '25 19:12
