Pandas cut method excludes lower bound

Tags: python, pandas

I am trying to bin a dataframe column that contains ages in the range 0 to 100. When I define bins that start at 0, the zero ages are not assigned to any bin.

Here is a demo using a Series that covers the range of my data:

pd.cut(pd.Series(range(101)), [0, 24, 49, 74, 100])

The zero value in the range returns NaN from the cut.

Any way around this?

John asked Mar 03 '16 05:03


1 Answer

IIUC, you need to set the include_lowest argument to True. From the docs:

include_lowest : bool
Whether the first interval should be left-inclusive or not.
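
As a quick check of why this is needed, here is a minimal sketch (the variable names `ages`, `bins`, and `default_cut` are mine, not from the question). By default `pd.cut` uses `right=True`, so the first bin is `(0, 24]` and the value 0 falls outside every bin:

```python
import pandas as pd

ages = pd.Series(range(101))   # same data as in the question: 0..100
bins = [0, 24, 49, 74, 100]

# Default behaviour: right=True, include_lowest=False,
# so the intervals are (0, 24], (24, 49], (49, 74], (74, 100].
default_cut = pd.cut(ages, bins)

# Only the age 0 is left without a bin.
print(default_cut.isna().sum())      # 1
print(ages[default_cut.isna()].tolist())  # [0]
```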

For your case:

pd.cut(pd.Series(range(101)), [0,24,49,74,100], include_lowest=True)

In [148]: pd.cut(pd.Series(range(101)), [0,24,49,74,100], include_lowest=True).head(10)
Out[148]:
0    [0, 24]
1    [0, 24]
2    [0, 24]
3    [0, 24]
4    [0, 24]
5    [0, 24]
6    [0, 24]
7    [0, 24]
8    [0, 24]
9    [0, 24]
dtype: category
Categories (4, object): [[0, 24] < (24, 49] < (49, 74] < (74, 100]]
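
If you would rather not rely on include_lowest, another option, assuming the ages are whole numbers, is to make the bins left-closed with `right=False` and shift the edges up by one. This is only a sketch; the edges `[0, 25, 50, 75, 101]` and the label strings are my own choices for illustration, not something from the question:

```python
import pandas as pd

ages = pd.Series(range(101))

# Left-closed bins: [0, 25), [25, 50), [50, 75), [75, 101).
# The last edge is 101 so that the age 100 is still covered.
left_edges = [0, 25, 50, 75, 101]
labels = ["0-24", "25-49", "50-74", "75-100"]  # hypothetical label names

by_left_edge = pd.cut(ages, left_edges, right=False, labels=labels)

print(by_left_edge.isna().sum())  # 0
print(by_left_edge.iloc[0])       # 0-24
print(by_left_edge.iloc[100])     # 75-100
```

For integer ages this gives the same four groups as the include_lowest=True call. Recent pandas versions may also print the first interval of that call differently (for example as `(-0.001, 24.0]`), but 0 is still assigned to it.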
Anton Protopopov answered Sep 21 '22 15:09