Pandas how to use pd.cut()

Tags:

python

pandas

Here is the snippet:

test = pd.DataFrame({'days': [0,31,45]})
test['range'] = pd.cut(test.days, [0,30,60])

Output:

    days    range
0   0       NaN
1   31      (30, 60]
2   45      (30, 60]

I am surprised that 0 is not in (0, 30]. What should I do to categorize 0 as (0, 30]?

430

asked Aug 18 '17 07:08

Cheng

4 Answers

test['range'] = pd.cut(test.days, [0,30,60], include_lowest=True)
print (test)
   days           range
0     0  (-0.001, 30.0]
1    31    (30.0, 60.0]
2    45    (30.0, 60.0]

See difference:

test = pd.DataFrame({'days': [0,20,30,31,45,60]})

test['range1'] = pd.cut(test.days, [0,30,60], include_lowest=True)
#30 value is in [30, 60) group
test['range2'] = pd.cut(test.days, [0,30,60], right=False)
#30 value is in (0, 30] group
test['range3'] = pd.cut(test.days, [0,30,60])
print (test)
   days          range1    range2    range3
0     0  (-0.001, 30.0]   [0, 30)       NaN
1    20  (-0.001, 30.0]   [0, 30)   (0, 30]
2    30  (-0.001, 30.0]  [30, 60)   (0, 30]
3    31    (30.0, 60.0]  [30, 60)  (30, 60]
4    45    (30.0, 60.0]  [30, 60)  (30, 60]
5    60    (30.0, 60.0]       NaN  (30, 60]
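The NaN for 0 follows from how the default bins are closed. As a minimal sketch (the variable names here are illustrative and not part of the answer), pd.Interval can be used to check membership at the edges directly:

```python
import pandas as pd

# Default pd.cut bins are open on the left and closed on the right.
default_bin = pd.Interval(0, 30, closed="right")     # (0, 30]
left_closed_bin = pd.Interval(0, 30, closed="left")  # [0, 30)

print(0 in default_bin)       # False: the left edge is excluded
print(30 in default_bin)      # True: the right edge is included
print(0 in left_closed_bin)   # True
print(30 in left_closed_bin)  # False

# include_lowest=True keeps right-closed bins but lets the first bin
# include its left edge, so no value is left unbinned here.
days = pd.Series([0, 31, 45])
binned = pd.cut(days, [0, 30, 60], include_lowest=True)
print(binned.isna().sum())    # 0
```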

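Or use numpy.searchsorted, which returns integer bin positions rather than intervals. The array of bin edges (arr below) has to be sorted; the values of days do not: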

arr = np.array([0,30,60])
test['range1'] = arr.searchsorted(test.days)
test['range2'] = arr.searchsorted(test.days, side='right') - 1
print (test)
   days  range1  range2
0     0       0       0
1    20       1       0
2    30       1       1
3    31       2       1
4    45       2       1
5    60       2       2
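A small sketch of the same idea with the looked-up values in arbitrary order. It assumes only that the array being searched (named edges here for illustration) is sorted ascending, which is what searchsorted requires, and compares the result with the category codes from pd.cut with right=False:

```python
import numpy as np
import pandas as pd

edges = np.array([0, 30, 60])      # must be sorted in ascending order
days = pd.Series([45, 0, 31, 20])  # deliberately unsorted

# side='right' minus 1 gives the index of the left-closed bin
# [edges[i], edges[i + 1]) that each value falls into.
idx = edges.searchsorted(days, side="right") - 1
print(idx.tolist())  # [1, 0, 1, 0]

# The same positions, obtained from the categorical codes of pd.cut.
codes = pd.cut(days, edges, right=False).cat.codes
print(codes.tolist())  # [1, 0, 1, 0]
```

Note that a value equal to the last edge (60 here) gets index 2 from searchsorted, while pd.cut with right=False leaves it as NaN.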

121

answered Oct 03 '22 10:10

jezrael

See the pd.cut documentation.
Include the parameter right=False so that each bin is closed on the left and open on the right:

test = pd.DataFrame({'days': [0,31,45]})
test['range'] = pd.cut(test.days, [0,30,60], right=False)

test
   days     range
0     0   [0, 30)
1    31  [30, 60)
2    45  [30, 60)

answered Oct 03 '22 11:10

piRSquared

You can pass labels to pd.cut() as well. The following example takes student grades in the range 0 to 10 and adds a new column called 'grade_cat' that categorizes them.

bins holds the interval edges: (0, 4] is one interval, (4, 6] is the next, and so on. The corresponding labels are "poor", "normal", and "excellent". As in the question, a grade of exactly 0 falls outside the first interval unless include_lowest=True is passed.

bins = [0, 4, 6, 10]
labels = ["poor","normal","excellent"]
student['grade_cat'] = pd.cut(student['grade'], bins=bins, labels=labels)
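The student frame is not shown in the answer, so the following is a runnable sketch with made-up grades (the data and the 'grade_cat_incl' column name are assumptions for illustration). It shows the labelled result with and without include_lowest=True:

```python
import pandas as pd

# Hypothetical data; the original 'student' frame is not given.
student = pd.DataFrame({"grade": [0, 3, 4, 5, 6, 9, 10]})

bins = [0, 4, 6, 10]
labels = ["poor", "normal", "excellent"]

# Default: right-closed bins, so a grade of 0 is left unlabelled (NaN).
student["grade_cat"] = pd.cut(student["grade"], bins=bins, labels=labels)

# include_lowest=True pulls the left-most edge into the first bin.
student["grade_cat_incl"] = pd.cut(
    student["grade"], bins=bins, labels=labels, include_lowest=True
)
print(student)
```

With labels supplied, the first bin is shown as "poor" rather than as an adjusted interval such as (-0.001, 4.0].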

answered Oct 03 '22 11:10

Mino De Raj

A sample of how pd.cut works when given a number of equal-width bins instead of explicit edges:

s = pd.Series([168,180,174,190,170,185,179,181,175,169,182,177,180,171])
pd.cut(s, 3)
# To add labels to bins
pd.cut(s, 3, labels=["Small","Medium","Large"])

This can be used directly on a range
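When only a bin count is given, the edges are computed from the data. A short sketch, assuming the same series as above, that uses the retbins=True parameter of pd.cut to return those computed edges alongside the labelled result:

```python
import pandas as pd

s = pd.Series([168, 180, 174, 190, 170, 185, 179, 181, 175, 169, 182, 177, 180, 171])

# retbins=True additionally returns the array of computed bin edges.
binned, edges = pd.cut(s, 3, labels=["Small", "Medium", "Large"], retbins=True)

print(edges)                  # 4 edges delimit the 3 equal-width bins
print(binned.value_counts())  # how many values landed in each bin
```

Because the lowest edge is placed slightly below the minimum of the data, the smallest value is not dropped in this mode.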

answered Oct 03 '22 11:10

nashtgc
