 

Change numerical Data to Categorical Data - Pandas [duplicate]

I have a pandas DataFrame with a numerical column "amount" whose values range from 0 to 20000. I want to convert it into a categorical variable in which each category is a range. The categories would be:

  1. Between 0-1000$
  2. Between 1000-2000$, and so on up to 19000-20000$

I can't figure out how to convert the column. I can turn it into binary values like this:

months["value"] = np.where(months['amount']>=450, 'yes', 'no') 

But how do I do this for a categorical variable with more than two values?

Dreams asked Oct 10 '17 05:10




1 Answer

You can use pd.cut with bin edges every 1000:

import pandas as pd

df = pd.DataFrame({'B':[4000,5000,4000,9000,5,11040]})

# Bin edges 0, 1000, ..., 20000; each bin is right-closed, e.g. (3000, 4000]
df['D'] = pd.cut(df['B'], range(0, 21000, 1000))
print(df)
       B               D
0   4000    (3000, 4000]
1   5000    (4000, 5000]
2   4000    (3000, 4000]
3   9000    (8000, 9000]
4      5       (0, 1000]
5  11040  (11000, 12000]
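
If you want readable labels like the ranges in the question instead of interval objects, pd.cut also accepts a labels argument. Here is a minimal sketch, assuming the column is called amount as in the question; the sample values, the label format, and the amount_range column name are made up for illustration. Note that with the default right-closed bins a value of exactly 0 would fall outside the first bin and become NaN, so the sketch passes include_lowest=True.

```python
import pandas as pd

# Hypothetical sample data covering both ends of the 0-20000 range.
df = pd.DataFrame({'amount': [0, 5, 1000, 1001, 4000, 11040, 20000]})

# 21 bin edges (0, 1000, ..., 20000) define 20 bins.
edges = list(range(0, 21000, 1000))

# One label per bin, e.g. "0-1000$"; the format is only an illustration.
labels = [f"{lo}-{hi}$" for lo, hi in zip(edges[:-1], edges[1:])]

# Bins are right-closed by default, so 1000 belongs to "0-1000$".
# include_lowest=True also puts the lowest edge (0) in the first bin.
df['amount_range'] = pd.cut(
    df['amount'], bins=edges, labels=labels, include_lowest=True
)
print(df)
```

The new column has the category dtype, and the categories are ordered from the lowest range to the highest, so sorting and comparisons follow the numeric order of the bins rather than alphabetical order of the labels.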
jezrael answered Nov 14 '22 23:11
