
Create category based on range in R



I would like to add a column to my data frame that contains categorical data based on the numbers in another column. I found a similar question, "Create categorical variable in R based on range", but the answers there didn't give me what I need. Basically, I need a result like this:

x   group
3   0-5
4   0-5
6   6-10
12  > 10

The solutions suggested using cut() and shingle(), and while those are useful for dividing the data based on ranges, they do not create the new categorical column that I need.

I have also tried using something like (please don't laugh)

data$group <- "0-5"==data[data$x>0 & data$x<5, ]

but that of course didn't work. Does anyone know how I might do this correctly?

Thomas asked Jan 10 '14 16:01


1 Answer

Why didn't cut work? Did you not assign to a new column or something?

> data=data.frame(x=c(3,4,6,12))
> data$group = cut(data$x,c(0,5,10,15))
> data
   x   group
1  3   (0,5]
2  4   (0,5]
3  6  (5,10]
4 12 (10,15]

What you've created there is a factor object in a column of your data frame. The text displayed is the levels of the factor, and you can change them by assignment:

> levels(data$group) = c("0-5","6-10",">10")
> data
   x group
1  3   0-5
2  4   0-5
3  6  6-10
4 12   >10

Read some basic R docs on factors and you'll get it.
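
As a side note, cut() also accepts a labels argument, so the relabeling can happen in the same call instead of as a separate levels() assignment. Here is a minimal sketch, assuming the values are positive and that an open-ended top bin is wanted (the Inf break is an assumption for illustration; the example above used 15 as the upper limit):

```r
# Same toy data as in the example above
data <- data.frame(x = c(3, 4, 6, 12))

# breaks define the intervals (0,5], (5,10], (10,Inf];
# labels name those intervals in the same order
data$group <- cut(data$x,
                  breaks = c(0, 5, 10, Inf),
                  labels = c("0-5", "6-10", ">10"))

is.factor(data$group)   # the new column is a factor
levels(data$group)      # its levels are the labels supplied above
data
```

The intervals are closed on the right by default, so a value of exactly 0 would come out as NA unless include.lowest = TRUE is also passed.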

Spacedman answered Sep 24 '22 03:09
