coding variable values into classes using R

I have a data set in which I need to code the values of certain numeric variables into 3 classes.

My data set is similar to this but has 60 more variables:

anim <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
wt <- c(181,179,180.5,201,201.5,245,246.4,189.3,301,354,369,205,199,394,231.3)
data <- data.frame(anim,wt)

> data
   anim    wt
1     1 181.0
2     2 179.0
3     3 180.5
4     4 201.0
5     5 201.5
6     6 245.0
7     7 246.4
8     8 189.3
9     9 301.0
10   10 354.0
11   11 369.0
12   12 205.0
13   13 199.0
14   14 394.0
15   15 231.3

I need to code the values of the variable "wt" into 3 classes: (wt >= 179 & wt < 200) = 1; (wt >= 200 & wt < 300) = 2; (wt >= 300) = 3

which should give me this:

> data2
   anim    wt SWT
1     1 181.0   1
2     2 179.0   1
3     3 180.5   1
4     4 201.0   2
5     5 201.5   2
6     6 245.0   2
7     7 246.4   2
8     8 189.3   1
9     9 301.0   3
10   10 354.0   3
11   11 369.0   3
12   12 205.0   2
13   13 199.0   1
14   14 394.0   3
15   15 231.3   2
baz asked May 17 '11 00:05


2 Answers

The cut() approach outlined by @Greg is probably what you want here. Note that cut() returns a factor by default; supply labels = FALSE to get the integer codes instead. You also need right = FALSE so that each interval includes its lower bound and excludes its upper bound, as your class definitions require:

cut(data$wt, c(178, 200, 300, Inf), labels = FALSE, right = FALSE)
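As a minimal sketch of why the right argument matters, the snippet below runs cut() on a few boundary values chosen for illustration (they are not from the original data set) and compares the default right = TRUE with right = FALSE:

```r
# Illustrative boundary values around the class limits 200 and 300.
wt <- c(179, 199.9, 200, 299.9, 300, 301)
breaks <- c(178, 200, 300, Inf)

# Default (right = TRUE): intervals are (178,200], (200,300], (300,Inf]
right_closed <- cut(wt, breaks, labels = FALSE)

# right = FALSE: intervals are [178,200), [200,300), [300,Inf)
left_closed <- cut(wt, breaks, labels = FALSE, right = FALSE)

# Side-by-side comparison; the two columns differ at wt = 200 and wt = 300.
data.frame(wt, right_closed, left_closed)
```

With the default, 200 lands in class 1 and 300 in class 2; with right = FALSE they land in classes 2 and 3, which matches conditions of the form wt >= 200 & wt < 300.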

Alternatively, if your classes do not lend themselves to a simple set of breaks, you can use ifelse(). ifelse() calls can be nested, much like IF in Excel. I use with() to cut down on the typing needed:

data$group2 <- with(data, ifelse(wt >= 179 & wt < 200, 1,
                          ifelse(wt >= 200 & wt < 300, 2, 3)))
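As a rough check, the sketch below wraps the nested ifelse() and the cut() call in two helper functions (code_ifelse and code_cut are names invented here for illustration) and compares them on the sample weights from the question:

```r
# Sample weights from the question.
wt <- c(181, 179, 180.5, 201, 201.5, 245, 246.4,
        189.3, 301, 354, 369, 205, 199, 394, 231.3)

# Hypothetical helper: nested ifelse(), anything outside the first two ranges becomes 3.
code_ifelse <- function(x) {
  ifelse(x >= 179 & x < 200, 1,
         ifelse(x >= 200 & x < 300, 2, 3))
}

# Hypothetical helper: cut() with left-closed intervals and integer codes.
code_cut <- function(x) {
  cut(x, c(178, 200, 300, Inf), labels = FALSE, right = FALSE)
}

# TRUE when both approaches assign the same class to every sample weight.
all(code_ifelse(wt) == code_cut(wt))

# The two differ outside the stated ranges: a weight of 170 becomes
# class 3 with the nested ifelse() but NA with cut().
code_ifelse(170)
code_cut(170)
```

The two approaches agree on the sample data, but the final "else" branch of the nested ifelse() also catches values below 179, so it may be worth checking the range of each variable before relying on it.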
Chase answered Sep 28 '22 06:09


You can try cut():

anim <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
wt <- c(181,179,180.5,201,201.5,245,246.4,
        189.3,301,354,369,205,199,394,231.3)
data <- data.frame(anim, wt)

EDIT: fixed the grouping by adding right = FALSE, and removed the split example.

group <- cut(data$wt, c(178, 200, 300, Inf), right = FALSE)

data$swt <- as.numeric(group)
data
   anim    wt swt
1     1 181.0   1
2     2 179.0   1
3     3 180.5   1
4     4 201.0   2
5     5 201.5   2
6     6 245.0   2
7     7 246.4   2
8     8 189.3   1
9     9 301.0   3
10   10 354.0   3
11   11 369.0   3
12   12 205.0   2
13   13 199.0   1
14   14 394.0   3
15   15 231.3   2
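Since the question mentions roughly 60 more variables, one possible way to extend this is to apply cut() to several columns with lapply(). The sketch below is only an illustration: the column wt2 and its values are invented, and it assumes every variable shares the same breaks, which may not hold for the real data.

```r
# Hypothetical data: wt2 is an invented column standing in for the other variables.
data <- data.frame(
  anim = 1:5,
  wt   = c(181, 201, 301, 199, 300),
  wt2  = c(250, 179, 394, 205, 189.3)
)

# Columns to code, and the breaks assumed to apply to all of them.
vars <- c("wt", "wt2")
breaks <- c(178, 200, 300, Inf)

# Apply cut() to each selected column; extra arguments are passed through to cut().
coded <- lapply(data[vars], cut, breaks = breaks, labels = FALSE, right = FALSE)

# Store the codes in new columns named SWT and SWT2.
data[paste0("S", toupper(vars))] <- coded
data
```

If the variables need different breaks, a named list of break vectors combined with Map() would be one way to adapt this.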
Greg answered Sep 28 '22 04:09