Suppose I have a dataset with a categorical variable X that takes the values A, B, or C.
I want to create a new variable Y that is 1 if X = A; 2 if X = B; 3 if X = C.
Here is what I have so far, and I know that it is incorrect.
if(X==A) {
(Y = 1)
}
else if(X==B) {
(Y = 2)
}
else {
(Y = 3)
}
I keep getting the error:
Object 'Y' not found
How do I create the variable Y such that it can take on these new values based on the values of X?
To create a new variable in SAS, choose a name for it and, within a DATA step, define it from already existing variables using the equals sign (=). In the source's example, the resulting data set "w" has three variables: height, weight, and bmi.
The basic method of adding information to a SAS data set is to create a new variable in a DATA step with an assignment statement. An assignment statement has the form: variable=expression; The variable receives the new information; the expression creates the new information.
Creating a new variable, or transforming an old variable into a new one, is usually a simple task in R. The common pattern is an assignment of the form newvariable <- oldvariable (note that <- is the assignment operator, not a function). In a data frame, each new variable is added as an additional column.
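As a rough illustration of adding a column by assignment, here is a minimal sketch. The data frame name w, the columns height, weight, and bmi, and the numbers are all made up for the example; the bmi formula (weight in kg divided by height in metres squared) is the standard definition, not something taken from the thread.

```r
# Hypothetical data: two people, height in metres and weight in kilograms.
w <- data.frame(height = c(1.6, 1.8), weight = c(60, 81))

# Assigning to a column that does not exist yet creates it.
w$bmi <- w$weight / w$height^2

names(w)
# [1] "height" "weight" "bmi"
```

The assignment works on the whole column at once, so no loop or if statement is needed.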
Option 1: Take the numeric values of the factor.
X
# [1] "B" "C" "A" "C" "A" "C" "B" "B" "A" "A"
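as.integer(factor(X))
# [1] 2 3 1 3 1 3 2 2 1 1
This answer originally used c(factor(X)), relying on c() dropping the factor's attributes. That only holds in R versions before 4.1.0; from 4.1.0 onward, c() applied to a factor returns a factor. as.integer() (or as.numeric()) works in both cases and is more readable.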
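The integer codes follow the order of the factor's levels, which by default is the sorted order of the distinct values. If that default is not guaranteed to match the mapping you want, one option is to spell the levels out. A small sketch, using a made-up vector X rather than the data from this answer:

```r
# Hypothetical input; stands in for the X column of the real dataset.
X <- c("B", "C", "A", "C", "A")

# Fixing the level order makes the mapping explicit: A -> 1, B -> 2, C -> 3.
Y <- as.integer(factor(X, levels = c("A", "B", "C")))
Y
# [1] 2 3 1 3 1
```

Any value of X that is not listed in levels becomes NA, which can be a useful check on unexpected categories.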
Option 2: A lookup vector.
c(A = 1, B = 2, C = 3)[X]
# B C A C A C B B A A
# 2 3 1 3 1 3 2 2 1 1
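If X lives in a data frame, the same lookup can be assigned straight to a new column. The sketch below assumes a hypothetical data frame dat with a character column X; the names dat, lookup, and Y2 are made up for illustration. It also shows a vectorized version of the if/else chain from the question. Two likely problems with the original attempt are that if() tests a single condition rather than a whole column, and that a bare A is looked up as an object, whereas "A" is the string.

```r
# Hypothetical data frame standing in for the asker's dataset.
dat <- data.frame(X = c("B", "C", "A", "C", "A"), stringsAsFactors = FALSE)

# Named lookup vector: the names are the values of X, the elements the codes.
lookup <- c(A = 1, B = 2, C = 3)

# Index by name, then drop the names so Y is a plain numeric column.
dat$Y <- unname(lookup[dat$X])

# Vectorized equivalent of the if / else if / else chain, with quoted strings.
dat$Y2 <- ifelse(dat$X == "A", 1, ifelse(dat$X == "B", 2, 3))

dat$Y
# [1] 2 3 1 3 1
```

The lookup indexes by name, so X should be a character column here; if it is a factor, convert it with as.character() first, because indexing with a factor uses its integer codes instead.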
Data:
set.seed(25)
X <- sample(LETTERS[1:3], 10, TRUE)