I am trying to generate dummy variables (must be 1/0) using a loop based on the most frequent response of a variable. After lots of googling, I haven't managed to come up with a solution. I have extracted the most frequent responses (strings, say the top 5 are "A","B",...,"E") using
top5 <- names(head(sort(table(data$var1), decreasing = TRUE), 5))
I would like the loop to check whether another variable ("var2") equals each of these values in turn (A first), set the dummy to 1 if so and 0 otherwise, then give a summary using aggregate(). In Stata, I can refer to the loop variable i using `i', but I can't find the equivalent in R. The code that does not work is:
for(i in top5) {
data$i.dummy <- ifelse(data$var2=="i",1,0)
aggregate(data$i.dummy~data$age+data$year,data,mean)
}
Any suggestions?
If you want one column per item in your top 5, I would use sapply along the elements of top5. There is no need for ifelse, because == already returns TRUE where the comparison holds and FALSE otherwise, and those convert to 1 and 0.
Here we cbind a matrix of 5 columns, one for each element of top5, containing 1 where data$var2 equals the respective element of top5 and 0 elsewhere:
data <- cbind( data , sapply( top5 , function(x) as.integer( data$var2 == x ) ) )
If you want one column for matches of any of top5 it's even easier:
data$dummies <- as.integer( data$var2 %in% top5 )
The as.integer() in both cases is used to turn TRUE or FALSE to 1 and 0 respectively.
A cut-down example to illustrate how it works:
set.seed(123)
top2 <- c("A","B")
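# Note: R 3.6.0 changed the default sampling algorithm, so newer versions may draw different letters than the output shown here
data <- data.frame( var2 = sample(LETTERS[1:4],6,replace=TRUE) )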
# Make dummy variables, one column for each element in topX vector
data <- cbind( data , sapply( top2 , function(x) as.integer( data$var2 == x ) ) )
data
# var2 A B
#1 B 0 1
#2 D 0 0
#3 B 0 1
#4 D 0 0
#5 D 0 0
#6 A 1 0
# Make single column for all elements in topX vector
data$ANY <- as.integer( data$var2 %in% top2 )
data
# var2 A B ANY
#1 B 0 1 1
#2 D 0 0 0
#3 B 0 1 1
#4 D 0 0 0
#5 D 0 0 0
#6 A 1 0 1
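The question also asks for a grouped summary with aggregate(). A minimal sketch of how the dummy columns could feed into it, assuming grouping columns named age and year as in the question; the toy values here are invented purely for illustration:

```r
# Hypothetical toy data; 'age' and 'year' stand in for the question's grouping variables
data <- data.frame(
  var2 = c("A", "B", "A", "C", "B", "A"),
  age  = c(20, 20, 30, 30, 20, 30),
  year = c(2001, 2001, 2001, 2002, 2001, 2002)
)
top2 <- c("A", "B")

# One 0/1 column per element of top2; sapply names the columns after the elements
dummies <- sapply(top2, function(x) as.integer(data$var2 == x))
data <- cbind(data, dummies)

# Mean of every dummy column within each age/year combination
shares <- aggregate(data[top2], by = data[c("age", "year")], FUN = mean)
shares
```

Because the dummies are 0/1, each mean is the share of rows in that age/year cell whose var2 equals the column's value, and a single aggregate() call covers all dummy columns at once.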
See fortune(312), then read the help ?"[[" and possibly the help for paste0.
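For reference, a rough sketch of what the loop from the question might look like once the column name is built with paste0 and assigned with [[. The toy data and the ".dummy" suffix are assumptions made for illustration, and reformulate() is just one way to build the formula from a string:

```r
# Hypothetical toy data standing in for the question's data frame
data <- data.frame(
  var2 = c("A", "B", "A", "C"),
  age  = c(20, 20, 30, 30),
  year = c(2001, 2001, 2002, 2002)
)
top2 <- c("A", "B")

for (i in top2) {
  col <- paste0(i, ".dummy")                 # builds "A.dummy", then "B.dummy"
  data[[col]] <- as.integer(data$var2 == i)  # compare with the value of i, not the string "i"
  # Build the formula from the column name; print() is needed because
  # results are not auto-printed inside a for loop
  f <- reformulate(c("age", "year"), response = col)
  print(aggregate(f, data = data, FUN = mean))
}
```

The two changes from the original loop are that data$i.dummy (which always refers to a column literally called i.dummy) becomes data[[col]], and the quoted "i" becomes the unquoted loop variable i.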
Then possibly consider using other tools like model.matrix and sapply rather than doing everything yourself using loops.
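As a hedged illustration of the model.matrix route, here is a small sketch on made-up data. It assumes every element of top2 actually occurs in var2 (otherwise the column subset fails), and note that model.matrix drops rows with missing values by default:

```r
# Hypothetical toy data
data <- data.frame(var2 = c("B", "D", "B", "D", "D", "A"))
top2 <- c("A", "B")

# "~ var2 - 1" removes the intercept, so every level of var2 gets its own 0/1 column
mm <- model.matrix(~ var2 - 1, data = data)
colnames(mm) <- sub("^var2", "", colnames(mm))  # "var2A" -> "A"

# Keep only the columns for the values of interest
dummies <- mm[, top2, drop = FALSE]
dummies
```

Unlike the sapply approach, this first creates a column for every distinct value and then subsets, which may be wasteful if var2 has many levels.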