I have data that looks like this:
A vector of 10 character values:
Char<-c("A","B","C","D","E","F","G","H","I","J")
And a data frame that looks like this:
Col1<-1:25
Col2<-c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5)
DF<-data.frame(Col1,Col2)
What I would like to do is add a third column to the data frame, with the logic that 1=A, 2=B, 3=C and so on. So the end result would be:
Col3<-c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C","D","D","D","D","D","E","E","E","E","E")
DF<-data.frame(Col1,Col2,Col3)
For this simple example I could use a simple substitution, as in this question: Create new column based on 4 values in another column
But my actual data set is much bigger, with many more distinct values than this example, so writing out each equivalence by hand as in that answer is not practical.
So I would like some code that can be applied to a much larger data frame, perhaps something that loops through all the values of Col2 and matches each one to the corresponding position in Char:
1=Char[1], 2=Char[2], 3=Char[3], ... for the entire length of Col2.
Or any other approach that scales to a very long data frame.
# Values that Col2 might have taken
levels = c(1, 2, 3, 4, 5)
# Labels for the levels in same order as levels
labels = c('A', 'B', 'C', 'D', 'E')
DF$Col3 <- factor(DF$Col2, levels = levels, labels = labels)
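If typing out the levels and labels by hand is the sticking point, they can probably be derived from the data instead. This is only a sketch, assuming every value in Col2 is a whole number between 1 and length(Char); the names `lv` and `Col3_chr` are made up for the example.

```r
# Rebuild the example data from the question
Char <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
DF <- data.frame(Col1 = 1:25, Col2 = rep(1:5, each = 5))

# Take the levels from the data rather than writing them out
lv <- sort(unique(DF$Col2))

# Use the matching entries of Char as the labels
DF$Col3 <- factor(DF$Col2, levels = lv, labels = Char[lv])

# Optional: a plain character copy, if a factor is not wanted
DF$Col3_chr <- as.character(DF$Col3)
```

Note that Col3 is a factor here, not a character vector, which matters if you later compare or paste the values.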
I know it may be taboo to use for loops in R, but I tried this out and it worked well.
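DF$Col3 <- NA_character_  # pre-allocate the new column
for (i in seq_along(DF$Col2)) {  # length() alone would visit only the last row
  DF$Col3[i] <- Char[DF$Col2[i]]
}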
Would that be sufficient? I think you could also use unique(DF$Col2) or levels(factor(DF$Col2)).
Perhaps I'm misunderstanding your question, though.
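For what it's worth, the same position lookup can be written without a loop, because R lets you index a vector with another vector. A minimal sketch, again assuming the values in Col2 are valid positions in Char; the `codes`, `labs` and `x` vectors in the second half are hypothetical and only show what to do when the codes are not 1, 2, 3, ...

```r
Char <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
DF <- data.frame(Col1 = 1:25, Col2 = rep(1:5, each = 5))

# Index Char by the whole column at once; no loop needed
DF$Col3 <- Char[DF$Col2]

# If the codes are not consecutive positions, keep an explicit key
# and look each value up with match()
codes <- c(10, 20, 30)       # hypothetical codes
labs  <- c("A", "B", "C")    # hypothetical labels, same order as codes
x     <- c(20, 10, 30, 20)   # hypothetical column of codes
labs[match(x, codes)]        # "B" "A" "C" "B"
```

This should scale to a long data frame far better than assigning one row at a time.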