I am looking for some help with what seems like a very simple question. Any advice is greatly appreciated! I have created a data frame and want to assign names in one column (Region) based on the values in the other column (Unit).
rdf<-as.data.frame(matrix(NA, nrow= 59, ncol=2))
colnames(rdf)<-c("Unit", "Region")
rdf$Unit<-c(1:35, 37:60)
rdf$Region<-NA ## placeholder, to be filled in (see below)
I want Units 1:13 to be labeled East; Units 14:25 and 27 to be labeled Central; Units 26, 28:38, 40:43, and 45:46 to be labeled West; and Units 44, 39, and 47:60 to be labeled BC. I have been trying case_when and nested ifelse statements, but I keep getting errors saying that the longer object length is not a multiple of the shorter object length.
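The failing code is not shown, so this is only a guess: that message usually appears when a column is compared with `==` against a vector of a different length (for example `Unit == c(14:25, 27)`), which recycles the shorter vector instead of testing membership. A minimal sketch of nested `ifelse` calls using `%in%` instead, assuming the unit groups described above:

```r
# Rebuild the data frame from the question (Unit 36 is intentionally absent).
rdf <- data.frame(Unit = c(1:35, 37:60))

# %in% tests set membership element by element, so no recycling occurs.
rdf$Region <- ifelse(rdf$Unit %in% 1:13, "East",
              ifelse(rdf$Unit %in% c(14:25, 27), "Central",
              ifelse(rdf$Unit %in% c(26, 28:38, 40:43, 45:46), "West",
              ifelse(rdf$Unit %in% c(44, 39, 47:60), "BC", NA_character_))))

# Count the units per region, including any left as NA.
table(rdf$Region, useNA = "ifany")
```

The same `%in%` conditions should also work inside `dplyr::case_when`, if you prefer that style.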
Because this is a small data frame, using a translation table might be the most convenient way:
xlat <- list(East=c(1:13),
Central=c(14:25,27),
West=c(26,28:38,40:43,45:46),
BC=c(44,39,47:60))
rdf$Region <- NA
for (r in names(xlat)) rdf$Region[rdf$Unit %in% xlat[[r]]] <- r
This solution (a) clearly documents the recoding; (b) will indicate if you have overlooked any unit number (by setting Region to NA); and (c) is easy to alter and maintain.
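To make point (b) concrete, here is a small sketch of the checks one might run after the loop. The names `unmapped`, `all_units`, and `duplicated_units` are made up for illustration; the data frame is rebuilt here only so the block runs on its own.

```r
# Translation table and data frame as described in the question and answer.
xlat <- list(East    = 1:13,
             Central = c(14:25, 27),
             West    = c(26, 28:38, 40:43, 45:46),
             BC      = c(44, 39, 47:60))
rdf <- data.frame(Unit = c(1:35, 37:60), Region = NA_character_)
for (r in names(xlat)) rdf$Region[rdf$Unit %in% xlat[[r]]] <- r

# Units that no region claimed are still NA after the loop.
unmapped <- rdf$Unit[is.na(rdf$Region)]

# Units listed under more than one region (the last region in the loop would win).
all_units <- unlist(xlat, use.names = FALSE)
duplicated_units <- unique(all_units[duplicated(all_units)])

print(unmapped)
print(duplicated_units)
```

If both vectors come back empty, every unit in the data frame was assigned to exactly one region.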
For larger tables, learn about the join operation among relations. It has many implementations in R.
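As one illustration, base R's `merge` can do the join. This is a sketch rather than the only way: it assumes the translation list is first reshaped into a two-column lookup table with `stack`, and the name `lookup` is hypothetical.

```r
# Same translation list as above.
xlat <- list(East    = 1:13,
             Central = c(14:25, 27),
             West    = c(26, 28:38, 40:43, 45:46),
             BC      = c(44, 39, 47:60))

# stack() turns the named list into a data frame with columns `values` and `ind`.
lookup <- stack(xlat)
names(lookup) <- c("Unit", "Region")
lookup$Region <- as.character(lookup$Region)  # factor -> plain character

# Left join: keep every row of rdf; units missing from lookup would get NA.
rdf <- data.frame(Unit = c(1:35, 37:60))
rdf <- merge(rdf, lookup, by = "Unit", all.x = TRUE)

head(rdf)
```

Note that `merge` sorts the result by the key column by default, so the row order may differ from the input if `Unit` was not already sorted.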