Recoding variables in R seems to be my biggest headache. What functions, packages, and processes do you use to ensure the best result?
I've found very few useful examples on the Internet that give a one-size-fits-all solution to recoding and I'm interested to see what you guys and gals are using.
Note: This may be a community wiki topic.
First, we use recode() from the dplyr package (Wickham et al., 2020). Then we use the ifelse() function to recode categorical data into numeric variables. Last, we use the match() function to convert a character variable to a numeric one. The data types of the recoded variables can be converted to one another using as.
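A minimal sketch of those three approaches side by side. The `answer` vector and its values are made up for illustration, and the dplyr call only runs if dplyr happens to be installed:

```r
# Hypothetical survey answers (illustrative data, not from the original post)
answer <- c("yes", "no", "maybe", "yes")

# ifelse(): two-way recode of a character vector into numbers
is_yes <- ifelse(answer == "yes", 1, 0)

# match(): the position of each value in a lookup vector becomes its numeric code
codes <- match(answer, c("no", "maybe", "yes"))

# dplyr::recode(): old = new pairs; guarded so the sketch still runs without dplyr
if (requireNamespace("dplyr", quietly = TRUE)) {
  short <- dplyr::recode(answer, yes = "Y", no = "N", maybe = "M")
}
```

Note that match() returns NA for any value missing from the lookup vector, which is a handy way to spot unexpected categories.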
Recoding Variables in R: Recoding allows you to create new variables and to replace existing values of a variable based on a criterion. You can also replace the data for an existing variable outright, which overwrites every row without any criterion.
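A small sketch of those cases in base R, using a copy of the built-in mtcars data. The column name `efficient` and the thresholds are arbitrary choices for illustration:

```r
df <- mtcars  # work on a copy so the built-in data set stays untouched

# Create a new variable from a criterion
df$efficient <- df$mpg > 25

# Replace existing values only where a criterion holds
df$hp[df$hp > 300] <- NA

# Replace the data for every row, with no criterion
df$carb <- 0
```

Logical indexing on the left-hand side of `<-` is what restricts the replacement to matching rows.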
How do I rename factor levels in R? The simplest way to rename multiple factor levels is to use the levels() function. For example, to recode the factor levels “A”, “B”, and “C” you can use the following code: levels(your_df$Category1) <- c("Factor 1", "Factor 2", "Factor 3").
Recoding can mean a lot of things, and is fundamentally complicated.
Changing the levels of a factor can be done using the levels function:
> library(survival)  # provides the veteran data set
> #change the levels of a factor
> levels(veteran$celltype) <- c("s","sc","a","l")
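Assigning a plain character vector relies on the existing level order, so it is worth checking levels() first. As a hedged alternative, base R also accepts a named list, which renames by name and can merge levels. The `celltype` factor below is a made-up stand-in, not the veteran data:

```r
# Made-up factor for illustration
celltype <- factor(c("squamous", "smallcell", "adeno", "large", "adeno"))

levels(celltype)  # inspect the current (alphabetical) order before renaming

# Named list: names are the new levels, values are the old levels they absorb
levels(celltype) <- list(s = "squamous",
                         sc = "smallcell",
                         other = c("adeno", "large"))

as.character(celltype)
```

The order of the list entries also sets the order of the new levels.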
Transforming a continuous variable simply involves the application of a vectorized function:
> mtcars$mpg.log <- log(mtcars$mpg)
For binning continuous data, look at cut and cut2 (in the Hmisc package). For example:
> library(Hmisc)  # provides cut2
> #make 4 groups with equal sample sizes
> mtcars[['mpg.tr']] <- cut2(mtcars[['mpg']], g=4)
> #make 4 groups with equal bin width
> mtcars[['mpg.tr2']] <- cut(mtcars[['mpg']], 4, include.lowest=TRUE)
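If Hmisc is not available, a rough base R approximation of the equal-size grouping is to cut at the quartiles. This is a sketch, not a drop-in replacement for cut2: ties and labels may be handled differently, and it assumes the quantile breaks are unique.

```r
mpg <- mtcars$mpg

# Equal-width bins: cut() splits the range of mpg into 4 intervals
width_bins <- cut(mpg, breaks = 4, include.lowest = TRUE)

# Roughly equal-size bins: use the quartiles as break points
q <- quantile(mpg, probs = seq(0, 1, by = 0.25))
size_bins <- cut(mpg, breaks = q, include.lowest = TRUE)

table(width_bins)
table(size_bins)
```

include.lowest = TRUE matters in both calls, since otherwise the minimum value falls outside the first interval and becomes NA.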
For recoding continuous or factor variables into a categorical variable there is recode in the car package and recode.variables in the Deducer package:
> library(Deducer)  # provides recode.variables
> mtcars[c("mpg.tr2")] <- recode.variables(mtcars[c("mpg")], "Lo:14 -> 'low';14:24 -> 'mid';else -> 'high';")
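For comparison, here is a sketch of the same low/mid/high rule in base R with cut(), plus the car::recode spelling, which only runs if car is installed. The assumption is that values up to and including 14 count as 'low' and values above 14 up to 24 as 'mid':

```r
mpg <- mtcars$mpg

# Base R: intervals (-Inf, 14], (14, 24], (24, Inf) with explicit labels
mpg_cat <- cut(mpg,
               breaks = c(-Inf, 14, 24, Inf),
               labels = c("low", "mid", "high"))

# car::recode uses '=' inside the rule string; guarded in case car is missing
if (requireNamespace("car", quietly = TRUE)) {
  mpg_cat_car <- car::recode(mpg, "lo:14 = 'low'; 14:24 = 'mid'; else = 'high'")
}

table(mpg_cat)
```

cut() returns an ordered set of factor levels in the order given by labels, which keeps tables and plots in low/mid/high order.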
If you are looking for a GUI, Deducer implements recoding with the Transform and Recode dialogs:
http://www.deducer.org/pmwiki/pmwiki.php?n=Main.TransformVariables
http://www.deducer.org/pmwiki/pmwiki.php?n=Main.RecodeVariables
I found mapvalues from the plyr package very handy. The package also contains the function revalue, which is similar to car::recode.
The following example will "recode" the letters r, o, m, a, and n to upper case:
> library(plyr)  # provides mapvalues
> mapvalues(letters, from = c("r", "o", "m", "a", "n"), to = c("R", "O", "M", "A", "N"))
[1] "A" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "M" "N" "O" "p" "q" "R" "s" "t" "u" "v" "w" "x" "y" "z"
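The same mapping can be sketched in base R with a named lookup vector, in case plyr is not an option. The names `lookup`, `x`, and `hit` are just illustrative:

```r
# Names are the old values, elements are the replacements
lookup <- c(r = "R", o = "O", m = "M", a = "A", n = "N")

x <- letters
hit <- x %in% names(lookup)          # which elements have a replacement
x[hit] <- unname(lookup[x[hit]])     # look them up by name; others stay as-is

x
```

Values without an entry in the lookup are left untouched, which mirrors how mapvalues behaves in the example above.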