Recoding variables with R

Tags:

r

Recoding variables in R seems to be my biggest headache. What functions, packages, and processes do you use to ensure the best result?

I've found very few useful examples on the Internet that give a one-size-fits-all solution to recoding and I'm interested to see what you guys and gals are using.

Note: This may be a community wiki topic.

Brandon Bertelsen asked Mar 21 '11 01:03


People also ask

How do you recode variable names in R?

First, we use recode(), available in the dplyr package (Wickham et al., 2020). Then, we use the ifelse() function to recode categorical data to numeric variables. Last, we use the match() function to recode a character variable to a numeric one. The data type of all recoded variables can be converted to each other using as.

What is recoding in R?

Recoding allows you to create new variables and to replace existing values of a variable based on a criterion. You can also replace the data for an existing variable in every row, without any criterion.

How do I recode a level in R?

The simplest way to rename multiple factor levels is to use the levels() function. For example, to recode the factor levels “A”, “B”, and “C” you can use the following code: levels(your_df$Category1) <- c("Factor 1", "Factor 2", "Factor 3").


2 Answers

Recoding can mean a lot of things, and is fundamentally complicated.

Changing the levels of a factor can be done using the levels function:

> library(survival)  # provides the veteran data set
> # change the levels of a factor (new labels are assigned by position)
> levels(veteran$celltype) <- c("s","sc","a","l")
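Assigning a plain vector to levels() relies on knowing the current level order. As a minimal sketch of a by-name alternative in base R, a named list can be assigned instead. The factor `celltype` below is made-up toy data, not the veteran data set:

```r
# Toy factor standing in for a real column (hypothetical data)
celltype <- factor(c("squamous", "large", "adeno", "squamous", "smallcell"))

# Default levels are sorted alphabetically, which is easy to forget
levels(celltype)

# Rename by name: each new label (left) points at the old level (right).
# The order of the list also becomes the new level order.
levels(celltype) <- list(s = "squamous", sc = "smallcell",
                         a = "adeno", l = "large")

as.character(celltype)
```

Because each old level is spelled out, the mapping does not silently shift if the level order changes.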

Transforming a continuous variable simply involves the application of a vectorized function:

> mtcars$mpg.log <- log(mtcars$mpg)  

For binning continuous data, look at cut (base R) and cut2 (in the Hmisc package). For example:

> library(Hmisc)  # provides cut2
> # make 4 groups with roughly equal sample sizes
> mtcars[['mpg.tr']] <- cut2(mtcars[['mpg']], g=4)
> # make 4 groups with equal bin width
> mtcars[['mpg.tr2']] <- cut(mtcars[['mpg']], 4, include.lowest=TRUE)
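If you would rather not depend on Hmisc, equal-sized groups can be approximated in base R by cutting at quantiles. This is only a sketch: the names `breaks` and `mpg_quartile` are made up here, and it is not guaranteed to place ties the same way cut2 does.

```r
# Quartile boundaries of mpg: 5 cut points define 4 groups
breaks <- quantile(mtcars$mpg, probs = seq(0, 1, by = 0.25))

# include.lowest = TRUE keeps the minimum value inside the first group
mpg_quartile <- cut(mtcars$mpg, breaks = breaks, include.lowest = TRUE,
                    labels = c("Q1", "Q2", "Q3", "Q4"))

# Inspect how many cars fall into each group
table(mpg_quartile)
```

Note that cut() stops with an error if two quantiles coincide, so heavily tied data needs extra care.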

For recoding continuous or factor variables into a categorical variable, there is recode in the car package and recode.variables in the Deducer package. For example, with recode.variables:

> library(Deducer)  # provides recode.variables
> mtcars[c("mpg.tr2")] <- recode.variables(mtcars[c("mpg")], "Lo:14 -> 'low';14:24 -> 'mid';else -> 'high';")
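For comparison, the same three bands can be sketched in base R with cut() and explicit breaks. This does not use car or Deducer, and it assumes the recode string above puts a value of exactly 14 in 'low' and exactly 24 in 'mid'. The name `mpg_band` is made up for illustration.

```r
# Intervals are closed on the right by default:
# (-Inf, 14] -> low, (14, 24] -> mid, (24, Inf) -> high
mpg_band <- cut(mtcars$mpg,
                breaks = c(-Inf, 14, 24, Inf),
                labels = c("low", "mid", "high"))

# Cross-check the bands against the raw range of mpg in each
tapply(mtcars$mpg, mpg_band, range)
```

Spelling the breaks out this way makes the boundary handling explicit, which is often where recodes go wrong.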

If you are looking for a GUI, Deducer implements recoding with the Transform and Recode dialogs:

http://www.deducer.org/pmwiki/pmwiki.php?n=Main.TransformVariables

http://www.deducer.org/pmwiki/pmwiki.php?n=Main.RecodeVariables

Ian Fellows answered Sep 22 '22 04:09


I found mapvalues from the plyr package very handy. The package also contains the function revalue, which is similar to car::recode.

The following example "recodes" the letters r, o, m, a and n to upper case:

> library(plyr)  # provides mapvalues
> mapvalues(letters, from = c("r", "o", "m", "a", "n"), to = c("R", "O", "M", "A", "N"))
 [1] "A" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "M" "N" "O" "p" "q" "R" "s" "t" "u" "v" "w" "x" "y" "z"
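If plyr is not available, a similar value-for-value recode can be sketched in base R with a named lookup vector and match(). The names `lookup`, `x` and `idx` are made up for this illustration, and this is not how mapvalues is implemented internally.

```r
# Named lookup vector: names are the old values, elements are the new ones
lookup <- c(r = "R", o = "O", m = "M", a = "A", n = "N")

x <- letters

# Position of each element of x in the lookup table (NA if it has no entry)
idx <- match(x, names(lookup))

# Replace only the matched elements; everything else is left untouched
x[!is.na(idx)] <- lookup[idx[!is.na(idx)]]

x
```

Values without an entry in the lookup table pass through unchanged, which mirrors the behaviour shown in the mapvalues output above.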
Roman Luštrik answered Sep 23 '22 04:09