
how to recode (and reverse code) variables in columns with dplyr

Tags: r, dplyr

I am picking up R again after last using it in 2013. I am getting used to using dplyr, but I am running into a problem with a simple task. I have a table that looks like

Participant Q1       Q2      Q3     Q4       Q5
1           agree  neutral   NA    Disagree  Agree
2           neutral agree    NA     NA       NA

My goal

   Participant Q1       Q2      Q3     Q4       Q5
    1           3       2       NA      1       3
    2           2       1       NA     NA       NA

I want to be able to change the categorical values to numerical values for columns Q1:Q5, but all the examples I see of using recode in dplyr work on rows and not on columns (I might be missing something in the examples). I then want to be able to pick columns Q1 and Q2 and reverse code them.

I am trying to learn to do this in dplyr if possible.

Thanks

asked Jun 29 '16 00:06 by user12081




2 Answers

This is fairly straightforward now thanks to dplyr's recode function. Here's one way to do it:

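# Load dplyr for %>%, mutate() and recode()
library(dplyr)

# Generate a dataframe to match yours
# (stringsAsFactors = FALSE keeps the answers as character in R < 4.0)

df <- data.frame(
  participant = c(1,2),
  Q1 = c("agree", "neutral"),
  Q2 = c("neutral", "agree"),
  Q3 = c(NA,NA),
  Q4 = c("Disagree", NA),
  Q5 = c("Agree", NA),
  stringsAsFactors = FALSE
)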

# Use recode to recode the data
# (Q2 uses the reversed mapping, agree = 1, matching the goal table in the question)

df_recode <- df %>%
  mutate(Q1 = recode(Q1, "agree" = 3, "neutral" = 2),
         Q2 = recode(Q2, "neutral" = 2, "agree" = 1),
         Q4 = recode(Q4, "Disagree" = 1),
         Q5 = recode(Q5, "Agree" = 3)
  )
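The per-column calls above can probably be collapsed into one step with across(), which also gives a way to do the reverse coding the question asks about. This is only a sketch and not part of the original answer: it assumes dplyr 1.0.0 or later (for across()), a 1-3 scale where disagree = 1, neutral = 2 and agree = 3, and that differences in capitalisation ("agree" vs "Agree") should be ignored. The names df_recode and df_reversed are just illustrative.

```r
library(dplyr)

# Same example data as above, kept as character columns
df <- data.frame(
  participant = c(1, 2),
  Q1 = c("agree", "neutral"),
  Q2 = c("neutral", "agree"),
  Q3 = c(NA, NA),
  Q4 = c("Disagree", NA),
  Q5 = c("Agree", NA),
  stringsAsFactors = FALSE
)

# Assumed scale: disagree = 1, neutral = 2, agree = 3.
# as.character() handles the all-NA logical column Q3,
# tolower() folds "Agree" and "agree" into one label.
df_recode <- df %>%
  mutate(across(Q1:Q5, ~ recode(tolower(as.character(.x)),
                                "disagree" = 1, "neutral" = 2, "agree" = 3)))

# Reverse code Q1 and Q2 on a 1-3 scale: (max + min) - value, so 1 <-> 3
df_reversed <- df_recode %>%
  mutate(across(c(Q1, Q2), ~ 4 - .x))
```

For a scale with a different range, the constant 4 would be replaced by the scale's maximum plus its minimum.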

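You'll also want to read about the .default and .missing arguments in the help file (?recode). When the replacements are a different type from the original values, as here, any value you don't list becomes NA, so check that you aren't introducing NAs when you don't mean to.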
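A small illustration of those two arguments, as a sketch rather than anything from the original answer. The vector x and the sentinel values -1 and -9 are made up for the example.

```r
library(dplyr)

x <- c("agree", "Agree", "neutral", NA)

# "Agree" (capital A) has no replacement listed, so it becomes NA;
# dplyr may warn about unreplaced values here
with_gaps <- recode(x, "agree" = 3, "neutral" = 2)

# .default fills values with no listed replacement,
# .missing fills values that were already NA in the input
explicit <- recode(x, "agree" = 3, "neutral" = 2,
                   .default = -1, .missing = -9)
```

Comparing the two results makes it easier to tell a label that was never mapped apart from an answer that was missing in the first place.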

answered Oct 17 '22 00:10 by griseus


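We can also do this in base R without using any package. Create a named lookup vector ('v1'), loop over the question columns, and index that vector by each column's values to replace them. Columns that are entirely NA are set to NA directly. Note that this applies the same mapping to every column, so Q2 comes out as 2, 3 rather than the reverse-coded 2, 1 shown in the question's goal table.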

v1 <- setNames(c(1:3, 3), c("Disagree", "neutral", "agree", "Agree"))
df1[-1] <- lapply(df1[-1], function(x) if(any(!is.na(x))) v1[x] else NA)
df1 
#  Participant Q1 Q2 Q3 Q4 Q5
#1           1  3  2 NA  1  3
#2           2  2  3 NA NA NA
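The lookup approach extends to the reverse-coding part of the question. The following is a sketch under a few assumptions that are not in the original answer: a 1-3 scale, Q1 and Q2 as the columns to reverse, and a slightly modified lookup that goes through as.character() so that factor columns and all-NA columns are indexed by name without the any(!is.na(x)) guard. The name rev_cols is illustrative.

```r
# Same data as in the answer, rebuilt here so the snippet runs on its own
df1 <- data.frame(
  Participant = 1:2,
  Q1 = c("agree", "neutral"),
  Q2 = c("neutral", "agree"),
  Q3 = c(NA, NA),
  Q4 = c("Disagree", NA),
  Q5 = c("Agree", NA),
  stringsAsFactors = FALSE
)

v1 <- setNames(c(1:3, 3), c("Disagree", "neutral", "agree", "Agree"))

# Look values up by name; a character NA index simply yields NA
df1[-1] <- lapply(df1[-1], function(x) unname(v1[as.character(x)]))

# Reverse code Q1 and Q2 on the 1-3 scale: (max + min) - value
rev_cols <- c("Q1", "Q2")
df1[rev_cols] <- lapply(df1[rev_cols], function(x) 4 - x)
```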

data

df1 <- structure(list(Participant = 1:2, Q1 = c("agree", "neutral"), 
Q2 = c("neutral", "agree"), Q3 = c(NA, NA), Q4 = c("Disagree", 
NA), Q5 = c("Agree", NA)), .Names = c("Participant", "Q1", 
"Q2", "Q3", "Q4", "Q5"), class = "data.frame", row.names = c(NA, -2L))

answered Oct 16 '22 22:10 by akrun