I am picking up R again after last using it in 2013. I am getting used to dplyr, but I am running into a problem with a simple task. I have a table that looks like this:
Participant Q1 Q2 Q3 Q4 Q5
1 agree neutral NA Disagree Agree
2 neutral agree NA NA NA
My goal:
Participant Q1 Q2 Q3 Q4 Q5
1 3 2 NA 1 3
2 2 1 NA NA NA
I want to change the categorical values to numerical values for columns Q1:Q5, but all the examples of recode in dplyr that I have seen work on rows, not columns (I might be missing something in the examples). I then want to pick columns Q1 and Q2 and reverse-code them.
I am trying to learn to do this in dplyr if possible.
Thanks
First, we use recode() from the dplyr package (Wickham et al., 2020). Then, we use the ifelse() function to recode the categorical data to numeric variables. Last, we use the match() function to convert a character variable to a numeric one. The data type of all recoded variables can be converted to each other using as.
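As a rough illustration of the ifelse() and match() approaches mentioned above, here is a minimal base R sketch. The vector `responses` and the ordering in `labels` are made up for the example and are not taken from the question's data.

```r
# Hypothetical responses, invented for this sketch
responses <- c("agree", "neutral", NA, "disagree")

# ifelse(): one nested test per category; an NA input stays NA
by_ifelse <- ifelse(responses == "disagree", 1,
             ifelse(responses == "neutral", 2,
             ifelse(responses == "agree", 3, NA)))

# match(): the position of each value in `labels` becomes its code
labels <- c("disagree", "neutral", "agree")
by_match <- match(responses, labels)
```

match() scales better than nested ifelse() calls once there are more than a few categories, since adding a category only means extending `labels`.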
You can use recode() directly with factors; it will preserve the existing order of levels while changing the values. Alternatively, you can use recode_factor(), which will change the order of levels to match the order of replacements. See the forcats package for more tools for working with factors and their levels.
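To make the difference between the two functions concrete, here is a small sketch. The factor `f` and the replacement labels ("low", "mid", "high") are hypothetical, chosen only to show how the level order behaves.

```r
library(dplyr)

# Hypothetical factor; levels default to alphabetical order
f <- factor(c("agree", "neutral", "disagree"))
levels(f)  # "agree" "disagree" "neutral"

# recode(): relabels the levels but keeps their existing positions
r1 <- recode(f, disagree = "low", agree = "high")
levels(r1)

# recode_factor(): levels follow the order in which replacements are given
r2 <- recode_factor(f, disagree = "low", neutral = "mid", agree = "high")
levels(r2)
```

If the level order matters later (for example, when converting to integers or plotting), recode_factor() is probably the safer choice of the two.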
This is fairly straightforward now thanks to dplyr's recode function. Here's one way to do it:
library(dplyr)

# Generate a dataframe to match yours
df <- data.frame(
  participant = c(1, 2),
  Q1 = c("agree", "neutral"),
  Q2 = c("neutral", "agree"),
  Q3 = c(NA, NA),
  Q4 = c("Disagree", NA),
  Q5 = c("Agree", NA),
  stringsAsFactors = FALSE  # keep strings as character (needed before R 4.0)
)

# Use recode to recode the data, one column at a time
df_recode <- df %>%
  mutate(Q1 = recode(Q1, "agree" = 3, "neutral" = 2),
         Q2 = recode(Q2, "neutral" = 2, "agree" = 1),
         Q4 = recode(Q4, "Disagree" = 1),
         Q5 = recode(Q5, "Agree" = 3)
  )
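Since the question asks about recoding several columns at once and then reverse-coding Q1 and Q2, here is a hedged sketch that uses across() (available in dplyr 1.0.0 and later). It assumes a single 1-3 mapping for every question, that Q3 is stored as character NA rather than logical NA, and that reverse-coding means 4 minus the score; the names `df_num` and `df_rev` are invented for the example.

```r
library(dplyr)

df <- data.frame(
  participant = c(1, 2),
  Q1 = c("agree", "neutral"),
  Q2 = c("neutral", "agree"),
  Q3 = c(NA_character_, NA_character_),  # typed NA so recode() sees character
  Q4 = c("Disagree", NA),
  Q5 = c("Agree", NA),
  stringsAsFactors = FALSE
)

# Step 1: apply one mapping to every question column
df_num <- df %>%
  mutate(across(Q1:Q5, ~ recode(.x,
    "Disagree" = 1, "disagree" = 1,
    "neutral"  = 2,
    "agree"    = 3, "Agree" = 3
  )))

# Step 2: reverse-code Q1 and Q2 on the 1-3 scale (1 <-> 3, 2 stays 2)
df_rev <- df_num %>%
  mutate(across(c(Q1, Q2), ~ 4 - .x))
```

Both capitalizations are listed in the mapping because the sample data mixes "agree" and "Agree"; normalizing the case first would be another option.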
You'll also want to read about the .default and .missing arguments in the help file to be sure you aren't introducing NAs when you don't mean to.
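A short sketch of how those two arguments behave. The vector `x`, the typo "agre", and the sentinel value -9 are all made up for illustration.

```r
library(dplyr)

# Hypothetical responses, including a typo ("agre") and a missing value
x <- c("agree", "agre", NA, "neutral")

# With numeric replacements and no .default, an unmatched value such as
# "agre" becomes NA and recode() is expected to warn about it.
# Setting .default makes that choice explicit; .missing replaces NA inputs.
codes <- recode(x, "agree" = 3, "neutral" = 2,
                .default = NA_real_,  # value for unmatched inputs
                .missing = -9)        # sentinel for NA inputs (an assumption)
```

Using a distinct sentinel for .missing makes it possible to tell "no answer" apart from "answer that did not match any category" afterwards.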
We can do this in base R without using any package. Create a named lookup vector ('v1'), loop over the columns, and use that vector to change the values in the columns:
v1 <- setNames(c(1:3, 3), c("Disagree", "neutral", "agree", "Agree"))
df1[-1] <- lapply(df1[-1], function(x) if(any(!is.na(x))) v1[x] else NA)
df1
#  Participant Q1 Q2 Q3 Q4 Q5
#1           1  3  2 NA  1  3
#2           2  2  3 NA NA NA
# Data used above
df1 <- structure(list(Participant = 1:2, Q1 = c("agree", "neutral"),
    Q2 = c("neutral", "agree"), Q3 = c(NA, NA), Q4 = c("Disagree",
    NA), Q5 = c("Agree", NA)), .Names = c("Participant", "Q1",
    "Q2", "Q3", "Q4", "Q5"), class = "data.frame", row.names = c(NA, -2L))
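The question also asks about reverse-coding Q1 and Q2. A minimal base R sketch of the lookup followed by the reversal is below. It rebuilds the data with data.frame() for readability, and the helper name `rev_cols` and the "4 minus the score" rule for a 1-3 scale are assumptions, not part of the original answer.

```r
df1 <- data.frame(
  Participant = 1:2,
  Q1 = c("agree", "neutral"),
  Q2 = c("neutral", "agree"),
  Q3 = c(NA, NA),
  Q4 = c("Disagree", NA),
  Q5 = c("Agree", NA),
  stringsAsFactors = FALSE
)

v1 <- setNames(c(1:3, 3), c("Disagree", "neutral", "agree", "Agree"))

# as.character() lets the all-NA logical column go through the same lookup;
# unname() drops the labels that v1[...] would otherwise carry along
df1[-1] <- lapply(df1[-1], function(x) unname(v1[as.character(x)]))

# Reverse-code Q1 and Q2 on the 1-3 scale
rev_cols <- c("Q1", "Q2")
df1[rev_cols] <- lapply(df1[rev_cols], function(x) 4 - x)
```

The reversal generalizes to any scale as (min + max) - x, so a 1-5 scale would use 6 - x.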