Let's say I have the following "key" dataset:
key = read.table(text = "question r_answer d_answer
20 A B
21 B A
22 A B
23 B A
24 A B
25 B A", header = T)
> key
question r_answer d_answer
1 20 A B
2 21 B A
3 22 A B
4 23 B A
5 24 A B
6 25 B A
This tells me, for a given question, what answer an "R" would give, and what answer a "D" would give.
Now let's say this is the dataset:
data = read.table(text = "person_id question answer
1 20 A
1 21 B
1 22 A
1 23 B
1 24 A
1 25 B
2 20 A
2 21 A
2 23 A
2 24 B
2 25 B", header = T)
>data
person_id question answer
1 1 20 A
2 1 21 B
3 1 22 A
4 1 23 B
5 1 24 A
6 1 25 B
7 2 20 A
8 2 21 A
9 2 23 A
10 2 24 B
11 2 25 B
This tells me, for a given person, what their actual answers were. I want to create an answer_type
column in data that is equal to either r_answer
or d_answer
depending on the values listed in key. The resulting output would be:
person_id question answer answer_type
1 1 20 A r_answer
2 1 21 B r_answer
3 1 22 A r_answer
4 1 23 B r_answer
5 1 24 A r_answer
6 1 25 B r_answer
7 2 20 A r_answer
8 2 21 A d_answer
9 2 23 A d_answer
10 2 24 B d_answer
11 2 25 B r_answer
I have a feeling that the answer will involve merging from dplyr, but I can't quite figure it out.
One dplyr
and tidyr
option could be:
data %>%
left_join(key %>%
pivot_longer(-question, names_to = "answer_type"), by = c("question" = "question",
"answer" = "value"))
person_id question answer answer_type
1 1 20 A r_answer
2 1 21 B r_answer
3 1 22 A r_answer
4 1 23 B r_answer
5 1 24 A r_answer
6 1 25 B r_answer
7 2 20 A r_answer
8 2 21 A d_answer
9 2 23 A d_answer
10 2 24 B d_answer
11 2 25 B r_answer
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With