I am transforming the results of a survey that includes multiple-selection responses. The original data looks like this:
library(tibble)
# tibble() replaces the deprecated data_frame()
df <- tibble(
  id      = c("a", "b", "c"),
  tired   = c(TRUE, FALSE, TRUE),
  lonely  = c(FALSE, FALSE, TRUE),
  excited = c(FALSE, TRUE, TRUE)
)
df
# A tibble: 3 x 4
  id    tired lonely excited
  <chr> <lgl> <lgl>  <lgl>
1 a     TRUE  FALSE  FALSE
2 b     FALSE FALSE  TRUE
3 c     TRUE  TRUE   TRUE
I would like to create a new column "feelings" that contains comma separated values of the feelings expressed by a respondent:
  id    feelings
  <chr> <chr>
1 a     tired
2 b     excited
3 c     tired, lonely, excited
An intermediate step would be to replace TRUE values with the respective name of the column in order to yield:
  id    tired lonely excited
  <chr> <chr> <chr>  <chr>
1 a     tired
2 b                  excited
3 c     tired lonely excited
For an individual column this is straightforward. However, unlike the example, there are a lot of columns in my data frame (10+, with usually no more than one or two TRUE values), and therefore I would like to automate this for a number of columns. One solution would probably be to loop over the columns and use base subsetting and replacement, but is there also an elegant dplyr/tidy way to do this?
Thanks for your help!
One option is to reshape the data to long format with tidyr::gather and then summarise with dplyr:
library(dplyr)
library(tidyr)
df %>%
  gather(feelings, value, -id) %>%   # reshape to long format: one row per id/feeling pair
  filter(value) %>%                  # keep only the feelings that are TRUE
  group_by(id) %>%
  summarise(feelings = paste(feelings, collapse = ", "))  # join the names per respondent
#   id    feelings
#   <chr> <chr>
# 1 a     tired
# 2 b     excited
# 3 c     tired, lonely, excited
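In tidyr 1.0.0 and later, gather is superseded by pivot_longer. Below is a minimal sketch of the same approach using pivot_longer, assuming tidyr 1.0.0 or newer is installed. The column names feeling and selected are illustrative choices and do not come from the original post. The sketch subsets inside summarise instead of filtering first, so a respondent with no TRUE values would be kept with an empty string rather than dropped.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Example data from the question
df <- tibble(
  id      = c("a", "b", "c"),
  tired   = c(TRUE, FALSE, TRUE),
  lonely  = c(FALSE, FALSE, TRUE),
  excited = c(FALSE, TRUE, TRUE)
)

result <- df %>%
  # Long format: one row per id/feeling pair ("feeling" and "selected" are illustrative names)
  pivot_longer(-id, names_to = "feeling", values_to = "selected") %>%
  group_by(id) %>%
  # Keep only the selected feelings and join them; an id with none gets ""
  summarise(feelings = paste(feeling[selected], collapse = ", "))

result
```

With many logical columns, the -id selection picks up all of them, so nothing needs to be listed by hand.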