I would like to add a column that combines the character values of the other columns, skipping NA.
I have tried paste, str_c, and unite, but could not get the expected result. Maybe I used them incorrectly.
In the real case, I do not know the number of columns in advance, since the number of years varies between datasets: some datasets contain 10 years, others 20.
Here is the input data:
input <- tibble(
id = c('aa', 'ss', 'dd', 'qq'),
'2017' = c('tv', NA, NA, 'web'),
'2018' = c(NA, 'web', NA, NA),
'2019' = c(NA, NA, 'book', 'tv')
)
# A tibble: 4 x 4
id `2017` `2018` `2019`
<chr> <chr> <chr> <chr>
1 aa tv NA NA
2 ss NA web NA
3 dd NA NA book
4 qq web NA tv
The desired output with the ALL column is:
> output
# A tibble: 4 x 5
id `2017` `2018` `2019` ALL
<chr> <chr> <chr> <chr> <chr>
1 aa tv NA NA tv
2 ss NA web NA web
3 dd NA NA book book
4 qq web NA tv web tv
Thanks for the help!
Here is a base R method:
input$ALL <- apply(input[-1], 1, function(x) paste(na.omit(x), collapse=" "))
input$ALL
#[1] "tv" "web" "book" "web tv"
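If the data may later gain columns other than id and the years, selecting by position with input[-1] could pick up too much. A minimal sketch of the same base R idea that excludes id by name instead; the variable name value_cols is only illustrative, and the data frame is rebuilt here so the example runs on its own:

```r
# Rebuild the example data in base R; check.names = FALSE keeps "2017" etc. as-is
input <- data.frame(
  id = c("aa", "ss", "dd", "qq"),
  "2017" = c("tv", NA, NA, "web"),
  "2018" = c(NA, "web", NA, NA),
  "2019" = c(NA, NA, "book", "tv"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Every column except id, however many years there are (illustrative name)
value_cols <- setdiff(names(input), "id")

# For each row, drop the NAs and paste what is left with a space
input$ALL <- apply(input[value_cols], 1, function(x) paste(na.omit(x), collapse = " "))
input$ALL
```

A row that is entirely NA ends up as an empty string here, because pasting a zero-length vector with collapse returns "".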
This is a duplicate of (or very close to) this question, but things have changed since then: unite now has an na.rm parameter, which drops NAs.
As for column selection, we select every column except the first without naming any of them, so this should work in your case regardless of how many year columns there are.
library(tidyverse)
input %>%
unite("ALL", names(input)[-1], remove = FALSE, sep = " ", na.rm = TRUE)
# A tibble: 4 x 5
# id ALL `2017` `2018` `2019`
# <chr> <chr> <chr> <chr> <chr>
#1 aa tv tv NA NA
#2 ss web NA web NA
#3 dd book NA NA book
#4 qq web tv web NA tv
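As a variation, the year columns can also be picked by name pattern rather than by position. This is only a sketch, assuming the year columns are always named with exactly four digits; the regular expression is my assumption, not something stated in the question:

```r
library(tibble)
library(tidyr)

input <- tibble(
  id = c("aa", "ss", "dd", "qq"),
  `2017` = c("tv", NA, NA, "web"),
  `2018` = c(NA, "web", NA, NA),
  `2019` = c(NA, NA, "book", "tv")
)

# matches() selects columns whose names fit the regex (assumed: four-digit years)
result <- unite(
  input, "ALL", matches("^[0-9]{4}$"),
  remove = FALSE, sep = " ", na.rm = TRUE
)
result$ALL
```

This keeps working if other non-year columns are added next to id, as long as their names do not look like a four-digit year.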
It worked for me after installing the development version of tidyr:
devtools::install_github("tidyverse/tidyr")
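Before installing a development build, it may be worth checking whether the tidyr already installed supports the argument. A small sketch that inspects the function's formal arguments instead of assuming a particular version number; has_na_rm is just an illustrative name:

```r
# TRUE if the installed tidyr::unite() accepts an na.rm argument
has_na_rm <- "na.rm" %in% names(formals(tidyr::unite))
has_na_rm
```

If this returns TRUE, updating should not be necessary for the unite call above.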