How to combine multiple character columns into one column and remove NAs without knowing the number of columns

Question

I would like to add a column that combines the character values of the other columns, leaving out the NAs. I have tried paste, str_c and unite, but could not get the expected result. Maybe I used them incorrectly.

In the real case, I cannot know the number of columns in advance, since each dataset covers a different number of years: some datasets contain 10 years, others contain 20.

Here is the input data:

library(tibble)

input <- tibble(
  id = c('aa', 'ss', 'dd', 'qq'),
  '2017' = c('tv', NA, NA, 'web'),
  '2018' = c(NA, 'web', NA, NA),
  '2019' = c(NA, NA, 'book', 'tv')
)

# A tibble: 4 x 4
  id    `2017` `2018` `2019`
  <chr> <chr>  <chr>  <chr> 
1 aa    tv     NA     NA    
2 ss    NA     web    NA    
3 dd    NA     NA     book  
4 qq    web    NA     tv

The desired output with the ALL column is:

> output
# A tibble: 4 x 5
  id    `2017` `2018` `2019` ALL   
  <chr> <chr>  <chr>  <chr>  <chr> 
1 aa    tv     NA     NA     tv    
2 ss    NA     web    NA     web   
3 dd    NA     NA     book   book  
4 qq    web    NA     tv     web tv

Thanks for the help!

akrun · Accepted Answer

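Here is a base R method. For each row, na.omit drops the NA values and paste collapses what is left into one space-separated string: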

input$ALL <- apply(input[-1], 1, function(x) paste(na.omit(x), collapse=" "))
input$ALL
#[1] "tv"     "web"    "book"   "web tv"
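
As a minimal sketch of the same idea, the year columns can also be picked by name rather than by position, which helps if the id column is not always first. The extra all-NA row ("zz") and the helper name year_cols are assumptions added for illustration; they are not part of the original data.

```r
# Hypothetical data: same shape as the question, plus an all-NA row ("zz").
df <- data.frame(
  id = c("aa", "ss", "dd", "qq", "zz"),
  `2017` = c("tv", NA, NA, "web", NA),
  `2018` = c(NA, "web", NA, NA, NA),
  `2019` = c(NA, NA, "book", "tv", NA),
  check.names = FALSE,       # keep "2017" etc. as column names
  stringsAsFactors = FALSE   # keep strings as character on older R versions
)

# Select every column whose name looks like a four-digit year,
# however many of them there are.
year_cols <- grep("^[0-9]{4}$", names(df), value = TRUE)

# For each row, drop the NAs and paste the remaining values with a space.
df$ALL <- apply(df[year_cols], 1, function(x) paste(na.omit(x), collapse = " "))

# A row with no values at all collapses to "", so turn that back into NA.
df$ALL[df$ALL == ""] <- NA

# Values: "tv" "web" "book" "web tv" NA
unname(df$ALL)
```

Whether an empty row should end up as "" or NA depends on what the ALL column is used for; the last assignment can simply be dropped to keep the empty string.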

Ronak Shah · Answer

This is a duplicate of (or very close to) this question, but things have changed since then: unite now has an na.rm parameter that drops NAs.

As for column selection, the code below selects every column except the first without naming any of them, so it should work for your case regardless of how many years a dataset has.

library(tidyverse)

input %>%
    unite("ALL", names(input)[-1], remove = FALSE, sep = " ", na.rm = TRUE)

# A tibble: 4 x 5
#  id    ALL    `2017` `2018` `2019`
#  <chr> <chr>  <chr>  <chr>  <chr> 
#1 aa    tv     tv     NA     NA    
#2 ss    web    NA     web    NA    
#3 dd    book   NA     NA     book  
#4 qq    web tv web    NA     tv

This worked for me after installing the development version of tidyr:

devtools::install_github("tidyverse/tidyr")
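
The column selection in the unite call can also be written as a negative selection, which may read more clearly than indexing names(input). This is only a sketch: it assumes the identifier column is always called id, as in the question, and that the installed tidyr has the na.rm argument of unite.

```r
library(tibble)
library(tidyr)

input <- tibble(
  id = c("aa", "ss", "dd", "qq"),
  `2017` = c("tv", NA, NA, "web"),
  `2018` = c(NA, "web", NA, NA),
  `2019` = c(NA, NA, "book", "tv")
)

# -id unites every column except id, however many year columns there are.
# remove = FALSE keeps the original year columns next to the new ALL column.
result <- unite(input, "ALL", -id, remove = FALSE, sep = " ", na.rm = TRUE)

# Values: "tv" "web" "book" "web tv"
result$ALL
```

As in the output above, ALL is placed where the first united column was, so it appears right after id rather than at the end.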

Tags: r, dplyr

J.D
