
How to combine multiple character columns into one column and remove NA without knowing the number of columns

Tags:

r

dplyr

I would like to create a column that combines the character values of the other columns, skipping NA. I have tried paste, str_c and unite, but could not get the expected result; maybe I used them incorrectly.

In practice, I do not know the number of columns in advance, because the number of years varies from dataset to dataset.

For example, some datasets contain 10 years and others contain 20.

Here is the input data:

library(tibble)

input <- tibble(
  id = c('aa', 'ss', 'dd', 'qq'),
  '2017' = c('tv', NA, NA, 'web'),
  '2018' = c(NA, 'web', NA, NA),
  '2019' = c(NA, NA, 'book', 'tv')
)

# A tibble: 4 x 4
  id    `2017` `2018` `2019`
  <chr> <chr>  <chr>  <chr> 
1 aa    tv     NA     NA    
2 ss    NA     web    NA    
3 dd    NA     NA     book  
4 qq    web    NA     tv    

The desired output with the ALL column is:

> output
# A tibble: 4 x 5
  id    `2017` `2018` `2019` ALL   
  <chr> <chr>  <chr>  <chr>  <chr> 
1 aa    tv     NA     NA     tv    
2 ss    NA     web    NA     web   
3 dd    NA     NA     book   book  
4 qq    web    NA     tv     web tv

Thanks for the help!

J.D asked Mar 18 '19 04:03



2 Answers

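Here is a base R method. It applies a function to each row of every column except the first, dropping NAs before pasting the remaining values together: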

input$ALL <- apply(input[-1], 1, function(x) paste(na.omit(x), collapse=" "))
input$ALL
#[1] "tv"     "web"    "book"   "web tv"
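
Since the question is about datasets with a varying number of year columns, here is a minimal sketch that wraps the same apply/paste idea in a function and runs it on two inputs of different widths. The helper name combine_non_na, its id_col and sep arguments, and the two example data frames are hypothetical, introduced only for illustration; they are not part of the original answer.

```r
# Hypothetical helper: collapse the non-NA values of every column except
# `id_col` into one string per row, whatever the number of such columns.
combine_non_na <- function(df, id_col = "id", sep = " ") {
  value_cols <- setdiff(names(df), id_col)
  # as.matrix() gives apply() a character matrix to iterate over row by row
  df$ALL <- apply(as.matrix(df[value_cols]), 1,
                  function(x) paste(na.omit(x), collapse = sep))
  df
}

# Example data (assumed): same shape as the question's input
three_years <- data.frame(
  id = c("aa", "ss", "dd", "qq"),
  `2017` = c("tv", NA, NA, "web"),
  `2018` = c(NA, "web", NA, NA),
  `2019` = c(NA, NA, "book", "tv"),
  check.names = FALSE, stringsAsFactors = FALSE
)
# A narrower dataset with only two year columns
two_years <- three_years[c("id", "2018", "2019")]

combine_non_na(three_years)$ALL
combine_non_na(two_years)$ALL
```

Note that a row whose year columns are all NA (row "aa" in two_years) ends up with an empty string rather than NA, because paste() on an empty vector with collapse returns "".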
akrun answered Oct 21 '22 18:10



This is actually a duplicate of (or very close to) this question, but things have changed since then: unite now has an na.rm parameter that drops NAs before combining.

As for column selection, the code below selects every column except the first without naming any of them, so it should work for your case regardless of how many year columns there are.

library(tidyverse)

input %>%
    unite("ALL", names(input)[-1], remove = FALSE, sep = " ", na.rm = TRUE)

# A tibble: 4 x 5
#  id    ALL    `2017` `2018` `2019`
#  <chr> <chr>  <chr>  <chr>  <chr> 
#1 aa    tv     tv     NA     NA    
#2 ss    web    NA     web    NA    
#3 dd    book   NA     NA     book  
#4 qq    web tv web    NA     tv    

It worked for me after installing the development version of tidyr:

devtools::install_github("tidyverse/tidyr")
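
As a variation on the column selection, here is a sketch that picks the year columns by name pattern rather than by position, which may be more robust if the id column is not always first. It assumes a tidyr version whose unite() supports na.rm, and it assumes the year columns are named with exactly four digits; that naming rule is an assumption for illustration, not something stated in the question.

```r
library(tidyr)

input <- tibble::tibble(
  id = c("aa", "ss", "dd", "qq"),
  `2017` = c("tv", NA, NA, "web"),
  `2018` = c(NA, "web", NA, NA),
  `2019` = c(NA, NA, "book", "tv")
)

# Select the year columns by a regular expression instead of by position.
# Assumption: every year column name consists of exactly four digits.
result <- unite(input, "ALL", matches("^[0-9]{4}$"),
                sep = " ", remove = FALSE, na.rm = TRUE)

result$ALL
```

If the only non-year column is id, selecting with -id in place of the matches() call is another option.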
Ronak Shah answered Oct 21 '22 17:10
