
Opposite of tidyr::separate, concatenating multiple columns into one

Tags: r, dplyr, tidyr

I have a data frame:

df <- data.frame(
    id = c(1, 2, 3),
    `1` = c("W4", "W5", 49),
    `2` = c("L", "O1", "P6"),
    `3` = c(1, 2, 10),
    `4` = c("H7", NA, "K"),
    `5` = c("J8", NA, NA)
)

How can I concatenate/paste the columns together with sep = ","?

(The opposite of tidyr::separate(), I guess?)

Desired output:

id  string
1   W4, L, 1, H7, J8
2   W5, O1, 2
3   49, P6, 10, K

Thanks in advance!

EDIT

I'm wary of using paste because in my real dataset I have 1000 columns.

Asked by emehex, Jul 15 '16 at 13:07



2 Answers

You can use the unite function from tidyr:

library(tidyr)
unite(df, string, X1:X5, sep = ", ")
#  id            string
#1  1  W4, L, 1, H7, J8
#2  2 W5, O1, 2, NA, NA
#3  3 49, P6, 10, K, NA

Note that it also has a remove argument that is TRUE by default. If you set it to FALSE, the original columns are kept in the data.

For the column specification (which columns to unite) you can use the colon operator (:) as I did above or use the special functions described in ?dplyr::select.
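
The output above keeps the NA values, whereas the desired output in the question drops them. Here is a minimal sketch of one way to drop them, assuming tidyr 1.0.0 or later, where unite() has an na.rm argument. The columns are converted to character first as a precaution, since I believe early versions only removed NAs from character columns. Selecting with -id avoids naming every column, which should help with the 1000-column case:

```r
library(tidyr)

# Same data as in the question; data.frame() renames `1`..`5` to X1..X5
df <- data.frame(
    id = c(1, 2, 3),
    `1` = c("W4", "W5", 49),
    `2` = c("L", "O1", "P6"),
    `3` = c(1, 2, 10),
    `4` = c("H7", NA, "K"),
    `5` = c("J8", NA, NA)
)

# Precaution (assumption): make every non-id column character
# so that na.rm = TRUE treats all of them the same way
df[-1] <- lapply(df[-1], as.character)

# Unite everything except id, dropping missing values per row
result <- unite(df, "string", -id, sep = ", ", na.rm = TRUE)
print(result)
```

If your tidyr version does not have na.rm, a base R approach that filters NAs row by row is an alternative.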

Answered by talat, Sep 29 '22 at 16:09


We can do this in base R without any packages:

data.frame(id = df[1], string= do.call(paste, c(df[-1], sep=",")))
#  id        string
#1  1  W4,L,1,H7,J8
#2  2 W5,O1,2,NA,NA
#3  3 49,P6,10,K,NA
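
As with unite, the NAs end up in the pasted string. A hedged base R sketch that skips them is below; paste_non_missing is a hypothetical helper name introduced here, not part of base R. The columns are converted with as.character() before building the matrix because calling apply() directly on a mixed-type data frame goes through as.matrix(), which formats numeric columns with padding (for example " 1" instead of "1"):

```r
# Same data as in the question; data.frame() renames `1`..`5` to X1..X5
df <- data.frame(
    id = c(1, 2, 3),
    `1` = c("W4", "W5", 49),
    `2` = c("L", "O1", "P6"),
    `3` = c(1, 2, 10),
    `4` = c("H7", NA, "K"),
    `5` = c("J8", NA, NA)
)

# Hypothetical helper: paste the non-missing values of one row
paste_non_missing <- function(row, sep = ", ") {
    paste(row[!is.na(row)], collapse = sep)
}

# Convert each non-id column to character, then bind into a character matrix
cols <- lapply(df[-1], as.character)
m <- do.call(cbind, cols)

# Apply the helper over rows (MARGIN = 1)
result <- data.frame(
    id = df$id,
    string = apply(m, 1, paste_non_missing)
)
print(result)
```

This works on all columns except the first without naming any of them, so it does not depend on how many columns there are.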

Answered by akrun, Sep 29 '22 at 17:09