
Concatenating two text columns in dplyr


My data looks like this:

    round <- c(rep("A", 3), rep("B", 3))
    experiment <- rep(c("V1", "V2", "V3"), 2)
    results <- rnorm(mean = 10, n = 6)

    df <- data.frame(round, experiment, results)

    > df
      round experiment   results
    1     A         V1  9.782025
    2     A         V2  8.973996
    3     A         V3  9.271109
    4     B         V1  9.374961
    5     B         V2  8.313307
    6     B         V3 10.837787

I have a different dataset that will be merged with this one, where each combination of round and experiment is a unique row value, e.g. "A_V1". So what I really want is a variable, name, that concatenates the two columns together. However, this is tougher to do in dplyr than I expected. I tried:

    name_mix <- paste0(df$round, "_", df$experiment)
    new_df <- df %>%
      mutate(name = name_mix) %>%
      select(name, results)

But I got the error "Column name must be length 1 (the group size), not 6". I also tried the simple base-R approach of cbind(df, name_mix), but received a similar error telling me that df and name_mix were of different sizes. What am I doing wrong?

mmyoung77 asked Jun 13 '18 20:06




2 Answers

You can use the unite() function from tidyr. It pastes the listed columns into one new column, separated by "_" by default, and drops the originals:

    library(tidyverse)

    df %>%
      unite(round_experiment, c("round", "experiment"))

      round_experiment   results
    1             A_V1  8.797624
    2             A_V2  9.721078
    3             A_V3 10.519000
    4             B_V1  9.714066
    5             B_V2  9.952211
    6             B_V3  9.642900
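If the original round and experiment columns are still needed after combining, a variation along these lines might help. It is only a sketch: the column name `name`, the object `df_key`, and the fixed seed are illustrative choices, not part of the answer above. `sep` and `remove` are existing arguments of `unite()`.

```r
library(tidyr)

# Same shape as the question's data; the seed is only there for reproducibility.
set.seed(1)
df <- data.frame(
  round      = rep(c("A", "B"), each = 3),
  experiment = rep(c("V1", "V2", "V3"), times = 2),
  results    = rnorm(n = 6, mean = 10)
)

# sep = "_" is already the default and is spelled out for clarity.
# remove = FALSE keeps the input columns next to the new one.
df_key <- unite(df, "name", round, experiment, sep = "_", remove = FALSE)
```

With `remove = FALSE`, `df_key` should contain `name` alongside `round`, `experiment`, and `results`.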
DJV answered Oct 25 '22 05:10



This should do the trick if you are looking for a new variable

    library(tidyverse)

    round <- c(rep("A", 3), rep("B", 3))
    experiment <- rep(c("V1", "V2", "V3"), 2)
    results <- rnorm(mean = 10, n = 6)

    df <- data.frame(round, experiment, results)
    df

    df <- df %>% mutate(
      name = paste(round, experiment, sep = "_")
    )
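Since the question mentions merging with a second dataset keyed by the combined name, the new column could then serve as the join key. The sketch below assumes a made-up data frame `other` with a `name` column and a `note` column; neither comes from the question, and the real dataset may look quite different.

```r
library(dplyr)

# Same shape as the question's data; the seed is only there for reproducibility.
set.seed(1)
df <- data.frame(
  round      = rep(c("A", "B"), each = 3),
  experiment = rep(c("V1", "V2", "V3"), times = 2),
  results    = rnorm(n = 6, mean = 10)
)

# Hypothetical second dataset: one row per round/experiment combination.
other <- data.frame(
  name = c("A_V1", "A_V2", "A_V3", "B_V1", "B_V2", "B_V3"),
  note = paste("run", 1:6),
  stringsAsFactors = FALSE
)

# Build the key inside mutate(), then join on it.
merged <- df %>%
  mutate(name = paste(round, experiment, sep = "_")) %>%
  left_join(other, by = "name")
```

Building the key inside `mutate()` from the bare column names, rather than from a vector created outside the pipeline, keeps the computation tied to whatever rows and groups the data frame has at that point.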
Claus Portner answered Oct 25 '22 04:10
