Collapse text by group in data frame [duplicate]

How do I aggregate a data frame by the column group and collapse the text in the column text?

Sample data:

    df <- read.table(header=TRUE, text="
    group text
    a a1
    a a2
    a a3
    b b1
    b b2
    c c1
    c c2
    c c3
    ")

Required output (data frame):

    group text
    a     a1a2a3
    b     b1b2
    c     c1c2c3

Now I have:

    sapply(unique(df$group), function(x) {
      paste0(df[df$group==x,"text"], collapse='')
    })

This works to some extent as it returns text properly collapsed by group, but as a vector:

[1] "a1a2a3" "b1b2"   "c1c2c3" 

I need a data frame with a group column as the result.

Tomas Greif asked Mar 31 '14 07:03

1 Answer

Simply use aggregate:

    aggregate(df$text, list(df$group), paste, collapse="")
    ##   Group.1      x
    ## 1       a a1a2a3
    ## 2       b   b1b2
    ## 3       c c1c2c3

Or with plyr:

    library(plyr)
    ddply(df, .(group), summarize, text=paste(text, collapse=""))
    ##   group   text
    ## 1     a a1a2a3
    ## 2     b   b1b2
    ## 3     c c1c2c3

ddply can be faster than aggregate if you have a large dataset.
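Whether that holds for your data is easy to measure. The following is only a rough timing sketch on synthetic data: the data frame `big`, its size `n`, and the number of groups are assumptions made for illustration, and plyr is timed only if it happens to be installed.

```r
# Synthetic data: n rows spread over 1000 groups (sizes are arbitrary assumptions)
set.seed(1)
n <- 1e5
big <- data.frame(
  group = sample(sprintf("g%04d", 1:1000), n, replace = TRUE),
  text  = sample(letters, n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Time the base R approach
t_aggregate <- system.time(
  res_aggregate <- aggregate(text ~ group, data = big, FUN = paste, collapse = "")
)

# Time plyr only if it is installed
if (requireNamespace("plyr", quietly = TRUE)) {
  t_ddply <- system.time(
    res_ddply <- plyr::ddply(big, "group", plyr::summarize,
                             text = paste(text, collapse = ""))
  )
  print(rbind(aggregate = t_aggregate[1:3], ddply = t_ddply[1:3]))
} else {
  print(t_aggregate[1:3])
}
```

A single system.time call is noisy, so repeat the measurement a few times before drawing conclusions.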

EDIT: Following the suggestion from @SeDur, using the formula interface of aggregate:

    aggregate(text ~ group, data = df, FUN = paste, collapse = "")
    ##   group   text
    ## 1     a a1a2a3
    ## 2     b   b1b2
    ## 3     c c1c2c3

To get the same column names with the earlier (non-formula) method, name the list elements:

    aggregate(x=list(text=df$text), by=list(group=df$group), paste, collapse="")
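As a quick sanity check, the two aggregate calls can be compared directly. This is a minimal sketch using the sample data from the question; the variable names `by_formula` and `by_lists` are made up for illustration, and `as.character()` is applied because, depending on the R version, columns may come back as factors.

```r
df <- read.table(header = TRUE, text = "
group text
a a1
a a2
a a3
b b1
b b2
c c1
c c2
c c3
")

# Formula interface: column names are taken from the formula
by_formula <- aggregate(text ~ group, data = df, FUN = paste, collapse = "")

# Named lists: column names are taken from the list names
by_lists <- aggregate(x = list(text = df$text), by = list(group = df$group),
                      FUN = paste, collapse = "")

# Compare names and values; as.character() guards against factor columns
same_names  <- identical(names(by_formula), names(by_lists))
same_groups <- identical(as.character(by_formula$group), as.character(by_lists$group))
same_text   <- identical(as.character(by_formula$text), as.character(by_lists$text))
print(c(same_names, same_groups, same_text))
```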

EDIT 2: With data.table:

    library("data.table")
    dt <- as.data.table(df)
    dt[, list(text = paste(text, collapse="")), by = group]
    ##    group   text
    ## 1:     a a1a2a3
    ## 2:     b   b1b2
    ## 3:     c c1c2c3
Victorp answered Oct 31 '22 06:10
