Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

GROUP_CONCAT with dplyr or R [duplicate]

I am having difficulty replicating the functionality of a typical SQL GROUP_CONCAT function in dplyr. I would also like to make sure the ordering inside the groups can be controlled. Ideally I want to use the hadleyverse/tidyverse but base R or other packages will work too.

Example data:

ID    name
1     apple
1     orange
2     orange
3     orange
3     apple

Desired output:

ID    name
1     apple,orange
2     orange
3     apple,orange

Note that for ID=3, the ordering is in alpha order, not how the rows are ordered. I think this can probably be handled by doing an arrange first, but it would be nice to control inside the summarise statement or the like.

like image 292
rikturr Avatar asked Dec 24 '22 04:12

rikturr


2 Answers

In R, we can use one of the group by operations.

library(dplyr)
df1 %>%
    group_by(ID) %>%
    summarise(name = toString(sort(unique(name))))
#     ID          name 
#   <int>         <chr>
#1     1 apple, orange
#2     2        orange
#3     3 apple, orange

Or using data.table

library(data.table)
setDT(df1)[, .(name = toString(sort(unique(name)))), by = ID]
#   ID          name
#1:  1 apple, orange
#2:  2        orange
#3:  3 apple, orange
like image 166
akrun Avatar answered Jan 02 '23 23:01

akrun


For base R use this

aggregate(data=df,name~ID,FUN = function(t) sort(paste(t)))

Data

df<-read.table(header = T,text = "ID    name
1     apple
1     orange
2     orange
3     orange
3     apple")
like image 37
user2100721 Avatar answered Jan 02 '23 23:01

user2100721