
Concatenate unique strings after groupby in R

Tags: r, dplyr

I'm grouping a dataframe and want to concatenate unique strings.

data <- data.frame(
  aa = c(1,2,3,4,5,6,7,8,9,10),
  bb = c('a','a','a','a','a','b','b','b','b','b'),
  cc = c('hello','hello','hi','message','bye','q','w','r','r','t'))

Desired output:

bb    cc
a     'hello hi message bye'
b     'q w r t'

Currently I'm doing this (suggested here):

result <- data %>%
  group_by(bb) %>%
  mutate(body = paste0(cc, collapse = "")) %>%
  summarise(t_body = first(body))

But I get all the strings, not just the unique ones.

asked Jun 28 '18 19:06 by italo


1 Answer

Call unique() on cc before pasting it, so each group's duplicates are dropped before the strings are joined. The mutate step is also unnecessary; you can use summarise directly:

data %>% 
    group_by(bb) %>% 
    summarise(cc = paste(unique(cc), collapse = ' '))

# A tibble: 2 x 2
#  bb    cc                  
#  <fct> <chr>               
#1 a     hello hi message bye
#2 b     q w r t  
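
For comparison, here is a minimal base R sketch of the same idea (deduplicate within each group, then paste), for cases where dplyr is not loaded. It is not part of the original answer: paste_unique is a hypothetical helper name introduced only for illustration, and the sample data is copied from the question.

```r
# Same sample data as in the question
data <- data.frame(
  aa = 1:10,
  bb = c('a','a','a','a','a','b','b','b','b','b'),
  cc = c('hello','hello','hi','message','bye','q','w','r','r','t')
)

# Hypothetical helper: join the distinct values of x with single spaces,
# keeping them in order of first appearance
paste_unique <- function(x) paste(unique(x), collapse = " ")

# Apply the helper to cc within each level of bb
result <- aggregate(cc ~ bb, data = data, FUN = paste_unique)
print(result)
```

Here group a collapses to "hello hi message bye" and group b to "q w r t", matching the desired output in the question. Using collapse = " " rather than "" is what puts the spaces between the words.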

answered Nov 01 '22 09:11 by Psidom