Summarise dataframe to include all unique values in a grouping

Tags: dataframe, r, dplyr

I'd like to summarise a dataframe such that a column contains a string of the unique values within a particular group. So using the iris dataset:

iris %>%
  group_by(Species) %>%
  summarise(mPW=mean(Petal.Width))

This gives the mean of Petal.Width grouped by Species. But what if I want the output to contain all the values that were used to calculate that mean? I want those unique values in a list, though not in the R meaning of a list. I tried this, but obviously that was wrong:

 iris %>%
   group_by(Species) %>%
   summarise(lPW=paste(Petal.Width, sep=","))

Here is a truncated example of the desired dataframe output. Note that the desired output for lPW is a character object:

 Species lPW
 setosa  0.1,0.2,0.3,0.4,0.5,0.6
 ....

I'm not set on a dplyr solution. This is just the way I normally work.

Thanks in advance.

boshek asked Jan 06 '23 21:01


2 Answers

Promoting my comment to an answer: use collapse instead of sep:

iris %>%
  group_by(Species) %>%
  summarise(lPW = paste(Petal.Width, collapse=","))
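
The difference between the two arguments can be checked on a small vector. A minimal sketch in base R (the vector x is made up for illustration and is not taken from iris): sep is only placed between the separate arguments passed to paste(), whereas collapse joins the elements of the result into a single string.

```r
x <- c(0.1, 0.2, 0.3)  # illustrative values, not taken from iris

# sep separates the *arguments* of paste(); with a single argument it has
# no effect, so the result still has one element per input value
with_sep <- paste(x, sep = ",")

# collapse joins the elements of the result into one string
with_collapse <- paste(x, collapse = ",")

print(length(with_sep))  # 3
print(with_collapse)     # "0.1,0.2,0.3"
```

This is why the attempt with sep does not produce one string per group.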

If you want to limit this to only the unique values, you can use:

iris %>%
  group_by(Species) %>%
  summarize(lPW = paste(unique(Petal.Width), collapse = ","))
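
Note that unique() keeps values in order of first appearance, so the resulting string is not necessarily sorted the way the example in the question is. If sorted output is wanted (an assumption based on that example), the values can be wrapped in sort(). Since the question is not tied to dplyr, here is a rough base R sketch of the same idea using aggregate(); the name res is only illustrative.

```r
# Base R sketch; iris ships with R's datasets package
res <- aggregate(
  Petal.Width ~ Species,
  data = iris,
  # one sorted, comma-separated string of unique values per group
  FUN = function(x) paste(sort(unique(x)), collapse = ",")
)
names(res)[2] <- "lPW"  # rename to match the column name used in the question
print(res)
```

The same sort(unique(...)) wrapper can be dropped into the summarize() call above if you prefer to stay in dplyr.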
Jaap answered Jan 10 '23 05:01


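# dplyr_0.4.3
iris %>%
  select(Species, Petal.Width) %>%
  # convert to character, then keep only distinct Species/Petal.Width rows
  mutate(Petal.Width = as.character(Petal.Width)) %>%
  unique() %>%
  group_by(Species) %>%
  # Petal.Width is already character here, so it can be pasted directly
  summarize(lPW = paste(Petal.Width, collapse = ","))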
Jubbles answered Jan 10 '23 06:01