Explain ungroup() in dplyr

If I'm working with a dataset and want to group the data (e.g. by country), compute a summary statistic (mean()), and then ungroup() the data.frame so that it has the original dimensions (country-year) plus a new column holding the mean for each country (repeated over n years), how would I do that with dplyr? The ungroup() function doesn't return a data.frame with the original dimensions:

gapminder %>%
    group_by(country) %>%
    summarize(mn = mean(pop)) %>%
    ungroup() # returns data.frame with nrows == length(unique(gapminder$country))

asked Jan 25 '18 15:01

Emily

2 Answers

ungroup() is useful if you want to do something like

gapminder %>%
    group_by(country) %>%
    mutate(mn = pop/mean(pop)) %>%
    ungroup()

where you want a transformation that uses a whole group's statistics. In the example above, mn is the ratio of each row's population to its country's average population. Once the data is ungrouped, any further mutate() calls compute aggregates such as mean() over the whole data frame rather than per group.
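To make the effect of ungroup() on later steps concrete, here is a minimal sketch. It uses a small made-up tibble (df, with invented population values) in place of gapminder so it runs without that package; the names df, grouped, still_grouped, and ungrouped are only for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for gapminder (values are made up)
df <- tibble(
  country = c("A", "A", "B", "B"),
  year    = c(2000, 2001, 2000, 2001),
  pop     = c(10, 20, 100, 300)
)

# Grouped mutate: each row's pop relative to its own country's mean
grouped <- df %>%
  group_by(country) %>%
  mutate(ratio = pop / mean(pop))

# Still grouped: mean(pop) is computed separately for each country
still_grouped <- grouped %>%
  mutate(m = mean(pop))

# After ungroup(): mean(pop) is computed over all rows at once
ungrouped <- grouped %>%
  ungroup() %>%
  mutate(m = mean(pop))

still_grouped$m  # 15 15 200 200
ungrouped$m      # 107.5 107.5 107.5 107.5
```

The only difference between the last two pipelines is the ungroup() call, which is why it is usually worth adding once the per-group work is done.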

summarize() collapses each group to a single row, and ungroup() cannot restore the original rows afterwards. Perhaps you wanted to do

gapminder %>%
    group_by(country) %>%
    mutate(mn = mean(pop)) %>%
    ungroup()

This creates mn as the mean for each group, repeated on every row within that group.
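A quick way to see the difference in dimensions is to run both versions side by side. This is a sketch on a small made-up tibble (df, with invented values) rather than gapminder, and the names summarised and mutated are only for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for gapminder (values are made up)
df <- tibble(
  country = c("A", "A", "B", "B"),
  year    = c(2000, 2001, 2000, 2001),
  pop     = c(10, 20, 100, 300)
)

# summarise(): one row per country, the year column is dropped
summarised <- df %>%
  group_by(country) %>%
  summarise(mn = mean(pop))

# mutate(): every country-year row is kept, with mn repeated within country
mutated <- df %>%
  group_by(country) %>%
  mutate(mn = mean(pop)) %>%
  ungroup()

nrow(summarised)  # 2
nrow(mutated)     # 4, same as nrow(df)
mutated$mn        # 15 15 200 200
```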

answered Oct 09 '22 11:10

Max Candocia

summarize() reduces the number of rows to one per group. If you don't want to change the number of rows, use mutate() rather than summarize().

answered Oct 09 '22 11:10

MrFlick
