In R dplyr, why do I need to ungroup() after I count()?

Tags: r, group-by, lapply, sapply, dplyr

When I first started programming in R, I would often use dplyr's count():

library(tidyverse)
mtcars %>% count(cyl)

Once I started using the apply family of functions, I began running into issues with count(). Simply adding ungroup() after each count() call made the problems go away.

I don't have a reproducible example to show. But can somebody explain what the issue likely was, why ungroup() always fixed it, and whether there are any drawbacks to consistently using ungroup() after every count() or group_by()? I'm assuming, of course, that I no longer need the data grouped after it has been counted or summarized.

mtcars %>% count(cyl) %>% ungroup()

asked Jul 18 '18 14:07

stackinator

1 Answer

The issues you used to run into were from an old behavior of count(). Up to dplyr 0.5.0, if you did:

mtcars %>%
  count(cyl, wt)

The result would still be grouped by the cyl column. This means, for example, that if you followed it with something like summarize(mean(n)), you would have gotten one row for each cyl when you may have expected one row overall. Putting %>% ungroup() after the count fixed the issue.
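As a rough illustration, the old result can be approximated on a current dplyr with group_by() followed by summarize(), which also leaves the cyl grouping in place. This is a sketch, not the 0.5.0 code path itself: the names old_style, per_cyl and overall are made up for the example, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Approximate the old count(cyl, wt) result: summarize() drops only the last
# grouping variable (wt), so the table stays grouped by cyl.
# .groups = "drop_last" spells out that default (assumes dplyr >= 1.0.0).
old_style <- mtcars %>%
  group_by(cyl, wt) %>%
  summarize(n = n(), .groups = "drop_last")

# Still grouped by cyl: one row per cyl value (4, 6 and 8 in mtcars)
per_cyl <- old_style %>% summarize(mean_n = mean(n))
nrow(per_cyl)   # 3

# Ungrouped first: a single overall row
overall <- old_style %>% ungroup() %>% summarize(mean_n = mean(n))
nrow(overall)   # 1
```

The only difference between the two pipelines is the ungroup() call, which is why adding it made the surprising per-group results disappear.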

This behavior was changed in dplyr 0.7.0 (released in June 2017), such that count() preserves the grouping of its input. Since mtcars itself is ungrouped, mtcars %>% count(wt, cyl) now returns an ungrouped table. This is likely why you're no longer able to reproduce the problems, and it means you no longer need to call ungroup() after a count() on ungrouped data.
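One way to check this on your own installation is to inspect the grouping of the result directly. The sketch below assumes dplyr 0.7.0 or later; the variable names are illustrative only.

```r
library(dplyr)

# Ungrouped input: count() returns an ungrouped result
ungrouped_counts <- mtcars %>% count(cyl, wt)
is_grouped_df(ungrouped_counts)   # FALSE

# Grouped input: the original grouping (cyl) is kept on the result
grouped_counts <- mtcars %>% group_by(cyl) %>% count(wt)
group_vars(grouped_counts)        # "cyl"
```

So an ungroup() is only still relevant after count() when the data was already grouped before the call.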

Note that you may still need to do ungroup() after a group_by() and summarize():

mtcars %>%
  group_by(cyl, wt) %>%
  summarize(n = n())

returns a tibble still grouped by cyl, because summarize() removes only the last grouping variable (wt):

# A tibble: 30 x 3
# Groups:   cyl [?]
     cyl    wt     n
   <dbl> <dbl> <int>
 1     4  1.51     1
 2     4  1.62     1
 3     4  1.84     1
 4     4  1.94     1
 5     4  2.14     1
 6     4  2.2      1
 7     4  2.32     1
 8     4  2.46     1
 9     4  2.78     1
10     4  3.15     1
# ... with 20 more rows
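A minimal sketch of two ways to end up with an ungrouped summary follows. The second option relies on the .groups argument of summarize(), which assumes dplyr 1.0.0 or later; the names by_cyl_wt, a and b are placeholders for the example.

```r
library(dplyr)

by_cyl_wt <- mtcars %>% group_by(cyl, wt)

# Option 1: explicit ungroup() after summarize()
a <- by_cyl_wt %>%
  summarize(n = n()) %>%
  ungroup()

# Option 2: drop all grouping inside summarize() (assumes dplyr >= 1.0.0)
b <- by_cyl_wt %>%
  summarize(n = n(), .groups = "drop")

# Neither result carries any grouping variables
group_vars(a)
group_vars(b)
```

On recent dplyr versions, Option 1 also prints a message about how the output was regrouped; passing .groups explicitly silences it.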

answered Oct 07 '22 23:10

David Robinson
