R, dplyr - combination of group_by() and arrange() does not produce expected result?

Tags: r, dplyr

When I call the dplyr function group_by() and then immediately call arrange(), I expect the output to be ordered within the groups named in group_by(). My reading of the documentation is that this combination should produce that result. When I tried it, though, that is not what I got, and googling did not turn up other people running into the same issue. Am I wrong to expect this result?

Here is an example, using the R built-in dataset ToothGrowth:

library(dplyr)
ToothGrowth %>%
  group_by(supp) %>%
  arrange(len)

Running this produces a data frame that is ordered by len across all rows, not within each level of the supp factor.
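The behaviour described above can be checked directly. This is only a minimal sketch, assuming a reasonably recent dplyr is installed; the names `sorted` and `is_globally_sorted` are made up for illustration.

```r
library(dplyr)

# Sort after grouping, without asking arrange() to respect the groups
sorted <- ToothGrowth %>%
  group_by(supp) %>%
  arrange(len)

# The grouping is still attached to the result ...
group_vars(sorted)

# ... but len is non-decreasing over the whole data frame,
# so rows from the two supp levels end up interleaved
is_globally_sorted <- !is.unsorted(sorted$len)
is_globally_sorted
```

So group_by() still records the grouping, but arrange() does not use it for ordering unless told to.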

This is the code that produces the desired output:

ToothGrowth %>%
  group_by(supp) %>%
  do(data.frame(with(data = ., .[order(len), ])))

Hrvoje asked Jul 09 '14 09:07



1 Answer

You can produce the expected behaviour by setting .by_group = TRUE in arrange:

library(dplyr)
ToothGrowth %>%
  group_by(supp) %>%
  arrange(len, .by_group = TRUE)
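One way to confirm the effect is to test the ordering inside each group. The sketch below assumes a dplyr version that supports the .by_group argument; `by_group`, `sorted_within`, `explicit` and `same_order` are illustrative names only. It also compares the result with an ungrouped arrange() that lists the grouping column first, which should give the same row order.

```r
library(dplyr)

by_group <- ToothGrowth %>%
  group_by(supp) %>%
  arrange(len, .by_group = TRUE)

# Within each supp level, len should now be non-decreasing
sorted_within <- by_group %>%
  summarise(sorted = !is.unsorted(len))
sorted_within

# Listing the grouping column first in an ungrouped arrange()
# should put the rows in the same order
explicit <- ToothGrowth %>% arrange(supp, len)
same_order <- identical(by_group$supp, explicit$supp) &&
  identical(by_group$len, explicit$len)
same_order
```

If the grouping is not needed afterwards, arrange(supp, len) on the ungrouped data may be the simpler choice. The .by_group = TRUE form keeps the result grouped for later steps.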

David Rubinger answered Oct 07 '22 11:10