Select first and last row from grouped data

Tags: r, dplyr

Question

Using dplyr, how do I select the top and bottom observations/rows of grouped data in one statement?

Data & Example

Given a data frame:

df <- data.frame(id=c(1,1,1,2,2,2,3,3,3),
                 stopId=c("a","b","c","a","b","c","a","b","c"),
                 stopSequence=c(1,2,3,3,1,4,3,1,2))

I can get the top and bottom observations from each group using slice, but using two separate statements:

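library(dplyr)

# first stop of each id, ordered by stopSequence
firstStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(1) %>%
  ungroup()

# last stop of each id, ordered by stopSequence
lastStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(n()) %>%
  ungroup()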

Can I combine these two statements into one that selects both top and bottom observations?

asked Jul 21 '15 01:07

tospig

1 Answer

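There is probably a faster way, but this does it in one statement: after sorting, keep the rows whose position within each group is 1 or n():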

df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  filter(row_number()==1 | row_number()==n())
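
Since the question already uses slice, a variant in the same style may be worth noting. This is only a sketch, not part of the original answer: it assumes slice accepts a vector of positions, and the name firstLast is made up for illustration. Sorting by id and stopSequence before grouping keeps the row order explicit, and unique() is there so that a group with a single row is not returned twice.

```r
library(dplyr)

df <- data.frame(id = c(1,1,1,2,2,2,3,3,3),
                 stopId = c("a","b","c","a","b","c","a","b","c"),
                 stopSequence = c(1,2,3,3,1,4,3,1,2))

# firstLast is a hypothetical name used only for this sketch
firstLast <- df %>%
  arrange(id, stopSequence) %>%   # order rows within each id
  group_by(id) %>%
  slice(unique(c(1, n()))) %>%    # positions 1 and n() of each group
  ungroup()

print(firstLast)
```

For the example data this should keep six rows, two per id: the stop with the lowest stopSequence followed by the stop with the highest.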

answered Sep 27 '22 19:09

jeremycg
