k_th smallest element per group in R

Question

I have a data table, dt, that looks like

   location year value
       NYC 2026     1
       NYC 2026     2
       NYC 2026     3
       NYC 2026     4
       NYC 2026     5
        LA 2026     6
        LA 2026     7
        LA 2026     8
        LA 2026     9
        LA 2026    10

I would like to group them by city and year and find the 2nd smallest element in the column value, per each group, where the desired result looks like:

   location year value
        NYC  2026     2
         LA  2026     7

dt %>% grou_by(location, year) %>% nth(value, 2)

Would not work. Any help is appreciated.

The data table above can be created by:

dt <- structure(list(location = c("NYC", "NYC", "NYC","NYC", "NYC", 
                                   "LA", "LA", "LA", "LA", "LA"), 
                 year = c(2026, 2026, 2026, 2026, 2026,
                          2026, 2026, 2026, 2026, 2026),
                 value = c(1, 2, 3, 4, 5,
                           6, 7, 8, 9, 10)),
                 class = "data.table", 
                 row.names = c(NA, -10L))

tmfmnk · Accepted Answer

One dplyr possibility could be:

df %>%
 group_by(location) %>%
 arrange(value) %>%
 slice(2)

Here it groups by "location" column, arranges the values according "value" column and then keeps the second element.

  location  year value
  <chr>    <int> <int>
1 LA        2026     7
2 NYC       2026     2

Or if the values in "value" column could be duplicate, then you can do:

df %>%
 group_by(location) %>%
 distinct(value, .keep_all = TRUE) %>%
 arrange(value) %>%
 slice(2)

Or using filter() instead of slice():

df %>%
 group_by(location) %>%
 arrange(value) %>%
 filter(row_number() == 2)

The same considering also possible duplicates:

df %>%
 group_by(location) %>%
 distinct(value, .keep_all = TRUE) %>%
 arrange(value) %>%
 filter(row_number() == 2)

Or using filter() and dense_rank():

df %>%
 group_by(location) %>%
 filter(dense_rank(value) == 2)

The same considering also possible duplicates:

df %>%
 group_by(location) %>%
 distinct(value, .keep_all = TRUE) %>%
 filter(dense_rank(value) == 2)

liuminzhao · Answer

Using summarize to work with group_by:

> dt %>% group_by(location, year) %>% arrange(value) %>%  summarize(value = nth(value, 2))
# A tibble: 2 x 3
# Groups:   location [2]
  location  year value
  <chr>    <dbl> <dbl>
1 LA        2026     7
2 NYC       2026     2

k_th smallest element per group in R

Tags:

sorting

r

OverFlow Police

2 Answers

tmfmnk

liuminzhao

Recent Activity

Donate For Us

k_th smallest element per group in R

Tags:

sorting

r

OverFlow Police

2 Answers

tmfmnk

liuminzhao

Related questions

Recent Activity

Donate For Us