How do I "restart" cur_group_id() in R [duplicate]

Tags: r, dplyr

I am trying to make an ID column using dplyr's group_by() and cur_group_id() functions. This works, but I would like the cur_group_id() counter to 'restart' based on one of the grouping variables.

Example data:

df <- data.frame(
    X = c(1,1,1,1,1,2),
    Y = c(1,1,1,2,2,3),
    Z = c(1,1,2,3,3,4)
)
# which looks like this
df
X  Y  Z
1  1  1
1  1  1
1  1  2
1  2  3
1  2  3
2  3  4

My current code and output:

library(dplyr)
library(magrittr)
df %<>% 
    group_by(X, Y, Z) %>%
    mutate(ID = cur_group_id()) %>%
    ungroup()

df
X  Y  Z  ID
1  1  1  1
1  1  1  1
1  1  2  2
1  2  3  3
1  2  3  3
2  3  4  4

However, I would like the ID counter to restart as soon as it hits a new value of X like this:

df
X  Y  Z  ID
1  1  1  1
1  1  1  1
1  1  2  2
1  2  3  3
1  2  3  3
2  3  4  1

Is there a way to solve this nicely? Thank you in advance.

asked Jan 24 '23 17:01 by Slaatje


1 Answer

Since you want ID to restart for each X, you can group by X only and assign a number to each unique combination of Y and Z within that group.

library(dplyr)
df %>%
  group_by(X) %>%
  mutate(ID = match(paste(Y, Z), unique(paste(Y, Z))))

#    X     Y     Z    ID
#  <dbl> <dbl> <dbl> <int>
#1     1     1     1     1
#2     1     1     1     1
#3     1     1     2     2
#4     1     2     3     3
#5     1     2     3     3
#6     2     3     4     1
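
To see why this works, here is a minimal sketch of the match()/unique() idiom on a plain character vector. The keys vector is made up for illustration; it stands in for what paste(Y, Z) produces inside a single X group.

```r
# Hypothetical keys, as paste(Y, Z) would produce them within one group
keys <- c("1 1", "1 1", "1 2", "2 3", "2 3")

# unique() keeps the first occurrence of each key, in order of appearance
first_seen <- unique(keys)

# match() returns, for each key, its position in first_seen,
# which is exactly a 1-based ID numbered by order of first appearance
ids <- match(keys, first_seen)
ids
# 1 1 2 3 3
```

Because the numbering is computed inside mutate() on data grouped by X, it starts again at 1 for every value of X.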

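In base R, you can use ave similarly. Note that ave() returns a vector of the same type as its first argument (character here, because of paste), so wrap the result in as.integer() to get a numeric ID:

df$ID <- with(df, as.integer(ave(paste(Y, Z), X, FUN = function(x) match(x, unique(x)))))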

If you want to use cur_group_id() specifically, you can split the data by each value of X and apply cur_group_id() to each data frame.

df %>%
  group_split(X) %>%
  purrr::map_df(~.x %>% group_by(Y, Z) %>% mutate(ID = cur_group_id()))
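
One difference between the two approaches is worth checking on your own data: match()/unique() numbers the (Y, Z) pairs in order of first appearance, whereas cur_group_id() follows the sorted order of the grouping keys. On the example data the two agree, but they can differ when rows are not sorted. A small sketch, where df2 is hypothetical data made up for illustration:

```r
library(dplyr)

# Hypothetical data: within X == 1, the pair (2, 3) appears before (1, 1)
df2 <- data.frame(
  X = c(1, 1, 2),
  Y = c(2, 1, 3),
  Z = c(3, 1, 4)
)

# IDs numbered by order of first appearance within each X
by_appearance <- df2 %>%
  group_by(X) %>%
  mutate(ID = match(paste(Y, Z), unique(paste(Y, Z)))) %>%
  ungroup()

# IDs numbered by sorted (Y, Z) within each X
by_sorted <- df2 %>%
  group_split(X) %>%
  purrr::map_df(~ .x %>%
    group_by(Y, Z) %>%
    mutate(ID = cur_group_id()) %>%
    ungroup())

by_appearance$ID
# 1 2 1
by_sorted$ID
# 2 1 1
```

If the two need to match, sorting the data by X, Y and Z first (for example with arrange()) should make them agree.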

answered Jan 29 '23 20:01 by Ronak Shah