I am trying to make an id column using dplyr's group_by() and cur_group_id() functions. This works, but I would like the cur_group_id() counter to 'restart' based on one of the grouping variables.
Example data:
df <- data.frame(
X = c(1,1,1,1,1,2),
Y = c(1,1,1,2,2,3),
Z = c(1,1,2,3,3,4)
)
# which looks like this
df
X Y Z
1 1 1
1 1 1
1 1 2
1 2 3
1 2 3
2 3 4
My current code and output:
library(dplyr)
library(magrittr)
df %<>%
group_by(X, Y, Z) %>%
mutate(ID = cur_group_id()) %>%
ungroup()
df
X Y Z ID
1 1 1 1
1 1 1 1
1 1 2 2
1 2 3 3
1 2 3 3
2 3 4 4
However, I would like the ID counter to restart as soon as it hits a new value of X like this:
df
X Y Z ID
1 1 1 1
1 1 1 1
1 1 2 2
1 2 3 3
1 2 3 3
2 3 4 1
Is there a way to solve this nicely? Thank you in advance.
Since you want to restart ID for each X, you can group_by X and create a unique id for each unique combination of Y and Z.
library(dplyr)
df %>%
group_by(X) %>%
mutate(ID = match(paste(Y, Z), unique(paste(Y, Z))))
# X Y Z ID
# <dbl> <dbl> <dbl> <int>
#1 1 1 1 1
#2 1 1 1 1
#3 1 1 2 2
#4 1 2 3 3
#5 1 2 3 3
#6 2 3 4 1
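To see why match() against unique() produces a restarting counter, here is a small sketch on the X == 1 rows alone. The vector names key and id are only illustrative; the values are taken from the example data.

```r
# Keys for the X == 1 rows of the example data, built the same way as paste(Y, Z)
key <- paste(c(1, 1, 1, 2, 2), c(1, 1, 2, 3, 3))

# unique() keeps the distinct keys in order of first appearance
unique(key)

# match() returns, for each key, its position among the distinct keys,
# so repeated (Y, Z) pairs share an id and new pairs get the next integer
id <- match(key, unique(key))
id
```

Because this runs separately inside each group of X, the numbering starts again at 1 for every new value of X. Note that ids follow order of first appearance within the group, not sorted order.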
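In base R, you can use ave similarly:
# ave() returns a vector of the same type as its first argument (character here),
# so convert the result to integer
df$ID <- with(df, as.integer(ave(paste(Y, Z), X, FUN = function(x) match(x, unique(x)))))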
If you want to use cur_group_id() specifically, you can split the data for each value of X and apply cur_group_id() to each data frame.
df %>%
group_split(X) %>%
purrr::map_df(~.x %>% group_by(Y, Z) %>% mutate(ID = cur_group_id()) %>% ungroup())
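Another possible way to keep cur_group_id() without splitting is sketched below. This is not from the answer above. It relies on the assumption that group_by(X, Y, Z) numbers its groups in sorted key order with X first, so the global ids inside each X form a consecutive run that can be shifted to start at 1. The name res is only illustrative.

```r
library(dplyr)

df <- data.frame(
  X = c(1, 1, 1, 1, 1, 2),
  Y = c(1, 1, 1, 2, 2, 3),
  Z = c(1, 1, 2, 3, 3, 4)
)

res <- df %>%
  group_by(X, Y, Z) %>%
  mutate(ID = cur_group_id()) %>%     # global id across all (X, Y, Z) groups
  group_by(X) %>%                     # regroup by X only
  mutate(ID = ID - min(ID) + 1L) %>%  # shift each X's run of ids to start at 1
  ungroup()

res
```

Unlike the match()/unique() approach, the ids here follow the sorted order of Y and Z within each X rather than order of first appearance. For the example data both give the same result.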