I have the following dataset:
dataset <- data.frame(id = c("A","A","A","A","B","B","B","B"),
                      value = c(1,1,2,3,5,6,6,7))
For every value that is duplicated within an id, I want to flag each row where it occurs, and the flag should have the same length as the source data frame. This is the expected result:
id    value    flag
A     1        1
A     1        1
A     2        0
A     3        0
B     5        0
B     6        1
B     6        1
B     7        0
Is there a way to do this without a for loop? Any help will be greatly appreciated.
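We can use duplicated with and without fromLast = TRUE to mark every repeated value as 1. On its own, duplicated flags only the second and later occurrences; combining it with the fromLast = TRUE call also flags the first one.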
dataset$flag <- as.integer(duplicated(dataset$value) | 
                           duplicated(dataset$value, fromLast = TRUE))
dataset
#  id value flag
#1  A     1    1
#2  A     1    1
#3  A     2    0
#4  A     3    0
#5  B     5    0
#6  B     6    1
#7  B     6    1
#8  B     7    0
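Note that the code above looks at value alone, so a value shared by two different ids would also be flagged. If a repeat should only count within the same id, one possible sketch is to pass both columns to duplicated, which compares whole rows when given a data frame. The helper name key is only illustrative, and this assumes the columns are named id and value as in the question:

```r
# Same data as in the question
dataset <- data.frame(id = c("A","A","A","A","B","B","B","B"),
                      value = c(1,1,2,3,5,6,6,7))

# duplicated() on a data frame compares entire rows, so an (id, value)
# pair is treated as repeated only when both columns match
key <- dataset[c("id", "value")]   # hypothetical helper name
dataset$flag <- as.integer(duplicated(key) |
                           duplicated(key, fromLast = TRUE))
```

For the example data this gives the same flags as the value-only version, because no value appears under more than one id.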