R: Create a dummy variable if an id appears in more than one group

Question

I would like to create a dummy variable that takes the value 1 if an individual is observed in two or more different age groups and 0 otherwise.

Could someone show me how to do this and explain the approach?

A small example could be:

set.seed(123)
df <- data.frame(id = sample(1:10, 30, replace = TRUE),
                 agegroup = sample(c("5054", "5559", "6065"), 30, replace = TRUE))

The expected output is:

id  agegroup    dummy
 3     6065       1
 8     6065       1
 5     6065       1
 9     6065       1
10     5054       1
 1     5559       0
 6     6065       1
 9     5054       1
 6     5054       1
 5     5054       1
10     5054       1
 5     5559       1
 7     5559       1
 6     5559       1
 2     5054       1
 9     5054       1
 3     5054       1
 1     5559       0
 4     5054       0
10     6065       1
 9     5054       1
 7     5559       1
 7     6065       1
10     5054       1
 7     5559       1
 8     5054       1
 6     5054       1
 6     6065       1
 3     6065       1
 2     5559       1

MKR · Accepted Answer

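One option is to group the data by id with dplyr::group_by(id) and count the distinct agegroup values within each group using n_distinct(). Note that your data contains duplicate rows for some id and agegroup combinations, so the count has to be of distinct age groups rather than of rows.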

Edit: Updated with comments from @Henrik

library(dplyr)

# Within each id, set dummy to 1 if the id occurs in more than one distinct agegroup
df %>%
  group_by(id) %>%
  mutate(dummy = as.integer(n_distinct(agegroup) > 1))

# # A tibble: 30 x 3
# # Groups: id [10]
#      id agegroup dummy
#   <int> <fctr>   <int>
# 1     3 6065         1
# 2     8 6065         1
# 3     5 6065         1
# 4     9 6065         1
# 5    10 5054         1
# 6     1 5559         0
# 7     6 6065         1
# 8     9 5054         1
# 9     6 5054         1
# 10     5 5054         1
# # ... with 20 more rows
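
For readers who prefer to avoid a package dependency, the same grouping idea can be sketched in base R with ave(). This is only an illustrative alternative, not part of the accepted answer. The data frame `toy` and its values are hypothetical; a fixed toy example is used here because the output of sample() after set.seed(123) depends on the R version, so the question's data may not reproduce exactly.

```r
# Hypothetical toy data (not from the question):
# ids 1 and 3 stay in one age group, id 2 appears in two
toy <- data.frame(
  id       = c(1, 1, 2, 2, 3),
  agegroup = c("5054", "5054", "5054", "5559", "6065")
)

# Integer codes for the age groups, so ave() works on a numeric vector
codes <- as.integer(factor(toy$agegroup))

# Per id: number of distinct age groups, repeated for every row of that id
n_groups <- ave(codes, toy$id, FUN = function(x) length(unique(x)))

# 1 if the id is observed in two or more age groups, 0 otherwise
toy$dummy <- as.integer(n_groups > 1)

print(toy)
```

As in the dplyr version, duplicate id and agegroup rows do not affect the result, because only distinct age groups are counted per id.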

Tags: dataframe, r

maaas
