In R, indicate whether another column has more than one unique value per ID

Question

Background

Here's a dataset, d:

d <- data.frame(ID = c("a","a","b","b"),                  
                product_code = c("B78","X31","C12","C12"),                  
                stringsAsFactors=FALSE)

It looks like this:

enter image description here

The Problem and Desired Output

I'm trying to make an indicator column multiple_products that's marked 1 for IDs which have more than one unique product_code and 0 for those that don't. Here's what I'm looking for:

what_i_want

My attempts haven't worked yet, though.

What I've Tried

Here's my current code:

d <- d %>% 
  group_by(ID) %>%
  mutate(multiple_products = if_else(length(unique(d$product_code)) > 1, 1, 0)) %>%
  ungroup()

And this is the result:

what_i_have

Any thoughts?

akrun · Accepted Answer

The d$ should be taken out as this will extract the whole column by removing the group attributes. Also, there is n_distinct. In addition, there is no need for ifelse or if_else as logical values (TRUE/FALSE) can be directly coerced to 1/0 as these are storage values by either using as.integer or +

library(dplyr)
d %>% 
  group_by(ID) %>%
  mutate(multiple_products = +(n_distinct(product_code) > 1)) %>%
  ungroup()

-output

# A tibble: 4 x 3
  ID    product_code multiple_products
  <chr> <chr>                    <int>
1 a     B78                          1
2 a     X31                          1
3 b     C12                          0
4 b     C12                          0

Samet Sökel · Answer

solution with data.table;

library(data.table)

setDT(d)

d[,multiple_products:=rleid(product_code),by=ID][
    ,multiple_products:=ifelse(max(multiple_products)>1,1,0),by=ID]
    
d

output;

  ID    product_code multiple_products
  <chr> <chr>                    <int>
1 a     B78                          1
2 a     X31                          1
3 b     C12                          0
4 b     C12                          0

In R, indicate whether another column has more than one unique value per ID

Tags:

r

dplyr

logjammin

2 Answers

akrun

Samet Sökel

Recent Activity

Donate For Us

In R, indicate whether another column has more than one unique value per ID

Tags:

r

dplyr

logjammin

2 Answers

akrun

Samet Sökel

Related questions

Recent Activity

Donate For Us