
Split a string and check if all elements are unique in R

Tags: string, split, r

I am trying to split a string and then check if all its elements are unique. So far I have done the following:

df <- data.frame(
  id = c("001;002;003", "001;001;001", "001;001;002"),
  v1 = c("a", "b", "c")
)

sapply(strsplit(df$id, ";"), function(x) unique(x) %in% df$id)

My intended output is FALSE, TRUE, and FALSE for "001;002;003", "001;001;001", and "001;001;002", respectively. In other words, the result should be TRUE only when every element of the split string is the same value.

Thanks in advance!!

monteromati asked Mar 01 '23 11:03


2 Answers

We can check whether the number of unique elements in each split string is 1:

sapply(strsplit(df$id, ";"), function(x) length(unique(x)) == 1)
[1] FALSE  TRUE FALSE
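
A minimal sketch of the same check written with vapply, which enforces exactly one logical value per row, with the result stored as a new column. The column name all_same is an assumption made for illustration and does not appear in the question.

```r
df <- data.frame(
  id = c("001;002;003", "001;001;001", "001;001;002"),
  v1 = c("a", "b", "c"),
  stringsAsFactors = FALSE  # strsplit() needs character input, not factors
)

# Split each id on a literal ";" and test whether one distinct value remains.
parts <- strsplit(df$id, ";", fixed = TRUE)

# all_same is a hypothetical column name used only for this sketch.
df$all_same <- vapply(parts, function(x) length(unique(x)) == 1, logical(1))

df$all_same
```

Setting stringsAsFactors = FALSE only matters on R versions before 4.0, where data.frame converted strings to factors by default and strsplit would reject the column.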

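Or with n_distinct from dplyr, after splitting each id into one row per element with separate_rows from tidyr:

library(dplyr)
library(tidyr)
df %>%
  separate_rows(id, sep = ";") %>%   # one row per element of id
  group_by(v1) %>%                   # v1 identifies the original row here
  summarise(flag = n_distinct(id) == 1)

Output: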

# A tibble: 3 x 2
  v1    flag 
  <chr> <lgl>
1 a     FALSE
2 b     TRUE 
3 c     FALSE
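
The pipeline above groups by v1, which works because v1 happens to be different in every row of the example. A hedged sketch for the case where v1 may repeat: it tags each original row with an index before splitting. The column name row and the modified v1 values are assumptions introduced for illustration only.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  id = c("001;002;003", "001;001;001", "001;001;002"),
  v1 = c("a", "b", "b"),  # hypothetical data: v1 repeats in rows 2 and 3
  stringsAsFactors = FALSE
)

res <- df %>%
  mutate(row = row_number()) %>%       # hypothetical helper column
  separate_rows(id, sep = ";") %>%     # one row per element of id
  group_by(row, v1) %>%                # group by original row, not v1 alone
  summarise(flag = n_distinct(id) == 1, .groups = "drop")

res$flag
```

Grouping by v1 alone in this modified data would pool the elements of rows 2 and 3 into a single group, so the result for "001;001;001" would be lost.
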
akrun answered Mar 04 '23 00:03


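Or using table, which counts the occurrences of each distinct value, so a result of length 1 means every element is the same: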

sapply(strsplit(df$id, ";"), function(x) length(table(x)) == 1)
[1] FALSE  TRUE FALSE
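
The title's wording, "all elements are unique", can also be read literally as "no element repeats", which is a different test from the one the expected output describes. A small sketch contrasting the two readings on the question's strings; the names all_same and all_distinct are illustrative assumptions.

```r
ids <- c("001;002;003", "001;001;001", "001;001;002")
parts <- strsplit(ids, ";", fixed = TRUE)

# Reading used in the answers: every piece is the same value.
all_same <- vapply(parts, function(x) length(unique(x)) == 1, logical(1))

# Literal reading of "unique": no piece occurs more than once.
# anyDuplicated() returns 0 when a vector has no duplicated entries.
all_distinct <- vapply(parts, function(x) anyDuplicated(x) == 0, logical(1))

all_same      # FALSE  TRUE FALSE
all_distinct  #  TRUE FALSE FALSE
```

The third string, "001;001;002", is FALSE under both readings because it is neither all identical nor all distinct.
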
Martin Gal answered Mar 04 '23 02:03