Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Check if the value exists in any other columns with Tidyverse

Sample data

df <- diag(1:8) %>% 
  as_tibble() %>%  
  mutate(A = c(10, 5, 11, 4, 6, 9, 65, 8))

# A tibble: 8 x 9
     V1    V2    V3    V4    V5    V6    V7    V8     A
  <int> <int> <int> <int> <int> <int> <int> <int> <dbl>
1     1     0     0     0     0     0     0     0    10
2     0     2     0     0     0     0     0     0     5
3     0     0     3     0     0     0     0     0    11
4     0     0     0     4     0     0     0     0     4
5     0     0     0     0     5     0     0     0     6
6     0     0     0     0     0     6     0     0     9
7     0     0     0     0     0     0     7     0    65
8     0     0     0     0     0     0     0     8     8

The desired output would be

# A tibble: 8 x 10
     V1    V2    V3    V4    V5    V6    V7    V8     A result
  <int> <int> <int> <int> <int> <int> <int> <int> <dbl> <lgl> 
1     1     0     0     0     0     0     0     0    10 FALSE 
2     0     2     0     0     0     0     0     0     5 TRUE  
3     0     0     3     0     0     0     0     0    11 FALSE 
4     0     0     0     4     0     0     0     0     4 TRUE  
5     0     0     0     0     5     0     0     0     6 TRUE  
6     0     0     0     0     0     6     0     0     9 FALSE 
7     0     0     0     0     0     0     7     0    65 FALSE 
8     0     0     0     0     0     0     0     8     8 TRUE

AND/OR ... A takes the value that is non-zero value in any other columns, such that the output would be:

# A tibble: 8 x 9
     V1    V2    V3    V4    V5    V6    V7    V8     A
  <int> <int> <int> <int> <int> <int> <int> <int> <int>
1     1     0     0     0     0     0     0     0     1
2     0     2     0     0     0     0     0     0     2
3     0     0     3     0     0     0     0     0     3
4     0     0     0     4     0     0     0     0     4
5     0     0     0     0     5     0     0     0     5
6     0     0     0     0     0     6     0     0     6
7     0     0     0     0     0     0     7     0     7
8     0     0     0     0     0     0     0     8     8

How can this be achieved with tidyverse?

like image 287
HoelR Avatar asked Sep 11 '25 08:09

HoelR


2 Answers

Perhaps we can try

> df %>% mutate(result = A %in% unlist(select(., !A)))
# A tibble: 8 × 10
     V1    V2    V3    V4    V5    V6    V7    V8     A result
  <int> <int> <int> <int> <int> <int> <int> <int> <dbl> <lgl>
1     1     0     0     0     0     0     0     0    10 FALSE
2     0     2     0     0     0     0     0     0     5 TRUE
3     0     0     3     0     0     0     0     0    11 FALSE
4     0     0     0     4     0     0     0     0     4 TRUE
5     0     0     0     0     5     0     0     0     6 TRUE
6     0     0     0     0     0     6     0     0     9 FALSE
7     0     0     0     0     0     0     7     0    65 FALSE
8     0     0     0     0     0     0     0     8     8 TRUE
like image 161
ThomasIsCoding Avatar answered Sep 13 '25 21:09

ThomasIsCoding


You may try

library(dplyr)
df %>%
  mutate(result = rowSums(across(everything() & - "A", ~A %in% .x)) > 0)

     V1    V2    V3    V4    V5    V6    V7    V8     A result
  <int> <int> <int> <int> <int> <int> <int> <int> <dbl> <lgl> 
1     1     0     0     0     0     0     0     0    10 FALSE 
2     0     2     0     0     0     0     0     0     5 TRUE  
3     0     0     3     0     0     0     0     0    11 FALSE 
4     0     0     0     4     0     0     0     0     4 TRUE  
5     0     0     0     0     5     0     0     0     6 TRUE  
6     0     0     0     0     0     6     0     0     9 FALSE 
7     0     0     0     0     0     0     7     0    65 FALSE 
8     0     0     0     0     0     0     0     8     8 TRUE  

If starts with

     V1    V2    V3    V4    V5    V6    V7    V8     A
  <int> <int> <int> <int> <int> <int> <int> <int> <dbl>
1     1     0     0     0     0     0     0     0    10
2     0     2     0     0     0     0     0     0     5
3     0     0     3     0     0     0     0     0    11
4     0     0     0     4     0     0     0     0     4
5     0     0     0     0     5     0     0     0     6
6     0     0     0     0     0     6     0     0     9
7     0     0     0     0     0     0     7     0    65
8     0     0     0     0     0     0     0     8     8

then

df %>%
  mutate(A = rowSums(.) - A)

If starts with

     V1    V2    V3    V4    V5    V6    V7    V8
  <int> <int> <int> <int> <int> <int> <int> <int>
1     1     0     0     0     0     0     0     0
2     0     2     0     0     0     0     0     0
3     0     0     3     0     0     0     0     0
4     0     0     0     4     0     0     0     0
5     0     0     0     0     5     0     0     0
6     0     0     0     0     0     6     0     0
7     0     0     0     0     0     0     7     0
8     0     0     0     0     0     0     0     8

then

df %>%
  mutate(A = rowSums(.))
like image 30
Park Avatar answered Sep 13 '25 22:09

Park