
Tricky grouping of pairs from columns to rows

Tags: r, tidyverse

I have a table of pairs, where each pair represents genetically identical individuals. I use letters to label the individuals. For example, a, x, y and b are all the same individual.

Mate1    Mate2
a        x
a        y
b        y
c        z
d        l
d        j
d        m
j        n
f        o
f        p
f        q
f        r

As you can see, Mate1 can have multiple matches in Mate2, and vice versa. I would like to obtain this:

Mate1    Mate2    Mate3    Mate4    Mate5
a        x        y        b
c        z
d        l        m        j        n
f        o        p        q        r

The idea is that I want one row per group of individuals, but this sometimes means linking pairs through Mate1 or Mate2 several times. For example, a is linked to b through the intermediate y. In my real dataset there could be many more intermediates like y. I would like all members of a group to be in one row (or, if it is easier, a new column with a 'group' ID).

Any ideas of how to do that? Many thanks!

I have already tried many combinations of tidyverse functions such as spread, unite and group_by, but without success. I struggle to get something robust and complete.

Nonopov asked Mar 04 '23 10:03



1 Answer

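You can use the igraph package for this task. Treat each pair as an edge of an undirected graph; each group of identical individuals is then a connected component:

library(igraph)
sort(components(graph_from_data_frame(df, directed = FALSE))$membership)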

a b x y c z d j l m n f o p q r 
1 1 1 1 2 2 3 3 3 3 3 4 4 4 4 4 
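
For readers without igraph, the same idea (labelling connected components) can be sketched in base R with a small union-find. This is only an illustration under stated assumptions, not how igraph implements it: the data frame df is typed in from the question's table, and group_pairs() and find_root() are hypothetical helper names. It also adds the 'group' ID column mentioned in the question.

```r
# Example data copied from the question (assumed to be stored in `df`)
df <- data.frame(
  Mate1 = c("a", "a", "b", "c", "d", "d", "d", "j", "f", "f", "f", "f"),
  Mate2 = c("x", "y", "y", "z", "l", "j", "m", "n", "o", "p", "q", "r"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label connected components with a simple union-find
group_pairs <- function(from, to) {
  ids <- unique(c(from, to))
  parent <- setNames(ids, ids)            # every individual starts as its own root

  find_root <- function(x) {
    while (parent[[x]] != x) x <- parent[[x]]
    x
  }

  for (i in seq_along(from)) {
    ra <- find_root(from[i])
    rb <- find_root(to[i])
    if (ra != rb) parent[[rb]] <- ra      # merge the two groups
  }

  roots <- vapply(ids, find_root, character(1))
  # Number the groups in order of first appearance, named by individual
  setNames(match(roots, unique(roots)), ids)
}

membership <- group_pairs(df$Mate1, df$Mate2)

# Group ID column on the original pairs (both mates share a group by construction)
df$group <- unname(membership[df$Mate1])

# One character vector of members per group
groups <- split(names(membership), membership)
```

With the example data, groups holds four vectors (a, b, x, y; c, z; d, j, l, m, n; f, o, p, q, r). For large datasets the igraph call above is likely the better choice, since this sketch does no path compression.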

If you want to further match your desired output, you can add dplyr and tidyr:

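library(igraph)
library(dplyr)
library(tidyr)
library(tibble)  # for enframe()

# Named vector: individual -> group number
pairs <- sort(components(graph_from_data_frame(df, directed = FALSE))$membership)

pairs %>%
 enframe() %>%                                  # columns: name, value (= group)
 group_by(value) %>%
 mutate(variable = paste0("Mate", 1:n())) %>%   # number the members within each group
 ungroup() %>%
 spread(variable, name) %>%                     # one column per member position
 select(-value)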

  Mate1 Mate2 Mate3 Mate4 Mate5
  <chr> <chr> <chr> <chr> <chr>
1 a     b     x     y     <NA> 
2 c     z     <NA>  <NA>  <NA> 
3 d     j     l     m     n    
4 f     o     p     q     r   
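
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). A minimal sketch of the same reshaping with pivot_wider() follows. To keep it self-contained, the membership vector is typed in from the igraph output shown earlier rather than recomputed, and the column name group and the object name wide are my own choices.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Membership vector copied from the igraph output above (individual -> group)
membership <- c(a = 1, b = 1, x = 1, y = 1, c = 2, z = 2,
                d = 3, j = 3, l = 3, m = 3, n = 3,
                f = 4, o = 4, p = 4, q = 4, r = 4)

wide <- enframe(membership, name = "name", value = "group") %>%
  group_by(group) %>%
  mutate(variable = paste0("Mate", row_number())) %>%  # position within the group
  ungroup() %>%
  pivot_wider(names_from = variable, values_from = name) %>%
  select(-group)

wide
```

Groups with fewer than five members are padded with NA, as in the spread() version.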
tmfmnk answered Mar 19 '23 05:03
