
R Find the "groups" of tuples [duplicate]

Tags:

r

I am trying to find the "group" (id3) based on two variables (id1, id2):

df = data.frame(id1 = c(1,1,2,2,3,3,4,4,5,5),
            id2 = c('a','b','a','c','c','d','x','y','y','z'),
            id3 = c(rep('group1',6), rep('group2',4)))


   id1 id2      id3
1    1   a   group1
2    1   b   group1
3    2   a   group1
4    2   c   group1
5    3   c   group1
6    3   d   group1
7    4   x   group2
8    4   y   group2
9    5   y   group2
10   5   z   group2   

For example, id1=1 is related to a and b in id2. But id1=2 is also related to a, so both belong to one group (id3=group1). Since id1=2 and id1=3 share id2=c, id1=3 also belongs to that group (id3=group1). The values in the tuple ((1,2,3),('a','b','c','d')) appear nowhere else, so no other row belongs to that group (which is labeled group1 generically).
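A minimal sketch of a sanity check for this definition, assuming the hand-labeled id3 from the example above: every id1 value and every id2 value should fall into exactly one group. The names groups_per_id1 and groups_per_id2 are illustrative only. Note that this only verifies that no group is split; it does not detect two unrelated groups being merged by mistake.

```r
df <- data.frame(id1 = c(1,1,2,2,3,3,4,4,5,5),
                 id2 = c('a','b','a','c','c','d','x','y','y','z'),
                 id3 = c(rep('group1',6), rep('group2',4)))

# Count how many distinct groups each id1 value and each id2 value appears in
groups_per_id1 <- tapply(df$id3, df$id1, function(g) length(unique(g)))
groups_per_id2 <- tapply(df$id3, df$id2, function(g) length(unique(g)))

# A consistent labeling puts every id1 and every id2 in exactly one group
stopifnot(all(groups_per_id1 == 1), all(groups_per_id2 == 1))
```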

If you need to take care of NAs, check this similar post

My idea was to create a table based on id3, which would subsequently be populated in a loop.

solution = data.frame(id3= c('group1', 'group2'),id1=NA, id2=NA)
group= 1 

for (step in c(1:1000)) { # run many steps to make sure to get all values
  solution$id1[group] = # populate  
  solution$id2[group] = # populate  

  if (fully populated) {
    group = group +1
  }} 

I am struggling to see how to populate.


Disclaimer: I asked a similar question here, but using names in id2 led a lot of people to point me to fuzzy string procedures in R, which are not needed here, since an exact solution exists. I also include all the code I have tried since then in this post.

safex asked Feb 18 '19 08:02


1 Answer

You can use igraph for this: treat each row as an edge between id1 and id2, and the groups are the connected components of the resulting graph.

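library(igraph)
# Each row becomes an undirected edge between an id1 vertex and an id2 vertex
g <- graph_from_data_frame(df, directed = FALSE)
# membership is named by vertex and gives each vertex's component number
cg <- components(g)$membership
# Look up by vertex name rather than position, so non-consecutive id1 values also work
df$id3 <- unname(cg[as.character(df$id1)])
df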

output:

   id1 id2 id3
1    1   a   1
2    1   b   1
3    2   a   1
4    2   c   1
5    3   c   1
6    3   d   1
7    4   x   2
8    4   y   2
9    5   y   2
10   5   z   2
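
If you would rather avoid the igraph dependency, the same grouping can be sketched in base R. This is only an illustration of the idea, not a tuned implementation: it starts with one label per row and repeatedly gives each row the smallest label found among rows sharing its id1 or its id2, until nothing changes. The variable names label and new_label are my own.

```r
df <- data.frame(id1 = c(1,1,2,2,3,3,4,4,5,5),
                 id2 = c('a','b','a','c','c','d','x','y','y','z'))

# Start with a distinct label for every row
label <- seq_len(nrow(df))
repeat {
  new_label <- ave(label, df$id1, FUN = min)      # merge rows sharing an id1
  new_label <- ave(new_label, df$id2, FUN = min)  # merge rows sharing an id2
  if (identical(new_label, label)) break          # no change: labels have converged
  label <- new_label
}

# Renumber the surviving labels as 1, 2, ... in order of first appearance
df$id3 <- match(label, unique(label))
df
```

Labels can only decrease on each pass, so the loop always terminates. On large data the graph-based approach is likely to be faster, since this sketch may need many passes for long chains of linked rows.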
chinsoon12 answered Nov 04 '22 22:11