
Test all values against each other and form groups from resulting matrix

Tags: r, matrix, grouping

I feel as if I'm asking the wrong questions and trying to reinvent the wheel. What am I missing?

I have a bunch of values, let's say 8, that I need to test against each other. I have built a function that returns a matrix stating whether any two values are in the same group or not. For lack of a better idea, let me paste the output here:

data.text <-
"1     2     3     4     5     6     7     8
1  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
2  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
3  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
4 FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
5 FALSE FALSE FALSE FALSE  TRUE  TRUE    NA FALSE
6 FALSE FALSE FALSE FALSE  TRUE  TRUE    NA FALSE
7 FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
8 FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE"

data <- read.table(text=data.text, header = TRUE)
data <- as.matrix(data)
colnames(data) <- 1:8

Row 1 says that value 1 is in a group with itself (column 1) and with values 2 and 3, but not with values 4 to 8. Values 5 and 6 are in the same group as well.

I am trying to use this information to create individual group IDs and a vector of all elements in that group:

  • Group1: 1,2,3
  • Group2: 5,6

What I've done so far:

# row and column index of every TRUE value (which() skips NA entries)
groups <- which(data, arr.ind = TRUE)

# sort each row in ascending order so that duplicate pairs become identical
groups.sorted <- t(apply(groups, 1, sort))

# drop duplicate statements ("1 and 2", "2 and 1")
groups.unique <- unique(groups.sorted)

# drop obvious information ("1 and 1"); drop = FALSE keeps a matrix even if only one row remains
groups.real <- groups.unique[groups.unique[, 1] != groups.unique[, 2], , drop = FALSE]

At this point I'm stuck. How do I automate the fact that rows 1, 2 and 3 belong to the same group?

All in all, I feel I'm going at this rather clumsily. Can anybody point me to a more elegant way?

Ratnanil asked Dec 10 '22 19:12


1 Answer

I'd use the igraph package for this sort of thing:

library(igraph)
# treat the logical matrix as an adjacency matrix; each connected component is one group
components(graph_from_adjacency_matrix(data))$membership
#1 2 3 4 5 6 7 8 
#1 1 1 2 3 3 4 5

The result is a named vector: the names are the elements and the values are the IDs of the groups they belong to.
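To get from that membership vector to the lists the question asks for (Group1: 1,2,3 and Group2: 5,6), something like the following base R sketch might work. It hardcodes the membership vector printed above so that it runs without igraph, and it assumes that single-element groups should be dropped, since the question lists only the two multi-element groups. The `GroupN` names are just illustrative labels.

```r
# Membership vector as printed above: names are the elements,
# values are the component IDs (hardcoded here so igraph is not required).
membership <- c(`1` = 1, `2` = 1, `3` = 1, `4` = 2,
                `5` = 3, `6` = 3, `7` = 4, `8` = 5)

# Collect the element names of each component into a list.
groups <- split(names(membership), membership)

# Assumption: groups with a single member are not of interest, so drop them.
groups <- Filter(function(g) length(g) > 1, groups)

# Renumber the remaining groups consecutively (labels are illustrative).
names(groups) <- paste0("Group", seq_along(groups))

print(groups)
```

In practice, `membership` would come straight from `components(...)$membership` rather than being typed in by hand.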

nicola answered Jan 18 '23 03:01