Cluster groups of 1s in a binary matrix

Question

I'm looking to create clusters around all 1s and 0s. Similar to Mindsweeper, I want to basically "draw a circle" around all 1s, and create a border where 0s exist.

I have tried using hclust() and creating a distance matrix, but the actual table I am working with is very large, and I have run into problems with run time.

test_matrix  <- matrix(c( 1,1,0,0,0,0,1,     
                          1,1,1,0,0,1,0,
                          0,1,0,0,0,1,0,
                          0,0,0,1,1,1,0,
                          0,0,0,1,1,1,1),nrow=5)

Result looks like this:

     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    1    0    0    1    0    1    0
[2,]    1    1    0    0    0    1    1
[3,]    0    1    1    0    0    0    1
[4,]    0    1    0    0    0    0    1
[5,]    0    1    0    1    1    0    1

My rules are as follows: If any 1 is connected to any 1 via UP, DOWN, LEFT, RIGHT, DIAGONAL(any direction), continue growing the "cluster". Based on these rules (8 points of connection for every point), I can spot four unique clusters with isolated 1s.

How would you code to find these groups?

AEF · Accepted Answer

I think clustering is the right approach here, but you choose a poor ( computationally expensive) method for the task. I would go for DBSCAN like this:

library(dbscan)

## slightly altered test matrix to include a "cluster" with a single 1
test_matrix  <- matrix(c( 1,1,0,0,0,0,1,     
                          1,1,1,0,0,1,0,
                          0,1,0,0,0,1,0,
                          0,0,0,1,1,1,0,
                          1,0,0,1,1,1,1),
                          nrow=5, byrow = TRUE)

## find rows and columns of 1s
ones_pos <- which(test_matrix > 0,arr.ind=TRUE)


## perform DBSCAN clustering
## setting eps = sqrt(2) + .1 corresponds to your neighbourhood definition
## setting minPts = 2 will mark clusters of one point as noise
clust <- dbscan(ones_pos, eps = sqrt(2), minPts = 2)

## find the indices of noise elements
singular_ones <- ones_pos[clust$cluster == 0, ]

singular_ones
#> row col 
#>  5   1

To find all clusters (including those that just consist of one 1) just set minPts to 1. In this case there can be no noise. The cluster membership is stored in clust$cluster.

I am quite certain this approach will also be quite fast with large matrices.

Cluster groups of 1s in a binary matrix

Tags:

r

binary

matrix

Kyle

1 Answers

AEF

Recent Activity

Donate For Us

Cluster groups of 1s in a binary matrix

Tags:

r

binary

matrix

Kyle

1 Answers

AEF

Related questions

Recent Activity

Donate For Us