
How to remove small communities using igraph in R?

I have created an igraph graph from my dataset "allgenes" and found community modules with the Louvain method.

gD <- igraph::simplify(igraph::graph.data.frame(allgenes, directed=FALSE))
lou <- cluster_louvain(gD)

When I plot the modules, I see several small communities that I would like to remove. How can I remove communities containing 5 nodes or fewer?

plot(lou, gD, vertex.label = NA, vertex.size=5, edge.arrow.size = .2)

[Image: plot with distinguished modules]

Alicia Sara Davis asked Jul 10 '18 18:07


2 Answers

Since you do not provide an example, I will illustrate with randomly generated data.

## First create an example like yours
library(igraph)
set.seed(123)
gD = erdos.renyi.game(50,0.05)
lou <- cluster_louvain(gD)
LO = layout_with_fr(gD)
plot(lou, gD, vertex.label = NA, vertex.size=5, 
    edge.arrow.size = .2, layout=LO)


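## Identify communities with 5 or fewer members (the cutoff asked for in the question)
## and take their ids from the table names rather than from table positions
Sizes = table(lou$membership)
Small = as.numeric(names(Sizes)[Sizes <= 5])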

## Which nodes should be kept?
Keep = V(gD)[!(lou$membership %in% Small)]

## Get subgraph & plot
gD2  = induced_subgraph(gD, Keep)
lou2 = cluster_louvain(gD2)
LO2 = LO[Keep,]
plot(lou2, gD2, vertex.label = NA, vertex.size=5, 
    edge.arrow.size = .2, layout=LO2)


The small communities have been removed.

G5W answered Nov 01 '22 20:11


If you want to remove small communities while keeping the remaining communities unchanged, you cannot build an induced subgraph from the vertices you want to keep and then cluster that subgraph again: re-clustering can easily produce communities that differ from the original ones.
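One way to see this effect is to compare the original labels of the kept vertices with the labels obtained by re-clustering. The following is a minimal sketch, assuming a random graph in the same style as the example used elsewhere on this page; the variable names are illustrative, and the amount of disagreement will vary with the graph and the seed.

```r
library(igraph)
set.seed(123)

# Random example graph (an assumption; substitute your own graph)
g  <- sample_gnp(50, 0.05)
cl <- cluster_louvain(g)

# Keep vertices whose community has more than 5 members
big  <- as.numeric(names(sizes(cl)[sizes(cl) > 5]))
keep <- which(membership(cl) %in% big)
stopifnot(length(keep) > 0)

g_sub <- induced_subgraph(g, keep)

# Original labels restricted to the kept vertices vs. labels from re-clustering
old_labels <- as.integer(membership(cl))[keep]
new_labels <- as.integer(membership(cluster_louvain(g_sub)))

# Cross-tabulate the two labelings; identical partitions have exactly
# one non-zero cell in every row and every column
print(table(original = old_labels, reclustered = new_labels))

# Normalized mutual information; a value of 1 means identical partitions
nmi <- compare(old_labels, new_labels, method = "nmi")
print(nmi)
```

If the cross-table has rows or columns with more than one non-zero cell, re-clustering has split or merged some of the original communities.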

A workable approach would be to manually subset the communities object.

Also, if you want to plot both the original graph with its communities and the reduced graph with the remaining communities, and keep the same colors in both plots, a couple of additional steps are needed.

suppressPackageStartupMessages(library(igraph))
set.seed(123)

g <- erdos.renyi.game(50, 0.05)
c <- cluster_louvain(g)
l <- layout_with_fr(g)
# keep communities with more than 5 members (the question removes those with 5 or fewer)
c_keep_ids <- as.numeric(names(sizes(c)[sizes(c) > 5]))
c_keep_v_idxs <- which(c$membership %in% c_keep_ids)

g_sub <- induced_subgraph(g, V(g)[c_keep_v_idxs])
# igraph has no direct functionality to subset community objects so hack it
c_sub <- c
c_sub$names <- c$names[c_keep_v_idxs]
c_sub$membership <- c$membership[c_keep_v_idxs]
# count vertices via membership: c$names is NULL when the graph has no vertex names
c_sub$vcount <- length(c_sub$membership)
c_sub$modularity <- modularity(g_sub, c_sub$membership, E(g_sub)$weight)

par(mfrow = c(1, 2))
plot(c, g,
  layout = l,
  vertex.label = NA,
  vertex.size = 5
)
plot(c_sub, g_sub,
  col = membership(c)[c_keep_v_idxs],
  layout = l[c_keep_v_idxs, ],
  mark.border = rainbow(length(communities(c)), alpha = 1)[c_keep_ids],
  mark.col = rainbow(length(communities(c)), alpha = 0.3)[c_keep_ids],
  vertex.label = NA,
  vertex.size = 5
)
par(mfrow = c(1, 1))

[Image: communities plots]
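A possible alternative to editing the communities object by hand is to store the original membership as a vertex attribute before subsetting, since `induced_subgraph` carries vertex attributes over to the subgraph. The sketch below assumes this is acceptable for your use case; `drop_small_communities` and its `min_size` argument are hypothetical names, not part of igraph.

```r
library(igraph)

# Hypothetical helper (not an igraph function): drop every community with
# fewer than `min_size` vertices and keep the original community ids as a
# vertex attribute called `community`.
drop_small_communities <- function(graph, comm, min_size = 6) {
  V(graph)$community <- as.integer(membership(comm))
  sz  <- sizes(comm)
  big <- as.integer(names(sz)[sz >= min_size])
  induced_subgraph(graph, which(V(graph)$community %in% big))
}

set.seed(123)
g     <- sample_gnp(50, 0.05)   # random example graph (an assumption)
cl    <- cluster_louvain(g)
g_big <- drop_small_communities(g, cl)

# Sizes of the surviving communities, still under their original ids
print(table(V(g_big)$community))
```

The stored ids can then drive the vertex colors, for example with `vertex.color = rainbow(length(cl))[V(g_big)$community]`, so that the colors line up with a plot of the full graph that uses the same palette. Unlike the subsetted communities object, this does not draw the shaded community outlines.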

hermidalc answered Nov 01 '22 21:11