Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Community detection with InfoMap algorithm producing one massive module

Tags:

r

igraph

sna

I am using the InfoMap algorithm in the igraph package to perform community detection on a directed and non-weighted graph (34943 vertices, 206366 edges). In the graph, vertices represent websites and edges represent the existence of a hyperlink between websites.

A problem I have encountered after running the algorithm is that the majority of vertices have a membership in a single massive community (32920 or 94%). The rest of the vertices are dispersed into hundreds of other tiny communities.

I have tried different settings with the nb.trials parameter (i.e. 50, 100, and now running 500). However, this doesn't seem to change the result much.

I am feeling rather exasperated because the run-time on the algorithm is quite high, so I have to wait each time for the results (with no luck yet!!).

Many thanks.

like image 347
timothyjgraham Avatar asked Dec 04 '13 01:12

timothyjgraham


People also ask

What is infomap community detection?

The community detection problem in Infomap is to find vertices. sets, named communities (or modules), which contain high intra- module information flow but low inter-module flow.

What is community detection in machine learning?

Community Detection is one of the fundamental problems in network analysis, where the goal is to find groups of nodes that are, in some sense, more similar to each other than to the other nodes. Source: Randomized Spectral Clustering in Large-Scale Stochastic Block Models.

How is community detection different from clustering?

Clustering mostly focuses on a single modality, e.g., using node attributes to group network objects, whereas community detection focuses on network structure as a function of connectivity involving social interaction.

What is community network detection?

Community detection, also called graph partition, helps us to reveal the hidden relations among the nodes in the network. Many algorithms have been developed to detect communities (Clauset et al., 2004; Girvan and Newman, 2002; Lancichinetti and Fortunato, 2009).


1 Answers

Thanks for all the excellent comments. In the end, I got it working by downloading and running the source code for Infomap, which is available at: http://www.mapequation.org/code.html.

Due to licence issues, the latest code has not been integrated with igraph.

This solved the problem of too many nodes being 'lumped' into a single massive community.

Specifically, I used the following options from the command line: -N 10 --directed --two-level --map

Kudos to Martin Rosvall from the Infomap project for providing me with detailed help to resolve this problem.

For the interested reader, here is more information about this issue:

When a network collapses into one major cluster, it is most often because of a very dense and random link structure ... In the code for directed networks implemented in iGraph, teleportation is encoded. If many nodes have no outlinks, the effect of teleportation can be significant because it randomly connect nodes. We have made new code available here: http://www.mapequation.org/code.html that can cluster network without encoding the random teleportation necessary to make the dynamics ergodic. For details, see this paper: http://pre.aps.org/abstract/PRE/v85/i5/e056107

like image 146
timothyjgraham Avatar answered Nov 05 '22 15:11

timothyjgraham