Using iGraph in Python for community detection and writing the community number for each node to CSV

I have a network that I would like to analyze using the edge_betweenness community detection algorithm in iGraph. I'm familiar with NetworkX, but I am trying to learn iGraph because of its additional community detection methods compared with NetworkX.

My ultimate goal is to run edge_betweenness community detection, find the optimal number of communities, and write a CSV with the community membership of each node in the graph.

Below is my code as it currently stands. Any help figuring out community membership is greatly appreciated.

input data ('network.txt'):

1 2
2 3
2 7
3 1
4 2
4 6
5 4
5 6
7 4
7 8
8 9
9 7
10 7
10 8
10 9

iGraph code

import igraph

# load data into a graph
g = igraph.Graph.Read_Ncol('network.txt')

# plot graph
igraph.plot(g)

# identify communities
communities = g.community_edge_betweenness()

# not really sure what to do next
num_communities = communities.optimal_count
communities.as_clustering(num_communities)

What do I need to do to find the optimal number of communities and write the community that each node in the graph belongs to into a list?

CurtLH asked Aug 11 '14 23:08



1 Answer

You are on the right track. The optimal number of communities (where "optimal" means "the number of communities that maximizes the modularity score") can be retrieved from communities.optimal_count, and the community structure can be converted into a flat, disjoint clustering with communities.as_clustering(num_communities). In fact, the number of communities can be omitted when it equals communities.optimal_count. Either way, you get a VertexClustering object whose membership property gives the cluster index of each vertex in the graph.
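To make the modularity criterion concrete, here is a minimal pure-Python sketch of the score for an undirected, unweighted graph. It is not igraph's implementation: the function name `modularity`, the `EDGES` list, and the hand-picked three-way split are illustrative assumptions, and the split is not necessarily what edge betweenness would return. The question's edge list is treated as undirected here (note that Read_Ncol builds a directed graph unless directed=False is passed).

```python
from collections import defaultdict

# Edge list from the question's network.txt, treated as undirected (assumption).
EDGES = [
    (1, 2), (2, 3), (2, 7), (3, 1), (4, 2), (4, 6), (5, 4), (5, 6),
    (7, 4), (7, 8), (8, 9), (9, 7), (10, 7), (10, 8), (10, 9),
]


def modularity(edges, membership):
    """Newman modularity of an undirected, unweighted graph.

    `membership` maps each node to a community label.
    """
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # summed degree of the community's nodes
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Q = sum over communities of (intra-edge fraction - expected fraction)
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two candidate partitions: everything in one community vs. a hand-picked split.
everyone_together = {node: 0 for edge in EDGES for node in edge}
three_groups = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 2, 8: 2, 9: 2, 10: 2}

q_single = modularity(EDGES, everyone_together)  # 0.0
q_split = modularity(EDGES, three_groups)        # about 0.44
print(q_single, q_split)
```

Comparing scores like these across the possible cuts of the dendrogram is, in essence, what choosing the optimal count means. In practice you would read the score from the clustering's modularity property rather than compute it by hand.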

For the sake of clarity, I'm renaming your communities variable to dendrogram, because the edge betweenness community detection algorithm actually produces a dendrogram (graph below is the Graph object that the question calls g):

# calculate dendrogram
dendrogram = graph.community_edge_betweenness()
# convert it into a flat clustering
clusters = dendrogram.as_clustering()
# get the membership vector
membership = clusters.membership

Now we can write the membership vector along with the node names into a CSV file:

import csv

# newline="" stops the csv module from emitting blank lines on Windows
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # one row per vertex: node name, community index
    for name, community in zip(graph.vs["name"], membership):
        writer.writerow([name, community])

This is Python 3 code. If you are using Python 2, use itertools.izip instead of zip and open the file in "wb" mode.
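If you would also like the reverse view (which nodes sit in each community), the same two sequences can be regrouped. The sketch below uses only the standard library; the helper names `write_membership` and `group_by_community` are hypothetical, and the `names` and `membership` values are made-up stand-ins for graph.vs["name"] and clusters.membership, not actual output of the algorithm.

```python
import csv
import io
from collections import defaultdict


def write_membership(names, membership, handle):
    """Write a header row, then one `node,community` row per vertex."""
    writer = csv.writer(handle)
    writer.writerow(["node", "community"])
    writer.writerows(zip(names, membership))


def group_by_community(names, membership):
    """Return {community index: [node names in that community]}."""
    groups = defaultdict(list)
    for name, community in zip(names, membership):
        groups[community].append(name)
    return dict(groups)


# Made-up stand-ins for graph.vs["name"] and clusters.membership.
names = ["1", "2", "3", "7", "4", "6", "5", "8", "9", "10"]
membership = [0, 0, 0, 1, 2, 2, 2, 1, 1, 1]

# An in-memory buffer keeps the example self-contained; for a real file,
# pass open("output.csv", "w", newline="") instead.
buffer = io.StringIO()
write_membership(names, membership, buffer)
groups = group_by_community(names, membership)

print(buffer.getvalue())
print(groups)
```

Passing a file handle rather than a path keeps the helper easy to test and leaves the choice of output (file, buffer, stdout) to the caller.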

Tamás answered Oct 29 '22 04:10