I have a network that I would like to analyze using the edge_betweenness community detection algorithm in iGraph. I'm familiar with NetworkX, but am trying to learn iGraph because of its additional community detection methods over NetworkX.
My ultimate goal is to run edge_betweenness community detection, find the optimal number of communities, and write a CSV with community membership for each node in the graph.
Below is my code as it currently stands. Any help figuring out community membership is greatly appreciated.
input data ('network.txt'):
1 2
2 3
2 7
3 1
4 2
4 6
5 4
5 6
7 4
7 8
8 9
9 7
10 7
10 8
10 9
iGraph code
import igraph
# load data into a graph
g = igraph.Graph.Read_Ncol('network.txt')
# plot graph
igraph.plot(g)
# identify communities
communities = g.community_edge_betweenness()
# not really sure what to do next
num_communities = communities.optimal_count
communities.as_clustering(num_communities)
What do I need to do to find the optimal number of communities and write the community that each node in the graph belongs to into a list?
Community detection algorithms are used to find such groups of densely connected components in various networks. M. Girvan and M. E. J. Newman have proposed one of the most widely adopted community detection algorithms, the Girvan-Newman algorithm.
Community detection techniques are useful for social media algorithms to discover people with common interests and keep them tightly connected. Community detection can be used in machine learning to detect groups with similar properties and extract groups for various reasons.
Community detection, also called graph clustering, is one of the most fundamental and vital complex network analysis techniques used to illustrate the structure of the relationship of network nodes.
The Louvain method for community detection is an algorithm for detecting communities in networks. It greedily optimizes the modularity of the partition, where modularity quantifies the quality of an assignment of nodes to communities.
You are on the right track; the optimal number of communities (where "optimal" is defined as "the number of communities that maximizes the modularity score") can be retrieved by communities.optimal_count and the community structure can be converted into a flat disjoint clustering using communities.as_clustering(num_communities). Actually, the number of communities can be omitted if it happens to be equal to communities.optimal_count. Once you've done that, you get a VertexClustering object with a membership property which gives you the cluster index for each vertex in the graph.
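To make "maximizes the modularity score" concrete, here is a minimal sketch of how modularity can be computed for a given partition, using only the standard library. It is not iGraph's implementation: the function name `modularity` is made up for illustration, the edge list from network.txt is treated as undirected and unweighted (an assumption), and the three-group partition `GUESS` is a hand-picked example rather than the output of the algorithm.

```python
from collections import defaultdict

# Edge list copied from network.txt in the question, treated as undirected
# (an assumption made for this sketch).
EDGES = [
    (1, 2), (2, 3), (2, 7), (3, 1), (4, 2), (4, 6), (5, 4), (5, 6),
    (7, 4), (7, 8), (8, 9), (9, 7), (10, 7), (10, 8), (10, 9),
]


def modularity(edges, membership):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    membership maps each node to a community label.
    Q = sum over communities c of: L_c / m - (d_c / (2 * m)) ** 2
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    intra_edges = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra_edges[cu] += 1
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# A hand-picked partition, for illustration only.
GUESS = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 2, 8: 2, 9: 2, 10: 2}

if __name__ == "__main__":
    everything_together = {node: 0 for node in GUESS}
    print("one community:    ", modularity(EDGES, everything_together))
    print("three communities:", modularity(EDGES, GUESS))
```

Cutting the dendrogram at different levels gives different partitions; the "optimal" count is the level whose partition scores highest on a measure like this one.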
For the sake of clarity, I'm renaming your communities variable to dendrogram because the edge betweenness community detection algorithm actually produces a dendrogram:
# calculate dendrogram ("graph" is the Graph object, called g in the question)
dendrogram = graph.community_edge_betweenness()
# convert it into a flat clustering (cut at the optimal count by default)
clusters = dendrogram.as_clustering()
# get the membership vector
membership = clusters.membership
Now we can write the membership vector along with the node names into a CSV file:
import csv
# newline="" lets the csv module control line endings (Python 3)
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for name, community in zip(graph.vs["name"], membership):
        writer.writerow([name, community])
If you are using Python 2, use izip from itertools instead of zip and open the file in "wb" mode.
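The CSV step can also be wrapped in a small reusable helper. The sketch below is one possible shape, not part of iGraph: the function name `write_membership_rows`, the header row, and the length check are all choices made here for illustration. It takes plain lists, so it can be tried without a graph; with iGraph you would pass `graph.vs["name"]` and `clusters.membership`.

```python
import csv
import io


def write_membership_rows(names, membership, handle):
    """Write one "node,community" row per vertex to an open text handle.

    names and membership are parallel sequences, for example
    graph.vs["name"] and clusters.membership from the answer above.
    """
    names = list(names)
    membership = list(membership)
    if len(names) != len(membership):
        raise ValueError("names and membership must have the same length")
    writer = csv.writer(handle)
    writer.writerow(["node", "community"])  # header row (added in this sketch)
    for name, community in zip(names, membership):
        writer.writerow([name, community])


if __name__ == "__main__":
    # Demo with made-up values, written to memory instead of disk.
    buffer = io.StringIO()
    write_membership_rows(["1", "2", "3"], [0, 0, 1], buffer)
    print(buffer.getvalue())
    # To write a real file instead:
    # with open("output.csv", "w", newline="") as f:
    #     write_membership_rows(graph.vs["name"], clusters.membership, f)
```

Accepting an open handle rather than a path keeps the helper easy to test and lets the caller decide where the output goes.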