Using iGraph in python for community detection and writing community number for each node to CSV

Tags:

python

igraph

hierarchical-clustering

I have a network that I would like to analyze using the edge_betweenness community detection algorithm in iGraph. I'm familiar with NetworkX, but am trying to learn iGraph because of its additional community detection methods over NetworkX.

My ultimate goal is to run edge_betweenness community detection, find the optimal number of communities, and write a CSV with community membership for each node in the graph.

Below is my code as it currently stands. Any help figuring out community membership is greatly appreciated.

input data ('network.txt'):

1 2
2 3
2 7
3 1
4 2
4 6
5 4
5 6
7 4
7 8
8 9
9 7
10 7
10 8
10 9

iGraph code

import igraph

# load data into a graph
g = igraph.Graph.Read_Ncol('network.txt')

# plot graph
igraph.plot(g)

[Image: output of igraph.plot(g), https://i.stack.imgur.com/YWlgy.png]

# identify communities
communities = igraph.community_edge_betweenness()

# not really sure what to do next
num_communities = communities.optimal_count
communities.as_clustering(num_communities)

What do I need to do to find the optimal number of communities and write the community that each node in the graph belongs to into a list?

214

asked Aug 11 '14 23:08

CurtLH

1 Answer

You are on the right track. The optimal number of communities (where "optimal" means "the number of communities that maximizes the modularity score") can be retrieved with communities.optimal_count, and the community structure can be converted into a flat, disjoint clustering using communities.as_clustering(num_communities). The number of communities can be omitted when you want communities.optimal_count anyway, because that is what as_clustering() uses by default. Either way, you get a VertexClustering object with a membership property that gives you the cluster index for each vertex in the graph.
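To make "maximizes the modularity score" concrete, here is a minimal pure-Python sketch of the standard modularity formula for an undirected, unweighted graph, applied to the edge list from the question. The `modularity` helper and the three-community partition are illustrative assumptions (the partition is hand-picked, not the output of the edge betweenness algorithm); igraph computes this score for you.

```python
from collections import defaultdict

def modularity(edges, membership):
    """Modularity of an undirected, unweighted graph for a given partition.

    edges: list of (u, v) pairs; membership: dict mapping node -> community id.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # sum of node degrees in community c
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Edge list from network.txt in the question, treated as undirected
edges = [(1, 2), (2, 3), (2, 7), (3, 1), (4, 2), (4, 6), (5, 4), (5, 6),
         (7, 4), (7, 8), (8, 9), (9, 7), (10, 7), (10, 8), (10, 9)]

# Hand-picked partition, for illustration only
hand_picked = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 2, 8: 2, 9: 2, 10: 2}
one_community = {node: 0 for node in hand_picked}

q_hand_picked = modularity(edges, hand_picked)
q_one_community = modularity(edges, one_community)
print(q_hand_picked, q_one_community)
```

Putting every node into a single community always scores 0, so any partition with a positive score is doing better than "no structure at all"; the optimal count is the cut of the dendrogram where this score peaks.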

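For the sake of clarity, I'm renaming your communities variable to dendrogram because the edge betweenness community detection algorithm actually produces a dendrogram. Note that community_edge_betweenness() is a method of the graph object rather than a function of the igraph module; graph below is the Graph you loaded (g in your code):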

# calculate dendrogram
dendrogram = graph.community_edge_betweenness()
# convert it into a flat clustering
clusters = dendrogram.as_clustering()
# get the membership vector
membership = clusters.membership
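The membership vector is a flat list with one community index per vertex, in vertex order. If you would rather see which nodes ended up together, it can be inverted into one list of names per community. The sketch below uses only the standard library; `group_by_community` is a hypothetical helper, and the `names` and `membership` values are made-up stand-ins for graph.vs["name"] and clusters.membership.

```python
from collections import defaultdict

def group_by_community(names, membership):
    """Map each community index to the list of node names assigned to it."""
    groups = defaultdict(list)
    for name, cluster_id in zip(names, membership):
        groups[cluster_id].append(name)
    return dict(groups)

# Made-up example values (not the result of running the algorithm)
names = ["1", "2", "3", "4"]
membership = [0, 0, 1, 1]

groups = group_by_community(names, membership)
for cluster_id, members in sorted(groups.items()):
    print(cluster_id, members)
```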

Now we can write the membership vector along with the node names into a CSV file:

import csv

# "w" with newline="" is how Python 3 expects a file for csv.writer to be opened
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # one row per vertex: node name, community index
    for name, cluster_id in zip(graph.vs["name"], membership):
        writer.writerow([name, cluster_id])

This is the Python 3 form. If you are using Python 2, use izip from itertools instead of zip and open the file with "wb" instead.

112

answered Oct 29 '22 04:10

Tamás
