Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

igraph: why is add_edge function so slow ompared to add_edges?

Tags:

python

igraph

i am surprised that:

import igraph
import random, time
start_time = time.time()
G = igraph.Graph(directed = True)
G.add_vertices(10000)
for i in range(30000):
   G.add_edge(random.randint(0,9999), random.randint(0,9999))
print "done in " + str(int(time.time() - start_time)) + " seconds"

returns done in 63 seconds

while

import igraph
import random, time
start_time = time.time()
G = igraph.Graph(directed = True)
G.add_vertices(10000)
edges = []
for i in range(30000):
    edges += [(random.randint(0,9999), random.randint(0,9999))]
G.add_edges(edges)
print "done in " + str(int(time.time() - start_time)) + " seconds"

returns done in 0 seconds. Does someone from the project know why?

like image 645
Mermoz Avatar asked Dec 20 '12 14:12

Mermoz


1 Answers

The reason is that igraph uses an indexed edge list as its data structure in the C layer. The index makes it possible to query the neighbors of a specific vertex in constant time. This is good if your graph rarely changes, but it becomes a burden when the modification operations are far more frequent than the queries, since whenever you add or remove an edge, you have to update the index. So, every call to add_edge will make igraph reindex its internal data structures. The upside is that if you have to rebuild the index anyway, you can just as well add many edges using add_edges with approximately the same cost. So, in your first code example, you rebuild the index 30000 times, while in the second example, you rebuild the index only once.

By the way, what you are doing could be done even faster using Graph.Erdos_Renyi(n=10000, m=30000).

like image 195
Tamás Avatar answered Oct 16 '22 05:10

Tamás