Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python igraph vertex indices

I am using the igraph library in python. I would like to know if there is a way of using strings as vertices indices. I know about the 'name' property and that I can write

g = igraph.Graph(directed=True)
g.add_vertex('hello')
g.add_vertex('world')
g.add_edge('hello','world')

and everything works fine. Except that if I add the same vertex twice, e.g.:

g = igraph.Graph(directed=True)
g.add_vertex('world')
g.add_vertex('hello')
g.add_vertex('hello')

two distinct vertices are created and if I now add an edge:

g.add_edge('hello','world')

the edge is added to the first vertex matching 'hello' as a name. This also suggests that such form of indexing has O(n) complexity instead of O(1) (i.e. the whole list of vertices is scanned until a vertex v such that v['name'] == 'hello' is found).

So I was thinking about keeping a mapping between vertices names and indexes, for example:

mapping = {}
g = igraph.Graph(directed=True)
g.add_vertex('hello')
mapping['hello'] = len(g.vs)-1
g.add_vertex('world')
mapping['world'] = len(g.vs)-1
g.add_edge(mapping['hello'],mapping['world'])

I assume this should work since I never delete vertices so I guess the indices should remain constant. It also has average speed O(1) for lookup which should be better than the previous solution. However I was wondering:

  • am I always guaranteed that g.vs[i].index == i? (i.e. can I always use the position of a vertex in the vs array to refer to that vertex in functions like add_edge()?)
  • am I always guaranteed that when I add a new vertex to the graph its index is going to be len(g.vs)-1?

EDIT: Same questions about edges: am I guaranteed that I will find the last added edge in g.es[len(g.es)-1]?

like image 538
Simone Bronzini Avatar asked Apr 18 '15 10:04

Simone Bronzini


1 Answers

This also suggests that such form of indexing has O(n) complexity instead of O(1)

This is not true; igraph maintains an internal mapping from names to vertex IDs (just like the one you proposed) for the name vertex attribute, which is automatically updated whenever you add or delete vertices. If there are multiple vertices with the same name, the mapping selects an arbitrary vertex and then returns that one (consistently) for name lookups. Behind the scenes this is all done with a standard Python dictionary. So, you can safely do all the following:

  • use vertex names instead of vertex IDs whenever an igraph function or method requires a vertex ID
  • use g.vs.find("foo") to find an arbitrary vertex with name equal to "foo".

Note that we cannot prevent the user from creating multiple vertices with the same name as this is valid in many graph formats that igraph can read (e.g., GraphML) and we don't want to prevent the user from reading them.

am I always guaranteed that g.vs[i].index == i?

Yes, this is guaranteed to be true. However, the following is not:

>>> v = g.vs[12]
>>> g.delete_vertices(...)
>>> g.vs[v.index] == v

The reason is that vertex and edge objects are pretty "dumb" as they only store a reference to the graph they originate from and the index they have in the graph - but they are not updated when the graph itself is updated. The rule of thumb is that any vertex or edge object that you hold a reference to becomes "invalid" as soon as you mutate the underlying graph.

am I always guaranteed that when I add a new vertex to the graph its index is going to be len(g.vs)-1?

Strictly speaking, this is not guaranteed by the API (as a formal "contract"), but this has been the case so far from the beginning of igraph's development and I see no reason to change that any time in the future. I routinely rely on it in my own code as well. The same applies to edges.

like image 189
Tamás Avatar answered Oct 20 '22 14:10

Tamás