I am trying to create large graph via graph-tool
library (near 10^6 - 10^7 vertices) and fill vertex property with vertex name or use names instead of vertex indexes. I have:
list of names:
['50', '56', '568']
set of edges, but instead of vertex indexes it consists of their names:
edge_list = {frozenset({'568', '56'}), frozenset({'56', '50'}), frozenset({'50', '568'})}
Since add_edge_list()
allows to create vertices if they are no such vertix in the graph. I'm trying to use it to fill an empty graph. It works ok, but when I was trying to get vertex by its name, I got an error that there are no vertex with such index.
Here is the code of my program:
g = grt.Graph(directed=False)
edge_list = {frozenset({'568', '56'}), frozenset({'56', '50'}), frozenset({'50', '568'})}
ids = ['50', '56', '568']
g.add_edge_list(edge_list, hashed=True, string_vals=True)
print(g.vertex('50'))
The error message of print(g.vertex('50'))
:
ValueError: Invalid vertex index: 50
I want to create graph:
edge_list
only;Is there any good way to do this?
EDIT: Current code:
g = grt.Graph(directed=False)
g.add_vertex(len(ids))
vprop = g.new_vertex_property("string", vals=ids)
g.vp.user_id = vprop
for vert1, vert2 in edges_list:
g.add_edge(g.vertex(ids_dict[vert1]), g.vertex(ids_dict[vert2]))
If you have a dense graph with 10^6 - 10^7 vertices (Is it some medical data or social graph? It can change everything), you shouldn't use networkx
because it is written on pure Python so it is ~10-100 times slower than graph-tool
or igraph
. In your case I recommend you to use graph-tool
. It is the fastest (~as igraph
) Python graph processing library.
graph-tool
behaviour differs from networkx
. When you create the networkx
node, its identifier is what you wrote in node constructor so you can get the node by its ID. In graph-tool every vertex ID is the integer from 1 to GRAPH_SIZE:
Each vertex in a graph has an unique index, which is always between 0 and N−1, where N is the number of vertices. This index can be obtained by using the vertex_index attribute of the graph (which is a property map, see Property maps), or by converting the vertex descriptor to an int.
Every additional information about graph, vertices or edges is stored in property maps. And when you are using .add_edge_list()
with hashed=True
, the new property map is returned as the result of .add_edge_list()
. So in your case you should handle your vertices like this:
# Create graph
g = grt.Graph(directed=False)
# Create edge list
# Why frozensets? You don't really need them. You can use ordinary sets or tuples
edge_list = {
frozenset({'568', '56'}),
frozenset({'56', '50'}),
frozenset({'50', '568'})
}
# Write returned PropertyMap to a variable!
vertex_ids = g.add_edge_list(edge_list, hashed=True, string_vals=True)
g.vertex(1)
Out [...]: <Vertex object with index '1' at 0x7f3b5edde4b0>
vertex_ids[1]
Out [...]: '56'
If you want to get a vertex according to the ID, you should construct mapping dict manually (well, I am not a graph-tool
guru, but I can't find simple solution):
very_important_mapping_dict = {vertex_ids[i]: i for i in range(g.num_vertices())}
So you can easily get a vertex index:
very_important_mapping_dict['568']
Out [...]: 0
vertex_ids[0]
Out [...]: '568'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With