I am having trouble with large graph visualization in Python and networkx. The graph I wish to visualize is directed, and has an edge and vertex set size of 215,000. From the documentation (which is linked at the top of the page) it is clear that networkx supports plotting with matplotlib and GraphViz. In matplotlib and networkx the drawing is done as follows:
import networkx as nx
import matplotlib.pyplot as plt
#Let g be a graph that I created
nx.draw(g)
I get a memory error after nx.draw(g). Afterwards you would normally call plt.show() or plt.[some_function] to save the file in an efficient format, and so forth.
Next I tried GraphViz. From the Wikipedia page, the dot format is used for directed graphs, and I created a dot file:
nx.write_dot(g, "g.dot")
This worked well and I had a 12-megabyte dot file in my current directory. Next I ran the dot program (part of graphviz) to create a postscript file:
dot -Tps g.dot -o g.ps
This slows down my computer, runs for a few minutes and then displays Killed in the terminal, so it never finishes. While reading the documentation for graphviz, it seemed that only undirected graphs were supported for large graph visualization.
Question: With these two unsuccessful attempts can anyone show me how to visualize my large graph using python and networkx with about 215,000 vertices and 215,000 edges? I suspect as with Graphviz I will have to output into an intermediate format (although this shouldn't be that hard it won't be as easy as a builtin function) and then use another tool to read the intermediate format and then output a visualization.
So, I am looking for the following:
If you need more information let me know!
For NetworkX, a graph with more than 100K nodes may be too large. I'll demonstrate that it can handle a network with 187K nodes in this post, but the centrality calculations were prolonged. Luckily, there are some other packages available to help us with even larger graphs.
Option 1: NetworkX
NetworkX has its own drawing module, which provides multiple options for plotting. Below we can find the visualization for some of the draw functions in the package. Using any of them is fairly easy: you call the function, pass it the graph variable G, and the package does the rest.
NetworkX is pure Python, well documented, and handles changes to the network gracefully. iGraph is more performant in terms of speed and RAM usage but less flexible for dynamic networks. iGraph is a C library with very smart indexing and storage approaches, so you can load pretty large graphs in RAM.
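import matplotlib.pyplot as plt
import networkx as nx

def save_graph(graph, file_name):
    # Initialize the figure
    fig = plt.figure(figsize=(20, 20), dpi=80)
    plt.axis('off')

    # Compute a force-directed layout, then draw nodes, edges and labels
    pos = nx.spring_layout(graph)
    nx.draw_networkx_nodes(graph, pos)
    nx.draw_networkx_edges(graph, pos)
    nx.draw_networkx_labels(graph, pos)

    # Set the axis limits from the layout's extent. Recent NetworkX versions
    # center spring_layout around the origin, so a lower limit of 0 would
    # crop the plot; use the minimum coordinate instead.
    cut = 1.00
    xmin = cut * min(xx for xx, yy in pos.values())
    xmax = cut * max(xx for xx, yy in pos.values())
    ymin = cut * min(yy for xx, yy in pos.values())
    ymax = cut * max(yy for xx, yy in pos.values())
    plt.xlim(xmin, xmax)
    plt.ylim(ymin, ymax)

    plt.savefig(file_name, bbox_inches="tight")
    plt.close(fig)

# Assuming that the graph g has nodes and edges entered
save_graph(g, "my_graph.pdf")
# It can also be saved in .svg, .png or .ps formats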
This answers your first issue. NetworkX does not have a facility for zooming into nodes; use Gephi for that. Gephi accepts an edge list in CSV format and produces a visualization in which you can zoom interactively.
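As a rough sketch of the CSV step, the helper below writes (source, target) pairs to an edge-list file. The function name write_gephi_edge_csv and the file name are made up for illustration, and the "Source"/"Target" header is an assumption about what Gephi's spreadsheet import expects, so check it against your Gephi version.

```python
import csv


def write_gephi_edge_csv(edges, path):
    """Write (source, target) pairs to a CSV edge list.

    Hypothetical helper: the "Source"/"Target" header is an assumption
    about the column names Gephi's spreadsheet importer looks for.
    Returns the number of edges written.
    """
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Source", "Target"])
        # Stream the edges one at a time so a large graph is never
        # copied into an intermediate list.
        for source, target in edges:
            writer.writerow([source, target])
            count += 1
    return count
```

With a NetworkX graph g, this would be called as write_gephi_edge_csv(g.edges(), "edges.csv"), since g.edges() yields (source, target) pairs. The resulting file can then be loaded through Gephi's import dialog, where the graph can be marked as directed.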