I have an enormous graph dataset - let's say it is like this, but on a much bigger level:
1 -> 2 3 -> 4
1,2,3,4 are nodes and the arrows are directed edges. Let's say that they are all in a single graph object:
import networkx as nx G = nx.DiGraph() G.add_nodes_from([1,2,3,4]) G.add_edge(1,2) G.add_edge(3,4)
Given an object like this, which has two mini graphs within a graph, how can we pull out each mini graph? I feel like there must be some word for this? My end result would look like:
for mini_graph in G: print mini_graph.nodes() ... [1,2] [3,4]
A subgraph view of the graph. The graph structure cannot be changed but node/edge attributes can and are shared with the original graph. Notes. The graph, edge and node attributes are shared with the original graph.
nbunch. An nbunch is a single node, container of nodes or None (representing all nodes). It can be a list, set, graph, etc..
MultiGraph (data=None, **attr)[source] An undirected graph class that can store multiedges. Multiedges are multiple edges between two nodes. Each edge can hold optional data or attributes. A MultiGraph holds undirected edges.
A multidigraph is a directed graph which is permitted to have multiple arcs, i.e., arcs with the same source and target nodes. A multidigraph G is an ordered pair G := (V, A) with. V a set of vertices or nodes, A a multiset of ordered pairs of vertices called directed edges, arcs or arrows.
If the parts of the graph are truly disjoint (as per your small example), then consider extracting the subgraphs with connected_component_subgraphs()
.
This only works on an undirected graph, so if you are using a directed graph then you'll need to convert to undirected first.
import networkx as nx G = nx.DiGraph() G.add_nodes_from([1,2,3,4]) G.add_edge(1,2) G.add_edge(3,4) # make an undirected copy of the digraph UG = G.to_undirected() # extract subgraphs sub_graphs = nx.connected_component_subgraphs(UG) for i, sg in enumerate(sub_graphs): print "subgraph {} has {} nodes".format(i, sg.number_of_nodes()) print "\tNodes:", sg.nodes(data=True) print "\tEdges:", sg.edges()
which yields:
subgraph 1 has 2 nodes Nodes: [(1, {}), (2, {})] Edges: [(1, 2)] subgraph 1 has 2 nodes Nodes: [(3, {}), (4, {})] Edges: [(3, 4)]
and you could use the subgraph node labels to operate on your data in the initial graph,
sg.nodes()[0] in G >>> True
Reading the answer linked by EdChum, it appears that weakly_connected_component_subgraphs()
operates on a directed graph but treats it as undirected, so saving the copy might be crucial. However, the docs on this and the related function weakly_connected_components()
are a bit thin at present.
As of 2018 the above answer is deprecated (link to docs). You are advised to use:
(G.subgraph(c) for c in connected_components(G))
or
(G.subgraph(c).copy() for c in connected_components(G))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With