I'm using NetworkX to generate a bipartite graph using either nx.bipartite.random_graph
or nx.bipartite.gnmk_random_graph
, as follows:
B = bipartite.gnmk_random_graph(5,6,10)
bottom_nodes, top_nodes = bipartite.sets(B)
However, I get an error:
networkx.exception.AmbiguousSolution: Disconnected graph: Ambiguous solution for bipartite sets.
It is just a single line, so I'm not sure how I could be doing this wrong and why their package would be returning (what I assume is) an invalid bipartite graph.
Thanks.
EDIT: I just realised that I need to specify a minimum number of edges/probability for the third argument.
E.g. bipartite.random_graph(5,6,0.6)
and having p>0.5
gets rid of the error. Similarly, bipartite.gnmk_random_graph(5,6,11)
where k>n+m
. I didn't realise this was the case, as I assumed if the number of edges was lower than required to connect every vertex there would just be some floating vertices.
Thanks for your help!
SHORT ANSWER
You want to do
B = bipartite.gnmk_random_graph(5,6,10)
top = [node for node in B.nodes() if B.node[node]['bipartite']==0]
bottom = [node for node in B.nodes() if B.node[node]['bipartite']==1]
Explanation
So when you generate this bipartite graph, it is likely to be disconnected. Let's say it has 2 separate components X
and Y
. Both of these components are bipartite.
bipartite.sets(B)
is supposed to determine which sets are the two partitions of B
. But it's going to run into trouble.
Why?
X
can be broken into two partitions X_1
and X_2
and Y
can be broken into Y_1
and Y_2
. What about B
? Let top = X_1 + Y_1
and bottom = X_2 + Y_2
. This is a perfectly legitimate partition. But top = X_1+Y_2
and bottom = X_2+Y_1
is also a perfectly legitimate partition. Which one should it return? It's ambiguous. The algorithm explicitly refuses to make a choice. Instead it gives you an error.
What to do? You could throw out B
if it's disconnected and try again. But you're using B
for something right? Is it reasonable to restrict your attention only to the connected graphs? Maybe, maybe not. That's something you need to figure out. But it is not reasonable to restrict your attention only to the connected graphs if the reason is that disconnected graphs are inconvenient. You seem to hit this error more often than not, so a large fraction of the graphs are disconnected --- you're throwing out a large fraction of the cases. It seems that this is likely to bias the final outcome of whatever you're doing. (similarly, if you take steps to connect your network, you're no longer getting random graphs from the original distribution because well, you've ensured they aren't disconnected, and even worse now - you may not be uniformly sampling from the connected graphs).
So what's a better option? After looking at the source code, I found that this method isn't documented as well as it should be. It turns out that that for
B = bipartite.gnmk_random_graph(5,6,10)
nodes 0
up to 4
(the first five) are in the top, and nodes 5
up to 10
(the next 6) are in the bottom.
Alternately you can get it directly from the data that is encoded in the graph B
(not mentioned in the documentation). Try
B = bipartite.gnmk_random_graph(5,6,10)
B.nodes(data=True)
> NodeDataView({0: {'bipartite': 0}, 1: {'bipartite': 0}, 2: {'bipartite': 0}, 3: {'bipartite': 0}, 4: {'bipartite': 0}, 5: {'bipartite': 1}, 6: {'bipartite': 1}, 7: {'bipartite': 1}, 8: {'bipartite': 1}, 9: {'bipartite': 1}, 10: {'bipartite': 1}})
So it's actually storing which node is in which part. Let's use that (and a list comprehension)
top = [node for node in B.nodes() if B.node[node]['bipartite']==0]
bottom = [node for node in B.nodes() if B.node[node]['bipartite']==1]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With