Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

NetworkX - generating a random connected bipartite graph

I'm using NetworkX to generate a bipartite graph using either nx.bipartite.random_graph or nx.bipartite.gnmk_random_graph, as follows:

B = bipartite.gnmk_random_graph(5,6,10)
bottom_nodes, top_nodes = bipartite.sets(B)

However, I get an error:

networkx.exception.AmbiguousSolution: Disconnected graph: Ambiguous solution for bipartite sets.

It is just a single line, so I'm not sure how I could be doing this wrong and why their package would be returning (what I assume is) an invalid bipartite graph.

Thanks.

EDIT: I just realised that I need to specify a minimum number of edges/probability for the third argument.

E.g. bipartite.random_graph(5,6,0.6) and having p>0.5 gets rid of the error. Similarly, bipartite.gnmk_random_graph(5,6,11) where k>n+m. I didn't realise this was the case, as I assumed if the number of edges was lower than required to connect every vertex there would just be some floating vertices.

Thanks for your help!

like image 615
monadoboi Avatar asked Oct 16 '22 06:10

monadoboi


1 Answers

SHORT ANSWER

You want to do

B = bipartite.gnmk_random_graph(5,6,10)
top = [node for node in B.nodes() if B.node[node]['bipartite']==0]
bottom = [node for node in B.nodes() if B.node[node]['bipartite']==1]

Explanation

So when you generate this bipartite graph, it is likely to be disconnected. Let's say it has 2 separate components X and Y. Both of these components are bipartite.

bipartite.sets(B) is supposed to determine which sets are the two partitions of B. But it's going to run into trouble.

Why?

X can be broken into two partitions X_1 and X_2 and Y can be broken into Y_1 and Y_2. What about B? Let top = X_1 + Y_1 and bottom = X_2 + Y_2. This is a perfectly legitimate partition. But top = X_1+Y_2 and bottom = X_2+Y_1 is also a perfectly legitimate partition. Which one should it return? It's ambiguous. The algorithm explicitly refuses to make a choice. Instead it gives you an error.

What to do? You could throw out B if it's disconnected and try again. But you're using B for something right? Is it reasonable to restrict your attention only to the connected graphs? Maybe, maybe not. That's something you need to figure out. But it is not reasonable to restrict your attention only to the connected graphs if the reason is that disconnected graphs are inconvenient. You seem to hit this error more often than not, so a large fraction of the graphs are disconnected --- you're throwing out a large fraction of the cases. It seems that this is likely to bias the final outcome of whatever you're doing. (similarly, if you take steps to connect your network, you're no longer getting random graphs from the original distribution because well, you've ensured they aren't disconnected, and even worse now - you may not be uniformly sampling from the connected graphs).

So what's a better option? After looking at the source code, I found that this method isn't documented as well as it should be. It turns out that that for

B = bipartite.gnmk_random_graph(5,6,10)

nodes 0 up to 4 (the first five) are in the top, and nodes 5 up to 10 (the next 6) are in the bottom.

Alternately you can get it directly from the data that is encoded in the graph B (not mentioned in the documentation). Try

B = bipartite.gnmk_random_graph(5,6,10)
B.nodes(data=True)
> NodeDataView({0: {'bipartite': 0}, 1: {'bipartite': 0}, 2: {'bipartite': 0}, 3: {'bipartite': 0}, 4: {'bipartite': 0}, 5: {'bipartite': 1}, 6: {'bipartite': 1}, 7: {'bipartite': 1}, 8: {'bipartite': 1}, 9: {'bipartite': 1}, 10: {'bipartite': 1}})

So it's actually storing which node is in which part. Let's use that (and a list comprehension)

top = [node for node in B.nodes() if B.node[node]['bipartite']==0]
bottom = [node for node in B.nodes() if B.node[node]['bipartite']==1]
like image 137
Joel Avatar answered Nov 03 '22 00:11

Joel