I have a directed graph in NetworkX. The edges are weighted from 0 to 1, representing probabilities that they occurred. The network connectivity is quite high, so I want to prune the edges such for every node, only the highest probability node remains.
I'm not sure how to iterate over every node and keep only the highest weighted in_edges
in the graph. Is there a networkx function that allows us to do this?
Here is an example of what I'd like to be able to do.
Nodes:
A, B, C, D
Edges:
A->B, weight=1.0
A->C, weight=1.0
A->D, weight=0.5
B->C, weight=0.9
B->D, weight=0.8
C->D, weight=0.9
Final Result Wanted:
A->B, weight=1.0
A->C, weight=1.0
C->D, weight=0.9
If there are two edges into a node, and they are both of the highest weight, I'd like to keep them both.
Here are some ideas:
import networkx as nx
G = nx.DiGraph()
G.add_edge('A','B', weight=1.0)
G.add_edge('A','C', weight=1.0)
G.add_edge('A','D', weight=0.5)
G.add_edge('B','C', weight=0.9)
G.add_edge('B','D', weight=0.8)
G.add_edge('C','D', weight=0.9)
print "all edges"
print G.edges(data=True)
print "edges >= 0.9"
print [(u,v,d) for (u,v,d) in G.edges(data=True) if d['weight'] >= 0.9]
print "sorted by weight"
print sorted(G.edges(data=True), key=lambda (source,target,data): data['weight'])
The solution I had was inspired by Aric. I used the following code:
for node in G.nodes():
edges = G.in_edges(node, data=True)
if len(edges) > 0: #some nodes have zero edges going into it
min_weight = min([edge[2]['weight'] for edge in edges])
for edge in edges:
if edge[2]['weight'] > min_weight:
G.remove_edge(edge[0], edge[1])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With