I have a dataset that is not well formatted. It has the following columns:
Source Target Label_Source Label_Target
E N 0.0 0.0
A B 1.0 1.0
A C 1.0 0.0
A D 1.0 0.0
A N 1.0 0.0
S G 0.0 0.0
S L 0.0 1.0
S C 0.0 0.0
Label_Source and Label_Target are node attributes: Label_Source is the source's attribute, while Label_Target is the target's attribute. While trying to replicate the following project: https://www.fatalerrors.org/a/python-networkx-learning-notes.html, I encountered some errors, including a KeyError due to Label_Source. As explained in this answer, "KeyError after re-running the (same) code", the problem seems to be caused by a wrong assignment of edge/node attributes, as the code reads Label_Source as an edge attribute.
I would like to replicate the project, so any format that makes this possible would be acceptable. However, I would really appreciate it if someone could explain (not only show) how to fix the issue, as it is not clear to me what is driving it. What I have done so far is shown below:
import networkx as nx
from matplotlib import pyplot as plt
import pandas as pd
G = nx.from_pandas_edgelist(filtered, 'Source', 'Target', edge_attr=True)
df_pos = nx.spring_layout(G,k = 0.3)
nx.draw_networkx(G, df_pos)
plt.show()
node_color = [
    '#1f78b4' if G.nodes[v]["Label_Source"] == 0  # actually this should just be Label, and it should also include Target, i.e. the whole list of nodes and their labels. A way to address this would be to select all distinct nodes in the network and their labels
    else '#33a02c' for v in G]
# Iterate through all edges
for v, w in G.edges:
    if G.nodes[v]["Label_Source"] == G.nodes[w]["Label_Source"]:  # this should refer to all the Labels
        G.edges[v, w]["internal"] = True
    else:
        G.edges[v, w]["internal"] = False
It would be great if you could help me understand how to fix the issue and replicate the code. I guess part of the error also comes from trying to iterate through strings rather than indices.
After the creation of your graph:
G = nx.from_pandas_edgelist(filtered, 'Source', 'Target', edge_attr=True)
df_pos = nx.spring_layout(G,k = 0.3)
You have the following attributes:
# For edges:
print(G.edges(data=True))
[('E', 'N', {'Label_Source': 0.0, 'Label_Target': 0.0}),
('N', 'A', {'Label_Source': 1.0, 'Label_Target': 0.0}), # Problem here
('A', 'B', {'Label_Source': 1.0, 'Label_Target': 1.0}),
('A', 'C', {'Label_Source': 1.0, 'Label_Target': 0.0}),
('A', 'D', {'Label_Source': 1.0, 'Label_Target': 0.0}),
('C', 'S', {'Label_Source': 0.0, 'Label_Target': 0.0}),
('S', 'G', {'Label_Source': 0.0, 'Label_Target': 0.0}),
('S', 'L', {'Label_Source': 0.0, 'Label_Target': 1.0})]
# For nodes:
print(G.nodes(data=True))
[('E', {}), ('N', {}), ('A', {}), ('B', {}),
('C', {}), ('D', {}), ('S', {}), ('G', {}), ('L', {})]
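As you can see, the nodes have no attributes. This is what drives the KeyError: edge_attr=True stores the Label_Source and Label_Target columns on the edges, so G.nodes[v]["Label_Source"] looks up a key that does not exist in the node's attribute dictionary. You have to copy the Label_xxx values from the edge attributes to the right nodes: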
# Don't use this, see the update below
for source, target, attribs in G.edges(data=True):
    G.nodes[source]['Label'] = int(attribs['Label_Source'])
    G.nodes[target]['Label'] = int(attribs['Label_Target'])
print(G.nodes(data=True))
[('E', {'Label': 0}), ('N', {'Label': 1}), ('A', {'Label': 1}),
('B', {'Label': 1}), ('C', {'Label': 0}), ('D', {'Label': 0}),
('S', {'Label': 0}), ('G', {'Label': 0}), ('L', {'Label': 1})]
Now you can set color for each node of your graph:
node_color = ['#1f78b4' if v == 0 else '#33a02c'
for v in nx.get_node_attributes(G, 'Label').values()]
print(node_color)
['#1f78b4', '#33a02c', '#33a02c', '#33a02c',
'#1f78b4', '#1f78b4', '#1f78b4', '#1f78b4', '#33a02c']
Final step:
nx.draw_networkx(G, df_pos, with_labels=True, node_color=node_color)
plt.show()
Update
I think there is some problem with the code for assigning the color to nodes. Some nodes have a wrong color (e.g., they should be green and they are blue).
The problem is the edge ('A', 'N') -> (1, 0), which is stored as ('N', 'A') -> (1, 0) because your graph is not directed, so it doesn't matter whether the edge is ('A', 'N') or ('N', 'A'). The loop then assigns Label_Source to 'N' instead of 'A'. You can solve this problem by creating your graph with the option create_using=nx.DiGraph, if that makes sense for your problem.
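A minimal sketch of the directed variant, assuming the sample rows from the question are loaded into a dataframe named filtered (the data is retyped here so the snippet runs on its own; DG is just an illustrative name). With a DiGraph, each edge keeps its Source -> Target orientation, so copying the labels from the edge attributes lands on the intended nodes:

```python
import networkx as nx
import pandas as pd

# Sample rows retyped from the question; `filtered` is the asker's dataframe name.
filtered = pd.DataFrame({
    "Source": ["E", "A", "A", "A", "A", "S", "S", "S"],
    "Target": ["N", "B", "C", "D", "N", "G", "L", "C"],
    "Label_Source": [0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    "Label_Target": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
})

# A directed graph stores ('A', 'N') as ('A', 'N'), never as ('N', 'A').
DG = nx.from_pandas_edgelist(
    filtered, "Source", "Target", edge_attr=True, create_using=nx.DiGraph
)

# Copy the labels from each edge to its endpoints; the orientation is now reliable.
for source, target, attribs in DG.edges(data=True):
    DG.nodes[source]["Label"] = int(attribs["Label_Source"])
    DG.nodes[target]["Label"] = int(attribs["Label_Target"])

print(DG.nodes(data=True))
```

Whether a directed graph is appropriate depends on what the edges mean in your data; the layout and drawing calls work the same way on DG.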
Another solution is to create the Label attribute not from the edge attributes but from your dataframe, as my previous answer suggests:
# `filtered` is the dataframe the graph was built from
for _, sr in filtered.iterrows():
    G.nodes[sr['Source']]['Label'] = int(sr['Label_Source'])
    G.nodes[sr['Target']]['Label'] = int(sr['Label_Target'])
print(G.nodes(data=True))
[('E', {'Label': 0}), ('N', {'Label': 0}), ('A', {'Label': 1}),
('B', {'Label': 1}), ('C', {'Label': 0}), ('D', {'Label': 0}),
('S', {'Label': 0}), ('G', {'Label': 0}), ('L', {'Label': 1})]
Now you have the right Label attribute for each node.
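Putting the pieces together, here is a sketch of the rest of the question's code (node colors and the "internal" edge flag) using the node-level Label attribute. It assumes the sample rows from the question, retyped below so the snippet runs on its own, and it assumes each node carries the same label in every row where it appears. The labels dictionary is an illustrative helper, not part of the original code:

```python
import networkx as nx
import pandas as pd

# Sample rows retyped from the question; `filtered` is the asker's dataframe name.
filtered = pd.DataFrame({
    "Source": ["E", "A", "A", "A", "A", "S", "S", "S"],
    "Target": ["N", "B", "C", "D", "N", "G", "L", "C"],
    "Label_Source": [0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    "Label_Target": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
})

# The label columns describe nodes, so keep them off the edges.
G = nx.from_pandas_edgelist(filtered, "Source", "Target")

# Collect one label per distinct node, reading both ends of every row.
labels = {}
for row in filtered.itertuples(index=False):
    labels[row.Source] = int(row.Label_Source)
    labels[row.Target] = int(row.Label_Target)
nx.set_node_attributes(G, labels, "Label")

# Color by the node's own label (blue for 0, green otherwise).
node_color = ["#1f78b4" if G.nodes[v]["Label"] == 0 else "#33a02c" for v in G]

# An edge is "internal" when both endpoints share the same label.
for v, w in G.edges:
    G.edges[v, w]["internal"] = G.nodes[v]["Label"] == G.nodes[w]["Label"]
```

Iterating over G yields the node names (strings), and G.nodes[v] accepts those names directly, so iterating through strings is not a problem in itself; the KeyError came only from the missing node attribute.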