
KeyError caused by a wrong attribute assignment within the network?

I have a dataset that is not well formatted. It has the following columns:

Source  Target  Label_Source  Label_Target
E       N       0.0           0.0
A       B       1.0           1.0
A       C       1.0           0.0
A       D       1.0           0.0
A       N       1.0           0.0
S       G       0.0           0.0
S       L       0.0           1.0
S       C       0.0           0.0

Label_Source and Label_Target are node attributes: Label_Source belongs to the source node and Label_Target belongs to the target node. While trying to replicate the project at https://www.fatalerrors.org/a/python-networkx-learning-notes.html, I ran into several errors, including a KeyError on Label_Source. As explained in this answer: KeyError after re-running the (same) code, the problem seems to come from a wrong assignment of edge/node attributes, since the code is reading Label_Source as an edge attribute.

I would like to replicate the project, so any data format that makes that possible is acceptable. However, I would really appreciate it if someone could explain (not only show) how to fix the issue, as it is not clear to me what is causing it. What I have done so far is shown below:

import networkx as nx
from matplotlib import pyplot as plt
import pandas as pd

G = nx.from_pandas_edgelist(filtered, 'Source', 'Target',  edge_attr=True)
df_pos = nx.spring_layout(G,k = 0.3) 

nx.draw_networkx(G, df_pos)
plt.show()

# This should really read a single "Label" attribute that covers both Source
# and Target nodes, i.e. every distinct node in the network with its label.
node_color = [
    '#1f78b4' if G.nodes[v]["Label_Source"] == 0
    else '#33a02c' for v in G]

# Iterate through all edges
for v, w in G.edges:
    if G.nodes[v]["Label_Source"] == G.nodes[w]["Label_Source"]: # this should refer to all the Labels 
        G.edges[v, w]["internal"] = True
    else:
        G.edges[v, w]["internal"] = False

It would be great if you could help me understand how to fix the issue and replicate the code. I suspect part of the error also comes from iterating over strings rather than indices.

LdM asked Apr 09 '21 15:04





1 Answer

After the creation of your graph:

G = nx.from_pandas_edgelist(filtered, 'Source', 'Target',  edge_attr=True)
df_pos = nx.spring_layout(G,k = 0.3) 

You have the following attributes:

# For edges:
print(G.edges(data=True))
[('E', 'N', {'Label_Source': 0.0, 'Label_Target': 0.0}),
 ('N', 'A', {'Label_Source': 1.0, 'Label_Target': 0.0}),  # Problem here
 ('A', 'B', {'Label_Source': 1.0, 'Label_Target': 1.0}),
 ('A', 'C', {'Label_Source': 1.0, 'Label_Target': 0.0}),
 ('A', 'D', {'Label_Source': 1.0, 'Label_Target': 0.0}),
 ('C', 'S', {'Label_Source': 0.0, 'Label_Target': 0.0}),
 ('S', 'G', {'Label_Source': 0.0, 'Label_Target': 0.0}),
 ('S', 'L', {'Label_Source': 0.0, 'Label_Target': 1.0})]

# For nodes:
print(G.nodes(data=True))
[('E', {}), ('N', {}), ('A', {}), ('B', {}),
 ('C', {}), ('D', {}), ('S', {}), ('G', {}), ('L', {})]
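
The empty node dictionaries are what raise the KeyError in the question: with edge_attr=True, the extra dataframe columns are attached to edges, not to nodes. A minimal sketch that reproduces this, assuming the sample rows from the question are loaded into a dataframe named filtered (the helper names edge_has_label and raised are only illustrative):

```python
import networkx as nx
import pandas as pd

# Sample rows copied from the question; `filtered` is the asker's dataframe name.
filtered = pd.DataFrame({
    "Source": ["E", "A", "A", "A", "A", "S", "S", "S"],
    "Target": ["N", "B", "C", "D", "N", "G", "L", "C"],
    "Label_Source": [0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    "Label_Target": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
})

G = nx.from_pandas_edgelist(filtered, "Source", "Target", edge_attr=True)

# The label columns were stored on the edge...
edge_has_label = "Label_Source" in G.edges["A", "B"]

# ...while the node's attribute dictionary stayed empty, so the lookup fails.
try:
    G.nodes["A"]["Label_Source"]
    raised = False
except KeyError:
    raised = True

print(edge_has_label, raised)
```

So the string node names are not the issue; the lookup fails only because no node carries a Label_Source key.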

As you can see, the nodes have no attributes. You have to copy the Label_xxx values from the edge attributes to the corresponding nodes:

# Don't use this; see the update below
for source, target, attribs in G.edges(data=True):
    G.nodes[source]['Label'] = int(attribs['Label_Source'])
    G.nodes[target]['Label'] = int(attribs['Label_Target'])

print(G.nodes(data=True))
[('E', {'Label': 0}), ('N', {'Label': 1}), ('A', {'Label': 1}),
 ('B', {'Label': 1}), ('C', {'Label': 0}), ('D', {'Label': 0}),
 ('S', {'Label': 0}), ('G', {'Label': 0}), ('L', {'Label': 1})]

Now you can set a color for each node of your graph:

node_color = ['#1f78b4' if v == 0 else '#33a02c'
              for v in nx.get_node_attributes(G, 'Label').values()]

print(node_color)
['#1f78b4', '#33a02c', '#33a02c', '#33a02c',
 '#1f78b4', '#1f78b4', '#1f78b4', '#1f78b4', '#33a02c']

Final step:

nx.draw_networkx(G, df_pos, with_labels=True, node_color=node_color)
plt.show()


Update

I think there is some problem with the code for assigning the color to nodes. Some nodes have a wrong color (e.g., they should be green and they are blue).

The problem is the edge ('A', 'N') -> (1, 0), which is stored as ('N', 'A') -> (1, 0). Because your graph is undirected, NetworkX does not distinguish ('A', 'N') from ('N', 'A'), so when the loop reads the edge as ('N', 'A') it assigns Label_Source to N and Label_Target to A. You can solve this by creating your graph with the option create_using=nx.DiGraph, if a directed graph makes sense for your problem.
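
A minimal sketch of the directed-graph option, assuming the sample rows from the question are loaded into a dataframe named filtered (the name D for the directed graph is only illustrative). A directed graph keeps each edge in its original Source -> Target orientation, so the edge attributes line up with the right endpoints:

```python
import networkx as nx
import pandas as pd

# Sample rows copied from the question.
filtered = pd.DataFrame({
    "Source": ["E", "A", "A", "A", "A", "S", "S", "S"],
    "Target": ["N", "B", "C", "D", "N", "G", "L", "C"],
    "Label_Source": [0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    "Label_Target": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
})

# A DiGraph preserves edge direction, so ('A', 'N') is never flipped to ('N', 'A').
D = nx.from_pandas_edgelist(
    filtered, "Source", "Target", edge_attr=True, create_using=nx.DiGraph
)

# Copy the labels from each edge to its source and target nodes.
for source, target, attribs in D.edges(data=True):
    D.nodes[source]["Label"] = int(attribs["Label_Source"])
    D.nodes[target]["Label"] = int(attribs["Label_Target"])

print(D.nodes["N"]["Label"], D.nodes["A"]["Label"])  # prints: 0 1
```

Note that switching to a directed graph also changes how layouts, drawing (arrows), and many algorithms treat the edges, so this only fits if direction is meaningful in your data.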

Another solution is to create the Label attribute from your dataframe rather than from the edge attributes, as my previous answer suggests:

for _, sr in filtered.iterrows():  # filtered is the dataframe used to build G
    G.nodes[sr['Source']]['Label'] = int(sr['Label_Source'])
    G.nodes[sr['Target']]['Label'] = int(sr['Label_Target'])

print(G.nodes(data=True))
[('E', {'Label': 0}), ('N', {'Label': 0}), ('A', {'Label': 1}),
 ('B', {'Label': 1}), ('C', {'Label': 0}), ('D', {'Label': 0}),
 ('S', {'Label': 0}), ('G', {'Label': 0}), ('L', {'Label': 1})]

Now you have the right Label attribute for each node.
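
The question also mentions collecting all distinct nodes with their labels and then flagging edges as internal. A rough sketch of both steps, assuming the sample rows from the question are loaded into a dataframe named filtered (the names sources, targets, pairs and labels are only illustrative). It uses nx.set_node_attributes instead of a row loop and raises an error if a node appears with two different labels:

```python
import networkx as nx
import pandas as pd

# Sample rows copied from the question.
filtered = pd.DataFrame({
    "Source": ["E", "A", "A", "A", "A", "S", "S", "S"],
    "Target": ["N", "B", "C", "D", "N", "G", "L", "C"],
    "Label_Source": [0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    "Label_Target": [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
})

# Build the graph without edge attributes; labels will live on the nodes.
G = nx.from_pandas_edgelist(filtered, "Source", "Target")

# Stack (node, label) pairs from both column pairs and drop exact repeats.
sources = filtered[["Source", "Label_Source"]].rename(
    columns={"Source": "node", "Label_Source": "Label"})
targets = filtered[["Target", "Label_Target"]].rename(
    columns={"Target": "node", "Label_Target": "Label"})
pairs = pd.concat([sources, targets]).drop_duplicates()

# A node listed with two different labels would still appear twice here.
if pairs["node"].duplicated().any():
    raise ValueError("some nodes have conflicting labels")

labels = pairs.set_index("node")["Label"].astype(int).to_dict()
nx.set_node_attributes(G, labels, name="Label")

# The edge loop from the question, now reading the node-level Label.
for v, w in G.edges:
    G.edges[v, w]["internal"] = bool(G.nodes[v]["Label"] == G.nodes[w]["Label"])

print(G.nodes["N"]["Label"], G.edges["A", "B"]["internal"], G.edges["A", "N"]["internal"])
```

Because the labels are read per node from the dataframe, the result does not depend on the order in which an undirected edge stores its endpoints.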

Corralien answered Sep 24 '22 14:09
