I am getting the following error message when attempting to write a GML file after merging two graphs using compose():
NetworkXError: 'user_id' is not a valid key
The background is that I import two GML files using:
g = nx.read_gml(file_path + "test_graph_1.gml")
h = nx.read_gml(file_path + "test_graph_2.gml")
Each node (in both GML files) is structured thus:
node [
id 9
user_id "1663413990"
file "wingsscotland.dat"
label "brian_bilston"
image "/Users/ian/development/gtf/gtf/img/1663413990.jpg"
type "friends"
statuses 21085
friends 737
followers 53425
listed 550
ffr 72.4898
lfr 0.1029
shape "triangle-up"
]
After importing each file, I can check all the node attributes, see that nodes are unique within each graph.
I also see that NetworkX by default discards the 'id' field and uses the 'label' as the identifier of the node. It retains the user_id attribute (which happens to be a Twitter user_id and suits my purposes well).
Running
list(g.nodes(data=True))
I can see that the data for the node above is:
('brian_bilston',
{'ffr': 72.4898,
'file': 'wingsscotland.dat',
'followers': 53425,
'friends': 737,
'image': '/Users/ian/development/gtf/gtf/img/1663413990.jpg',
'lfr': 0.1029,
'listed': 550,
'shape': 'triangle-up',
'statuses': 21085,
'type': 'friends',
'user_id': '1663413990'})
There is (in this test case) one common node shared by graph g and graph h: the one shown above. All others are unique by user_id and label.
I then merge the two graphs using:
f = nx.compose(g,h)
This works ok.
I then go to write out a new GML from the graph, f, using:
nx.write_gml(f, file_path + "one_plus_two.gml")
At this point I get the error shown above:
NetworkXError: 'user_id' is not a valid key
I have checked the uniqueness of all user_id's (in case I had duplicated one):
uid = nx.get_node_attributes(f,'user_id')
print(uid)
Which outputs:
{'brian_bilston': '1663413990',
'ICMResearch': '100',
'justcswilliams': '200',
'MissBabington': '300',
'ProBirdRights': '400',
'FredSmith': '247775851',
'JasWatt': '160952087',
'Angela_Lewis': '2316946782',
'Fuzzpig54': '130136162',
'SonnyRussel': '828881340',
'JohnBird': '448476934',
'AngusMcAngus': '19785044'}
(formatted for readability).
So, all user_id's are unique, as far as I can tell.
So, if it is not a question of uniqueness of keys, what is the error telling me?
I've exhausted my thinking on this!
Any pointers, please, would be very much appreciated!
I posted this as an issue on the NetworkX GitHub repo, where it was answered by an admin.
See: https://github.com/networkx/networkx/issues/3100
I have posted his answer below:
Yes -- this is a known issue: see #2131
The GML spec doesn't allow underscores in attribute names. We allow reading .gml files that don't correspond to the official GML spec. But we write only items that follow the spec. You should convert your attribute names to not include the underscore.
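for n in G:
    # G.nodes[n] is the node's attribute dict; it replaces the older G.node[n],
    # which newer NetworkX releases no longer provide
    G.nodes[n]['userid'] = G.nodes[n]['user_id']
    del G.nodes[n]['user_id']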
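The loop above renames a single node attribute. If several attribute names contain underscores, or edges and the graph itself carry such names too, a more general version might look like the sketch below. The helper names `strip_underscores` and `sanitize_graph` are my own and not part of NetworkX. The code relies only on `G.graph`, `G.nodes(data=True)` and `G.edges(data=True)`, and it does not descend into nested dict attributes.

```python
def strip_underscores(attrs):
    """Rename keys of one attribute dict in place so they contain no underscores."""
    for key in list(attrs):  # iterate over a copy of the keys; the dict is mutated below
        if not isinstance(key, str) or "_" not in key:
            continue
        new_key = key.replace("_", "")
        if new_key in attrs:
            # Refuse to silently overwrite an existing attribute
            raise ValueError(f"renaming {key!r} would overwrite {new_key!r}")
        attrs[new_key] = attrs.pop(key)


def sanitize_graph(G):
    """Apply strip_underscores to the graph, node and edge attributes of G.

    G is assumed to behave like a NetworkX graph: G.graph is a dict, and
    G.nodes(data=True) / G.edges(data=True) yield the live attribute dicts.
    """
    strip_underscores(G.graph)
    for _, data in G.nodes(data=True):
        strip_underscores(data)
    for _, _, data in G.edges(data=True):
        strip_underscores(data)
```

With the graph from the question, this would be called as `sanitize_graph(f)` just before `nx.write_gml(f, file_path + "one_plus_two.gml")`, after which `user_id` is stored as `userid`.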
We should also add to the documentation a note about this.
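To see in advance which attribute names would be rejected, the keys can be checked against the rule described in the answer. This is only a sketch: the pattern below (a letter followed by letters and digits) is my reading of "no underscores", the exact check that write_gml applies may differ between NetworkX versions, and `invalid_gml_keys` is a hypothetical helper, not a NetworkX function.

```python
import re

# Assumption: a valid key starts with a letter and contains only letters and digits
GML_KEY = re.compile(r"[A-Za-z][0-9A-Za-z]*")


def invalid_gml_keys(attr_dicts):
    """Return the sorted attribute names that do not fully match GML_KEY."""
    bad = set()
    for attrs in attr_dicts:
        for key in attrs:
            if not (isinstance(key, str) and GML_KEY.fullmatch(key)):
                bad.add(key)
    return sorted(bad, key=str)


# A subset of the node data shown in the question
node = {"ffr": 72.4898, "file": "wingsscotland.dat", "user_id": "1663413990"}
print(invalid_gml_keys([node]))  # ['user_id']
```

For a whole graph, the node attribute dicts can be passed as `invalid_gml_keys(data for _, data in f.nodes(data=True))`.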