
KeyError when writing NumPy values to GEXF with NetworkX

Hi everyone, I'd like to compute node coordinates, export the graph to GEXF, and then process it with Gephi. However, when I run the following code:

import networkx as nx
import numpy as np
....
area_ratios = [np.sum(new[:,0])/Stotal, np.sum(new[:,1])/Stotal, np.sum(new[:,2])/Stotal]
X = np.array([0, -sqrt(3)/2 * area_ratios[1] , sqrt(3)/2 * area_ratios[2]])
Y = np.array([ area_ratios[0], -1/2 * area_ratios[1] , -1/2 * area_ratios[2]])

point = (np.sum(X), np.sum(Y))

graph.add_node(node_name, {'x-coord': np.asscalar(point[0]*SCALE_FACTOR),         
          'y-coord': np.asscalar(point[1]*SCALE_FACTOR), 'size': Stotal*3})

nx.write_gexf(graph, PATH + 'mygraph.gexf')

it gives me a KeyError: <type 'numpy.float64'>, even though np.asscalar is meant to convert the relevant attributes to a compatible Python type.

Any ideas?

Yannis P. asked Feb 26 '14 09:02



1 Answer

It looks like this was solved a long time ago, but my code had a similar problem when using float values from a pandas DataFrame. The solution was in the comments, but it took me a while to figure out, so I thought I would clarify.

If you are building your nodes from a DataFrame like this:

G.add_node(df2.loc[row,door_col],
                attr_dict={'dropoff':df2.loc[row,'A'],
                            'pageLoadTime':df2.loc[row,'B'],
                            'pageviews':df2.loc[row,'C'],
                            'sessions':df2.loc[row,'D'],
                            'entrances':df2.loc[row,'E'],
                            'exits':df2.loc[row,'F'],
                            'timeOnPage':df2.loc[row,'G'],
                            'classesB':df2.loc[row,'H']}) 

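Assuming columns A-G hold floats, the values returned by df2.loc are np.float64 objects, not plain Python float objects, and nx.write_gexf() will fail with the KeyError from the question. The easy fix is to coerce them to built-in types before adding the node (float() for the numeric columns, str() for column H), something like this: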

G.add_node(df2.loc[row,door_col],
                attr_dict={'dropoff':float(df2.loc[row,'A']),
                            'pageLoadTime':float(df2.loc[row,'B']),
                            'pageviews':float(df2.loc[row,'C']),
                            'sessions':float(df2.loc[row,'D']),
                            'entrances':float(df2.loc[row,'E']),
                            'exits':float(df2.loc[row,'F']),
                            'timeOnPage':float(df2.loc[row,'G']),
                            'classesB':str(df2.loc[row,'H'])}) 

Many tools struggle with np.float64 values. Converting them to built-in Python types before export is usually the easiest option.
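
If you have many attributes, wrapping each one in float() by hand gets tedious. Here is a minimal sketch of a helper that does the conversion for a whole attribute dict. The names XML_TYPES, to_native and native_attrs are made up for this example, and XML_TYPES is only a toy stand-in to show how a lookup keyed on the exact type can miss np.float64; it is not NetworkX's actual code.

```python
import numpy as np

# Toy stand-in for a writer that maps exact Python types to XML type names.
# Illustration only; this is an assumption, not NetworkX's real table.
XML_TYPES = {int: "integer", float: "double", bool: "boolean", str: "string"}

value = np.float64(0.25)
print(isinstance(value, float))   # np.float64 is a subclass of float
print(type(value) in XML_TYPES)   # but its exact type is not a key in the table


def to_native(value):
    """Return a built-in Python scalar for NumPy scalars; pass anything else through."""
    if isinstance(value, np.generic):
        return value.item()
    return value


def native_attrs(attrs):
    """Convert every value of an attribute dict with to_native()."""
    return {key: to_native(val) for key, val in attrs.items()}


# Hypothetical attribute values, similar to what df2.loc[row, col] would return.
raw = {"dropoff": np.float64(0.25), "sessions": np.int64(12), "classesB": "landing"}
clean = native_attrs(raw)
print({key: type(val).__name__ for key, val in clean.items()})
```

With a helper like this, the dict in the add_node call above could be passed through native_attrs() once instead of converting each column separately. Unlike the explicit float() calls, integer columns stay int, so use float() directly if you need every numeric attribute to be a float.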

billmanH answered Oct 17 '22 20:10
