
Best practices for Querying graphs by edge and node attributes in NetworkX


I'm new to NetworkX and am using it for a social network analysis query. By "query" I mean selecting or creating subgraphs by attributes of both edges and nodes, where the edges form a path and the nodes carry attributes. The graph is a MultiDiGraph of the form:

    import networkx as nx

    G2 = nx.MultiDiGraph()
    # Node attributes are passed as keywords; a positional dict is rejected by NetworkX 2.x and later
    G2.add_node("UserA", type="Cat")
    G2.add_node("UserB", type="Dog")
    G2.add_node("UserC", type="Mouse")
    G2.add_node("Likes", type="Feeling")
    G2.add_node("Hates", type="Feeling")

    G2.add_edge("UserA", "Hates", statementid="1")
    G2.add_edge("Hates", "UserB", statementid="1")
    G2.add_edge("UserC", "Hates", statementid="2")
    G2.add_edge("Hates", "UserA", statementid="2")
    G2.add_edge("UserB", "Hates", statementid="3")
    G2.add_edge("Hates", "UserA", statementid="3")
    G2.add_edge("UserC", "Likes", statementid="3")
    G2.add_edge("Likes", "UserB", statementid="3")

Queried with

    # nodes(data=True) replaces nodes_iter(data=True), which was removed in NetworkX 2.0
    for node, data in G2.nodes(data=True):
        if data['type'] == "Cat":
            # get all edges out from these nodes,
            # then recursively follow using a filter for a specific statementid
            pass

    # or get all edges with a specific statementid
    # and look for those with a node attribute of "Cat"

Is there a better way to query? Or is it best practice to create custom iterations to create subgraphs?

Alternatively (and a separate question), the graph could be simplified, but I'm not using the graph below because the "hates"-type objects will have predecessors. Would this make querying simpler? It seems easier to iterate over nodes.

    G3 = nx.MultiDiGraph()
    G3.add_node("UserA", type="Cat")
    G3.add_node("UserB", type="Dog")

    G3.add_edge("UserA", "UserB", statementid="1", label="hates")
    G3.add_edge("UserA", "UserB", statementid="2", label="hates")

Other notes:

  • Perhaps add_path adds an identifier to the path created?
  • iGraph has a nice query feature g.vs.select()
Jonathan Hendler asked Mar 26 '13 18:03





1 Answer

It's pretty straightforward to write a one-liner that makes a list or generator of nodes with a specific property (generators shown here):

    import networkx as nx

    G = nx.Graph()
    G.add_node(1, label='one')
    G.add_node(2, label='fish')
    G.add_node(3, label='two')
    G.add_node(4, label='fish')

    # method 1: look up each node's attribute dict
    # (G.nodes[n] replaces G.node[n], which was removed in NetworkX 2.4)
    fish = (n for n in G if G.nodes[n]['label'] == 'fish')
    # method 2: iterate over (node, attribute dict) pairs
    fish2 = (n for n, d in G.nodes(data=True) if d['label'] == 'fish')

    print(list(fish))
    print(list(fish2))

    G.add_edge(1, 2, color='red')
    G.add_edge(2, 3, color='blue')

    # the same pattern works for edges and their attributes
    red = ((u, v) for u, v, d in G.edges(data=True) if d['color'] == 'red')

    print(list(red))
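The same filtering pattern can be pointed at the question's MultiDiGraph. The following is only a sketch of one possible approach: the helper name `statement_subgraph` is hypothetical, the graph is a trimmed copy of the question's `G2`, and it assumes NetworkX 2.0 or later, where `edges(keys=True, data=True)` and `edge_subgraph` are available.

```python
import networkx as nx

# Trimmed copy of the question's graph, with attributes passed as keywords.
G2 = nx.MultiDiGraph()
G2.add_node("UserA", type="Cat")
G2.add_node("UserB", type="Dog")
G2.add_node("UserC", type="Mouse")
G2.add_node("Hates", type="Feeling")
G2.add_edge("UserA", "Hates", statementid="1")
G2.add_edge("Hates", "UserB", statementid="1")
G2.add_edge("UserC", "Hates", statementid="2")
G2.add_edge("Hates", "UserA", statementid="2")


def statement_subgraph(G, statementid):
    """Return the subgraph induced by edges carrying the given statementid.

    Hypothetical helper for illustration, not part of NetworkX.
    """
    # In a multigraph an edge is identified by (u, v, key), so keep the key.
    edges = [
        (u, v, k)
        for u, v, k, d in G.edges(keys=True, data=True)
        if d.get("statementid") == statementid
    ]
    return G.edge_subgraph(edges)


# Select statement "1", then filter its nodes by a node attribute.
sub = statement_subgraph(G2, "1")
cats = [n for n, d in sub.nodes(data=True) if d.get("type") == "Cat"]
print(sorted(sub.nodes()))
print(cats)
```

`edge_subgraph` returns a view that shares node and edge attributes with the original graph, which is why the node filter still sees `type`. If an independent graph is needed, calling `.copy()` on the result should produce one.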

If your graph is large and fixed and you want to do fast lookups, you could make a "reverse dictionary" of the attributes like this:

    # map each label value to the list of nodes that carry it
    labels = {}
    for n, d in G.nodes(data=True):
        l = d['label']
        labels[l] = labels.get(l, [])
        labels[l].append(n)
    print(labels)
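The reverse dictionary can also be wrapped in a small reusable function. This is a minimal sketch, not part of NetworkX: the name `index_by_attribute` is hypothetical, and the plain list of pairs stands in for `G.nodes(data=True)`, which yields `(node, attribute_dict)` pairs of the same shape. Unlike the loop above, it skips nodes that lack the attribute instead of raising `KeyError`.

```python
from collections import defaultdict


def index_by_attribute(node_data_pairs, key):
    """Group nodes by the value of one attribute.

    node_data_pairs is any iterable of (node, attribute_dict) pairs,
    for example G.nodes(data=True). Hypothetical helper for illustration.
    """
    index = defaultdict(list)
    for node, data in node_data_pairs:
        if key in data:  # skip nodes that lack the attribute
            index[data[key]].append(node)
    return dict(index)


# Plain pairs mirroring the four labelled nodes of the example graph.
pairs = [
    (1, {"label": "one"}),
    (2, {"label": "fish"}),
    (3, {"label": "two"}),
    (4, {"label": "fish"}),
]
labels = index_by_attribute(pairs, "label")
print(labels["fish"])  # [2, 4]
```

The index is a snapshot: it has to be rebuilt if nodes or attributes change, which is why this approach suits a graph that is large and fixed.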
Aric answered Oct 09 '22 20:10
