Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to persist community information in a graph

I have some graph databases (friends networks, purchasing history, etc.) that I persist with Neo4j. I plan to analyze these with community detection algorithms such as Girvan Newman. These algorithms usually return a dendrogram, representing the division of the graph from whole network to individual nodes. I am wondering how I might persist these results. I suppose it could be stored as a separate graph, but is there a way to store it within the graph itself? My concern in doing so is the need for creating nodes to represent the groups, which is something I would like to avoid.

like image 600
Paul Jackson Avatar asked Feb 23 '23 05:02

Paul Jackson


2 Answers

One way to represent a dendrogram is as a list of pairs, containing (n-1) pairs for n elements. Assuming the left element of the pair is the one whose ID is kept to refer to all elements in a community, a sample dendrogram might look like

[[0,1],[2,3],[0,2]]

So an alternative way to persist that might be to store at each node at which time step it is merged into another node (together with all the nodes that have been previously merged into it).

So you'd attach (0:0) to 1, (1:2) to 3 and (2:0) to 2 (timestep:new 'name' of node).

edit: Concretely, this might mean attaching two integer-valued attributes e.g. 'merge_timestep' and 'merge_into' to each Neo4J node object.

like image 123
Nicolas78 Avatar answered Feb 24 '23 18:02

Nicolas78


Most community detection algorithms work by agglomerating communities along existing edges in the graph; Girvan-Newman is a little unusual in that it works by cutting edges. Either way, the dendrogram can be viewed as showing an ordering of operations on the edges of the graph. Thus, instead of storing the dendrogram as a separate object, you can attach properties to the edges (relationships) showing in which order they should be merged/cut. My knowledge of Neo4j is extremely limited, so I'll leave the details to you.

There are some complications with merging, as there will generally be multiple equivalent edges, each linking different vertices within the communities to merge. Basically, just pick a strategy that lets you figure out the linked communities from the edges.

like image 35
Michael J. Barber Avatar answered Feb 24 '23 17:02

Michael J. Barber