Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Graph partition algo with Neo4j graph database

I know there has some famous graph partition algo tools like METIS which is implemented by karypis Lab (http://glaros.dtc.umn.edu/gkhome/metis/metis/overview)

but I wanna know is there any method to partition graph stored in Neo4j? or I have to dump the Neo4j's data and transform the node and edge format manually to fit the METIS input format?

like image 428
Arvin Avatar asked Aug 08 '13 04:08

Arvin


1 Answers

Regarding new-ish and interesting algorithms, this is by no means exhaustive or state of the art, but these are the first places I would look:

Specific Algorithm: DiDiC (Distributed Diffusive Clustering) - I used it once in my thesis (Partitioning Graph Databases)

  • You iterate over all nodes, then for each node retrieve all neighbors, in order to spread some of "some unit" to all your neighbors
  • Easy to implement.
  • Can be made deterministic
  • Iterative - as it's based on iterations (like Super Steps in Pregel) you can stop it at any time. The longer you leave it the better the result, in theory (though in some cases, on certain graph shapes it can be unstable)
  • When we implemented this we ran it for 100 iterations on a machine with ~30GB RAM, for up to ~4 million nodes - it took no more than two days to complete.

Specific Algorithm: EvoCut "Finding sparse cuts locally using evolving sets" - local probabilistic algorithm from Microsoft - related to these papers

  • Difficult to implement
  • Local algorithm - BFS-like access patterns (random walks)
  • It's been a while since i read that paper, but i remember it was built on clean abstractions:
    • EvoNibble (pluggable - decides how much of neighborhood to add to the current cluster
    • EvoCut (calls EvoNibble multiple times to find the local cluster)
    • EvoPartition (calls EvoCut repeatedly to partition entire graph)
  • Not deterministic

General Algorithm Family: Hierarchical Graph Clustering

From a high level:

  • Coarsen the graph by collapsing nodes into aggregate nodes
    • coarsening strategy is selectable
  • Find clusters in the coarsened/smaller graph
    • clustering strategy is selectable
  • Incrementally decoarsen the graph, refining at the clustering at each step
    • refining strategy is selectable

Notes:

  • If the graph changes slowly (or results don't need to be right up to date) it may be possible to coarsen once (or infrequently) then work with the coarsened graph - to save computation
  • I don't know of a specific algorithm to recommend

General limitations - the things few clustering algorithms do:

  • Node types not acknowledged - i.e., all nodes treated equally
  • Relationship types not acknowledged - i.e., all relationships treated equally
  • Relationship direction not acknowledged - i.e., relationships treated as undirected
like image 113
Alex Averbuch Avatar answered Sep 21 '22 07:09

Alex Averbuch