Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java library for storing and processing large (up to 600k vertices) graphs

I'm working on a project which will involve running algorithms on large graphs. The largest two have around 300k and 600k vertices (fairly sparse I think). I'm hoping to find a java library that can handle graphs that large, and also trees of a somewhat smaller size, as one of the algorithms I'll be using involves decomposing a graph into a tree. Ideally the library would also include breadth first search and Dijkstra's or other shortest-path algorithms.

Based on another question, I've been looking at a few libraries (JGraphT, JUNG, jdsl, yworks) but I'm having a hard time finding out how many vertices they can realistically handle. Looking at their documentation, all I could find was a bit in the JUNG FAQ that said it could easily handle graphs of upwards of 150k vertices, which is still quite a bit smaller than my graphs... I'm hoping someone here has used one or more of these libraries and can tell me if it'll handle the graph sizes I need, or if there's some other library that would be better.

For the record I don't need any visualization tools; this is strictly about representing the graphs and trees in data structures and running algorithms on them.

Background if anyone really cares: for a class I'm supposed to implement an algorithm described in a research paper, and run the experiments run in the paper as best I can. The paper and datasets I'll be using can be found here. My professor says I can use any library I can find as long as I can tell what the time/space complexity of the algorithms/data structures are.

like image 837
Maltiriel Avatar asked Mar 29 '12 17:03

Maltiriel


3 Answers

You should take a look at Neo4J which is a graphical database which might be a good solution for your problems.

like image 132
khmarbaise Avatar answered Oct 27 '22 06:10

khmarbaise


Checkout JGraph as well. However it is oriented towards visualization.

Also, maybe Apache Hama - a distributed computing framework for massive scientific computations e.g., matrix, graph and network algorithms.

Annas may also interest you - open-source Java framework that was built for developers and researchers in the fields of Graph Theory - AI, Path finding, distributed systems, etc.

like image 24
tenorsax Avatar answered Oct 27 '22 07:10

tenorsax


Cassovary https://github.com/twitter/cassovary -project from Twitter can handle very big graphs with Scala (thus JVM) in memory.

Alternatively, GraphChi's Java version can handle even bigger graphs, by using disk: http://code.google.com/p/graphchi-java/

However, GraphChi will not be efficient for exact shortest-path type algorithms, as they require fast random access.

like image 44
Aapo Kyrola Avatar answered Oct 27 '22 06:10

Aapo Kyrola