
Any open source Pregel-like framework for distributed processing of large graphs?


Google has described Pregel, a novel framework for distributed processing of massive graphs.

http://portal.acm.org/citation.cfm?id=1582716.1582723

Are there any open source implementations of this framework, in the way that Hadoop is an open source implementation of MapReduce?

I am in the process of writing a pseudo-distributed one using Python and the multiprocessing module, so I wanted to know whether anyone else has tried implementing it, since public information about this framework is extremely scarce (the link above and a blog post at Google Research).
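For anyone unfamiliar with the model, here is a minimal single-process sketch of the vertex-centric, superstep-based computation that Pregel describes: in each superstep a vertex reads the messages sent to it in the previous superstep, updates its value, optionally messages its out-neighbours, and votes to halt until a new message wakes it up. The example propagates the maximum vertex value through the graph. The function name `pregel_max` and the dict-based graph representation are made up for illustration and do not come from any library, and the sketch deliberately leaves out multiprocessing, partitioning, combiners and fault tolerance.

```python
from collections import defaultdict


def pregel_max(graph, values, max_supersteps=50):
    """Propagate the maximum vertex value along the edges of a graph.

    graph:  dict mapping each vertex to a list of its out-neighbours
            (assumption: every vertex appears as a key)
    values: dict mapping each vertex to its initial numeric value
    """
    values = dict(values)        # work on a copy, leave the input untouched
    active = set(graph)          # every vertex is active in superstep 0
    inbox = defaultdict(list)    # messages to be read in the current superstep

    for superstep in range(max_supersteps):
        if not active and not inbox:
            break                # all vertices halted, no messages in flight

        outbox = defaultdict(list)
        # A vertex runs if it is still active or was woken up by a message.
        for v in active | set(inbox):
            new_value = max([values[v]] + inbox.get(v, []))
            changed = new_value > values[v]
            values[v] = new_value
            # Send the value onward at the start, or whenever it grew.
            if superstep == 0 or changed:
                for neighbour in graph[v]:
                    outbox[neighbour].append(new_value)

        # Every vertex votes to halt; only an incoming message reactivates it.
        active = set()
        inbox = outbox           # deliver messages at the superstep barrier

    return values
```

Called on a small graph, e.g. `pregel_max({"a": ["b"], "b": ["a", "c"], "c": ["b"]}, {"a": 3, "b": 6, "c": 2})`, every vertex reachable from the one holding the largest value ends up with that value. In a pseudo-distributed version, the inner loop over vertices is the part that would be split across worker processes, with the `inbox = outbox` swap acting as the synchronisation barrier.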

Akshay Bhat asked Jun 06 '10 21:06




2 Answers

  • Apache Giraph http://giraph.apache.org
  • Phoebus https://github.com/xslogic/phoebus
  • Bagel https://github.com/mesos/spark/pull/48
  • Hama http://hama.apache.org/
  • Signal-Collect http://code.google.com/p/signal-collect/
  • HipG http://www.cs.vu.nl/~ekr/hipg/
Yury Litvinov answered Oct 25 '22 02:10


The main Hadoop project for distributed graph processing is the Hama project. It's still in incubation, though.

The project has broken its work into two areas: a matrix package and a graph package.

Update:

A better option would be the Apache Giraph project, which is based on Google's Pregel.

Binary Nerd answered Oct 25 '22 02:10