JVM (embarrassingly) parallel processing libraries/tools

Question

I am looking for something that will make it easy to run (correctly coded) embarrassingly parallel JVM code on a cluster (so that I can use Clojure + Incanter).

I have used Parallel Python in the past to do this. We have a new PBS cluster and our admin will soon set up IPython nodes that use PBS as the backend. Both of these systems make it almost a no-brainer to run certain types of code in a cluster.

I made the mistake of using Hadoop in the past. It is just not suited to the kind of data that I use: the latency made even small runs take 1-2 minutes.

Is JPPF or GridGain better for what I need? Does anyone here have any experience with either? Is there anything else you can recommend?
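For readers unfamiliar with the term: a job is embarrassingly parallel when its tasks share no state and never talk to each other, so they can be farmed out in any order. A minimal single-JVM sketch in Java follows, using only `java.util.concurrent`. The `simulate` method and the class name are hypothetical placeholders for the real per-task work, and this is not how JPPF, GridGain, or any cluster scheduler distributes tasks; it only illustrates the shape of the workload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class EmbarrassinglyParallel {

    // Hypothetical stand-in for one independent unit of work.
    // It depends only on its own input, never on another task.
    static long simulate(int seed) {
        long acc = seed;
        for (int i = 0; i < 1_000; i++) {
            acc = (acc * 31 + i) % 1_000_003;
        }
        return acc;
    }

    // Runs taskCount independent tasks on a fixed-size thread pool
    // and returns the results in submission order.
    static List<Long> runAll(int taskCount, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int seed = 0; seed < taskCount; seed++) {
                final int s = seed;
                tasks.add(() -> simulate(s));
            }
            List<Long> results = new ArrayList<>();
            // invokeAll blocks until every task has completed.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Long> results = runAll(8, 4);
        System.out.println(results.size() + " independent tasks finished");
    }
}
```

Because no task waits on another, the same structure maps naturally onto a cluster: each `Callable` becomes a job handed to a remote worker instead of a local thread.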

simon-says · Accepted Answer

Check out Cascalog: http://github.com/nathanmarz/cascalog

Tags: java, jvm, parallel-processing, clojure, embarrassingly-parallel

Wynand Winterbach