Are there any good language-agnostic distributed make systems for Linux that are secure and free?
Background Information:
I run scientific experiments (computer-science ones) that sometimes have large dependency trees, occasionally on the order of thousands or tens of thousands of tree nodes. This dependency tree is over the data files, data processing executables, and results files.
I've experimented with various techniques over the years including:
I'm basically looking for something like distmake, but more secure. As far as I can tell, distmake essentially leaves a wide-open backdoor into each worker node.
It would also be nice if a replacement were more robust than distmake. If you break out of the main distmake call, it can shut down the backdoor servers, but it doesn't properly kill the executing processes on the worker nodes.
Clarifications:
I am processing data with the makefile, not compiling and linking with gcc. From what I read in the documentation, distcc is a specialized tool for distributing gcc. I'll be running my own executables on very large data files hosted on a shared filesystem, not gcc on source files, so distcc isn't helpful.
The worker nodes are externally visible machines, so I want any worker daemons to be at least as secure as ssh. As best I can tell without reading the source, distmake worker daemons open a port and accept commands from anyone who connects to it. They then execute those commands as the user who started the daemon.
Dependencies are hard to manage, and I don't know of any perfect system that does what you want without a significant amount of work.
The closest thing that I've used is the following setup:
- a Condor queue to manage the machines in your cluster
- the Condor DAGMan meta-scheduler to submit jobs that are interdependent
DAGMan stands for Directed Acyclic Graph Manager: a directed acyclic graph represents the dependencies between your jobs.
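To give a feel for what DAGMan consumes, here is a minimal sketch that turns a dependency mapping into DAGMan input text using its `JOB` and `PARENT ... CHILD` statements. The function name `render_dag`, the job names, and the `.sub` submit file names are all made up for illustration; each submit file would still have to be written separately, and this is not how our lab's protocol was actually generated.

```python
from typing import Dict, List


def render_dag(submit_files: Dict[str, str], parents: Dict[str, List[str]]) -> str:
    """Render DAGMan input text from a dependency mapping.

    submit_files maps a job name to its Condor submit description file.
    parents maps a job name to the jobs that must finish before it starts.
    """
    # Refuse dependencies that mention a job with no submit file.
    for child, deps in parents.items():
        for name in [child, *deps]:
            if name not in submit_files:
                raise ValueError(f"unknown job: {name}")

    # One JOB line per node in the graph.
    lines = [f"JOB {name} {path}" for name, path in submit_files.items()]

    # One PARENT ... CHILD line per job that has prerequisites.
    for child, deps in parents.items():
        if deps:
            lines.append(f"PARENT {' '.join(deps)} CHILD {child}")
    return "\n".join(lines) + "\n"


# Hypothetical pipeline: two cleaning jobs feed one merge job.
dag_text = render_dag(
    {"clean_a": "clean_a.sub", "clean_b": "clean_b.sub", "merge": "merge.sub"},
    {"merge": ["clean_a", "clean_b"]},
)
print(dag_text)
```

Saving the printed text to a file and passing that file to `condor_submit_dag` is the usual way to hand the graph to DAGMan. For a tree with thousands of nodes you would generate the mapping from whatever already describes your dependencies rather than typing it by hand.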
We've done this very successfully for an iterative scientific protocol in our lab, although getting the initial implementation running was a learning experience even for a very talented postdoc. It does require that you set up and run a Condor cluster, which is non-trivial, but I assume you have either Condor or something similar to manage all of your machines. Sun Grid Engine might have something analogous that I don't know about.