C/C++ Framework for distributed computing in a dynamic cluster

Question

I am looking for a framework to be used in a C++ distributed number crunching application.

The setup looks as follows:

There is a master node which divides the problem domain into small independent tasks. The tasks are distributed to worker nodes of different capability (e.g. CPU type/GPU-enabled). Worker nodes are dynamically added to the compute grid as they become available. It may also happen that a worker node dies without saying goodbye.

I am searching for a fast C/C++ framework to accomplish this setup.

To summarize, my main requirements are:

Worker/Task-scheduling paradigm
Dynamically add/remove nodes
Target network: 1-10 Gbit/s Ethernet (corporate network; good performance over the internet is not required)
Optional: Encrypted and authenticated communication

High Performance Mark · Accepted Answer

You can certainly do what you want with MPI. MPI-2 added dynamic process management features, and I think most of the currently widely-used implementations offer these.

One of the advantages of using C++ + MPI is that the combination is quite widely used in scientific and technical computing, though my impression is that within this niche dynamic process management is not used very much. Since MPI is used on the very largest supercomputers tackling the bleeding-edge problems of computational science, one might hazard a guess that it would be fast enough for your purposes.

One of the disadvantages of using C++ + MPI is that MPI was not designed to tolerate failure of processes during execution. There is debate on SO about whether the dynamic process management features allow you to program your own fault tolerance, but there is no debate that doing so might be difficult.
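
To give a rough idea of what "programming your own fault tolerance" could mean on the master side, here is a minimal sketch of the bookkeeping involved. It is plain C++17 and deliberately independent of MPI: `TaskPool`, `worker_lost`, `demo` and the node names are all hypothetical, and how the master actually notices a dead worker (a heartbeat timeout, a failed send, ...) is assumed to happen elsewhere.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <optional>
#include <set>
#include <string>

// Hypothetical master-side bookkeeping; none of these names come from MPI.
class TaskPool {
public:
    explicit TaskPool(int task_count) : total_(task_count) {
        for (int id = 0; id < task_count; ++id) pending_.push_back(id);
    }

    // A worker may join the grid at any time.
    void add_worker(const std::string& worker) { in_flight_[worker]; }

    // Hand the next pending task to a known worker, if one is left.
    std::optional<int> assign(const std::string& worker) {
        auto it = in_flight_.find(worker);
        if (it == in_flight_.end() || pending_.empty()) return std::nullopt;
        int id = pending_.front();
        pending_.pop_front();
        it->second.insert(id);
        return id;
    }

    // The worker reported a result for a task it was actually given.
    void complete(const std::string& worker, int id) {
        auto it = in_flight_.find(worker);
        if (it != in_flight_.end() && it->second.erase(id) == 1) done_.insert(id);
    }

    // The worker vanished without notice: requeue its unfinished tasks.
    void worker_lost(const std::string& worker) {
        auto it = in_flight_.find(worker);
        if (it == in_flight_.end()) return;
        for (int id : it->second) pending_.push_back(id);
        in_flight_.erase(it);
    }

    std::size_t pending() const { return pending_.size(); }
    bool all_done() const { return done_.size() == static_cast<std::size_t>(total_); }

private:
    int total_;
    std::deque<int> pending_;                          // tasks not yet handed out
    std::map<std::string, std::set<int>> in_flight_;   // tasks per live worker
    std::set<int> done_;                               // finished task ids
};

// Small scenario: one worker dies mid-task, the other picks up its work.
bool demo() {
    TaskPool pool(4);
    pool.add_worker("cpu-node");
    pool.add_worker("gpu-node");
    std::optional<int> first = pool.assign("cpu-node");
    pool.assign("gpu-node");             // handed out, but never completed
    if (first) pool.complete("cpu-node", *first);
    pool.worker_lost("gpu-node");        // its task goes back to the queue
    while (auto id = pool.assign("cpu-node")) pool.complete("cpu-node", *id);
    return pool.all_done();
}
```

Calling `demo()` from a `main` exercises the scenario from the question: a node disappears and its task is reassigned. The hard part in a real MPI program is not this bookkeeping but detecting the failure and keeping the remaining processes usable afterwards, which the sketch does not address.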

You would get the first 3 of your requirements 'out-of-the-box'. As for encrypted and authenticated communication, you'd have to do most of that yourself, since MPI just passes messages around. I'd guess that for most MPI users, who run parallel applications on clusters or supercomputers with private interconnects (often themselves isolated from corporate or enterprise networks), encryption and authentication are matters of little concern.

Tags:

c++

c

scheduled-tasks

distributed-computing

hpc

Erik
