
What are the main differences between TORQUE, HTCondor and Apache Mesos?

  • http://www.adaptivecomputing.com/products/open-source/torque/
  • https://research.cs.wisc.edu/htcondor/

I am looking for a program to perform distributed computing (no parallel computing needed though) which has:

  • a scheduler
  • queue management (FIFO, or preferably something more advanced)
  • good statistics reporting
  • the ability to run on a heterogeneous cluster (a set of machines with different characteristics such as CPU and memory)
  • and, very importantly, good responsiveness: a few seconds at most between triggering a task and the actual start of its execution. I have heard that this may be tricky to achieve with HTCondor and TORQUE. What about Apache Mesos?
RockScience asked Nov 09 '22 22:11


1 Answer

There is a fairly large Wikipedia page comparing these systems, but you will hardly find large differences there. My guess is that most things could in theory be done in any of these frameworks. The things you list all depend on perspective (for example, people commonly write their own sophisticated statistics from HTCondor logs).

Regarding responsiveness: HTCondor works fine for scheduling interactive notebooks if there are enough resources for the workers to pick up the job. A few seconds is often no problem, but there are hardly any guarantees. These are high-throughput systems, not low-latency systems. If you care about latency, you should preallocate workers and scale them up and down (here, support for running other frameworks on top helps much more than native latency).
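To illustrate what preallocating workers means, here is a minimal sketch using only the Python standard library. It is not tied to HTCondor, TORQUE or Mesos: `WarmPool`, `scale_to` and `submit` are hypothetical names made up for this example, and threads stand in for worker slots that a real setup would reserve on the cluster. The point is only that a task handed to an already running worker starts almost immediately, because no scheduling or matchmaking cycle sits in between.

```python
import queue
import threading
import time


class WarmPool:
    """Hypothetical pool of preallocated workers pulling tasks from a FIFO queue."""

    def __init__(self, size):
        self._tasks = queue.Queue()
        self.size = 0
        self.scale_to(size)

    def _run(self):
        while True:
            item = self._tasks.get()
            if item is None:  # sentinel: this worker retires
                return
            func, submitted, result = item
            # Time between submission and the actual start of execution.
            result["latency"] = time.monotonic() - submitted
            result["value"] = func()
            result["done"].set()

    def scale_to(self, size):
        """Grow by starting workers, shrink by queueing retirement sentinels."""
        while self.size < size:
            threading.Thread(target=self._run, daemon=True).start()
            self.size += 1
        while self.size > size:
            self._tasks.put(None)
            self.size -= 1

    def submit(self, func):
        result = {"done": threading.Event()}
        self._tasks.put((func, time.monotonic(), result))
        return result


pool = WarmPool(size=2)          # workers are already waiting
r = pool.submit(lambda: 6 * 7)   # picked up without any scheduling delay
r["done"].wait(timeout=5)
print(r["value"], f"start latency: {r['latency']:.4f}s")
```

In a real deployment the workers would typically be long-running batch jobs (or a framework such as Dask running on top of the batch system) that fetch small tasks themselves, and `scale_to` would translate into submitting or removing such jobs.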

I will try my best to highlight, from my perspective, the main focus of each project as far as it matters for a practical decision:

Target audience

Mesos:

  • PaaS/IaaS layer aimed at running other schedulers (you can run TORQUE on top of Mesos)
  • in particular, interoperates with big data frameworks such as Spark and Kafka

vs.

Both HTCondor & TORQUE:

  • fair-share batch processing, particularly in scientific clusters (High Throughput Computing)

Ecosystem

Mesos:

  • Apache open source project with community

vs.

HTCondor:

  • Open source, maintained by UW-Madison, with a classic user mailing list

vs.

TORQUE:

  • Proprietary, with commercial support

Ease of use

(this partly covers statistics, but is more about dashboard-style tooling)

Mesos & TORQUE:

  • Web UI
  • integrations with other frameworks are commonly available (for TORQUE, search for PBS)

HTCondor:

  • new, still-developing REST and Python interfaces, but no common GUI
  • lagging a tiny bit behind in framework support (R batchtools; lately it has gained Dask support)
till answered Dec 21 '22 22:12