
Kind of load-balanced thread pool in Java

I am looking for a load-balanced thread pool, with no success so far. (I am not sure whether "load balancing" is the correct wording.) Let me explain what I am trying to achieve.

Part 1: I have jobs, each consisting of 8 to 10 single tasks. On a 6-core CPU I let 8 threads work on these tasks in parallel, which seems to deliver the best performance. When one task is finished, another one can start. Once all ten tasks are finished, the complete job is done. Usually a job is done in 30 to 60 seconds.

Part 2: Sometimes, unfortunately, a job takes more than two hours. This is correct, given the amount of data that has to be calculated. The bad thing is that no other job can start while job 1 is running (assuming that all of its threads are busy for the same duration), because it is using all threads.

My first idea: have 12 threads and allow up to three jobs in parallel. But that means the CPU is not fully utilized when there is only one job.

I am looking for a solution that gives job 1 the full CPU power when there is no other job. But when another job needs to start while one is already running, I want the CPU power shared between both jobs. And when a third or fourth job shows up, I want the CPU power allocated fairly to all four jobs.

I appreciate your answers...

Thanks in advance.

Christian Rockrohr asked Jan 19 '13 14:01


2 Answers

One possibility might be to use a standard ThreadPoolExecutor with a different kind of task queue:

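import java.util.concurrent.BlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TaskRunner {
  // Wraps a task with a sort key; lower numbers leave the queue first.
  private static class PriorityRunnable implements Runnable,
            Comparable<PriorityRunnable> {
    private final Runnable theRunnable;
    private final int priority;

    public PriorityRunnable(Runnable r, int priority) {
      this.theRunnable = r;
      this.priority = priority;
    }

    public int getPriority() {
      return priority;
    }

    @Override
    public void run() {
      theRunnable.run();
    }

    @Override
    public int compareTo(PriorityRunnable that) {
      // Integer.compare avoids the overflow a plain subtraction could hit.
      return Integer.compare(this.priority, that.priority);
    }
  }

  private final BlockingQueue<Runnable> taskQueue = new PriorityBlockingQueue<Runnable>();

  // Fixed pool of 8 threads that pulls work from the priority queue.
  private final ThreadPoolExecutor exec = new ThreadPoolExecutor(8, 8, 0L,
            TimeUnit.MILLISECONDS, taskQueue);

  public void runTasks(Runnable... tasks) {
    // Start one above the head of the queue so that the new job's tasks
    // slot in between the queued tasks of earlier jobs.
    int priority = 0;
    Runnable nextTask = taskQueue.peek();
    if (nextTask instanceof PriorityRunnable) {
      priority = ((PriorityRunnable) nextTask).getPriority() + 1;
    }
    for (Runnable t : tasks) {
      // Use execute(), not submit(): submit() wraps the task in a
      // FutureTask, which is not Comparable.
      exec.execute(new PriorityRunnable(t, priority));
      priority += 100;
    }
  }
}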

The idea here is that when you have a new job you call

taskRunner.runTasks(jobTask1, jobTask2, jobTask3);

and it will queue up the tasks in such a way that they interleave nicely with any existing tasks in the queue (if any). Suppose you have one job queued, whose tasks have priority numbers j1t1=3, j1t2=103, and j1t3=203. In the absence of other jobs, these tasks will execute one after the other as quickly as possible. But if you submit another job with three tasks of its own, these will be assigned priority numbers j2t1=4, j2t2=104 and j2t3=204, meaning the queue now looks like

j1t1, j2t1, j1t2, j2t2, etc.

This is not perfect, however. If all threads are currently working on tasks from job 1, then the first task of job 2 can't start until one of the job 1 tasks is complete (unless there's some external way for you to detect this and interrupt and re-queue some of job 1's tasks). The easiest way to make things fairer would be to break the longer-running tasks down into smaller segments and queue those as separate tasks. You need to get to a point where each individual job involves more tasks than there are threads in the pool, so that some of the tasks always start off in the queue rather than being assigned directly to threads. (While the pool has started fewer than its core number of threads, exec.execute() hands the task straight to a new thread without going through the queue; after that, an idle thread picks up a queued task immediately, so the queue order only matters once every thread is busy.)
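To see the interleaving without an 8-core race, here is a minimal sketch that reuses the numbering rule from runTasks() on a pool with a single thread. The class and method names (InterleaveDemo, Prioritized, runDemo) are made up for illustration, and the tasks only record their labels instead of doing real work. The one worker is held back by a latch so that both jobs are fully queued before anything runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class InterleaveDemo {
  /** Hypothetical wrapper: a Runnable with a sort key; lower keys run first. */
  static final class Prioritized implements Runnable, Comparable<Prioritized> {
    final int priority;
    final Runnable body;

    Prioritized(int priority, Runnable body) {
      this.priority = priority;
      this.body = body;
    }

    @Override public void run() { body.run(); }

    @Override public int compareTo(Prioritized o) {
      return Integer.compare(priority, o.priority);
    }
  }

  /** Queues two jobs of `segments` tasks each and returns the order they ran in. */
  static List<String> runDemo(int segments) {
    PriorityBlockingQueue<Runnable> queue = new PriorityBlockingQueue<>();
    ThreadPoolExecutor exec =
        new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, queue);
    List<String> order = new CopyOnWriteArrayList<>();
    CountDownLatch gate = new CountDownLatch(1);

    // Occupy the only worker so everything submitted next waits in the queue.
    exec.execute(new Prioritized(Integer.MIN_VALUE, () -> {
      try {
        gate.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }));

    for (int job = 1; job <= 2; job++) {
      // Same rule as runTasks(): start one above the current head of the queue.
      Runnable head = queue.peek();
      int priority = (head instanceof Prioritized)
          ? ((Prioritized) head).priority + 1 : 0;
      for (int seg = 1; seg <= segments; seg++) {
        String label = "j" + job + "t" + seg;
        exec.execute(new Prioritized(priority, () -> order.add(label)));
        priority += 100;
      }
    }

    gate.countDown();   // release the worker; it now drains the queue by priority
    exec.shutdown();
    try {
      exec.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for tasks", e);
    }
    return new ArrayList<>(order);
  }

  public static void main(String[] args) {
    System.out.println(runDemo(3));
  }
}
```

With three segments per job, main prints [j1t1, j2t1, j1t2, j2t2, j1t3, j2t3]. Calling runDemo with a larger segment count stands in for splitting each job into smaller pieces: the more pieces sit in the queue, the more often a later job gets a turn.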

Ian Roberts answered Oct 25 '22 01:10


The easiest thing to do is to oversubscribe your CPU, as Kanaga suggests, but start 8 threads for each job. There may be some overhead from the contention, but when only a single job is running it will still fully utilize the CPU. The OS will handle giving time to each thread.
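A rough sketch of that oversubscription approach, assuming each job is a list of Callable tasks: every job gets its own fixed pool of 8 threads and the OS scheduler shares the cores between whichever pools are busy. PerJobPools, runJob and demoJob are illustrative names, and the squaring tasks are placeholders for the real calculations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerJobPools {
  static final int THREADS_PER_JOB = 8;

  /** Runs all tasks of one job on a pool of its own; results are in task order. */
  static <T> List<T> runJob(List<Callable<T>> tasks) {
    ExecutorService pool = Executors.newFixedThreadPool(THREADS_PER_JOB);
    try {
      List<T> results = new ArrayList<>();
      // invokeAll blocks until every task of this job has completed.
      for (Future<T> f : pool.invokeAll(tasks)) {
        results.add(f.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for job", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("task failed", e.getCause());
    } finally {
      pool.shutdown();  // the pool's threads go away once the job is done
    }
  }

  /** Placeholder job: taskCount tasks that each square their index; returns the sum. */
  static long demoJob(int taskCount) {
    List<Callable<Long>> tasks = new ArrayList<>();
    for (int i = 0; i < taskCount; i++) {
      final long n = i;
      tasks.add(() -> n * n);
    }
    long sum = 0;
    for (long r : runJob(tasks)) {
      sum += r;
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(demoJob(10));
  }
}
```

Each caller that has a job to run would call runJob from its own thread, so two concurrent jobs mean 16 runnable threads on 6 cores. Whether the extra context switching is acceptable is something only a measurement on the real workload can answer.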

Your "first idea" would also work. The idle threads wouldn't take resources from the 8 working threads as long as they aren't actually executing a task. This wouldn't distribute the CPU resources as evenly when there are multiple jobs running, though.

Do you have a setup where you can test these different pipelines to see how they're performing for you?

Joshua Martell answered Oct 24 '22 23:10