Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Distribute 10 infinite jobs over 4 threads in a load-balanced manner (Java)

I have 10 calculation jobs that take (near) infinite time. For example: calculate the next digit of PI, solve an NP-hard constraint satisfaction problem, etc.

I have 4 threads (so a thread pool with 4 threads on a machine with 8 cores, so I have some cores left to avoid live-locking the machine and the process).

Using Java 8, how do distribute these 10 jobs across those 4 threads?

This is a bad idea:

ExecutorService es = Executors.newFixedThreadPool(4);
for (Job j : jobs) {
    es.submit(j);
}

because 4 jobs will start, but none will finish so job 5-10 will never start.

If I look after for example 10 minutes, I would expect that each job has run for about 4 minutes. After 20 minutes, each job has run for about 8 minutes, etc. What are the typical patterns to deal with this? (If needed, I can implement a way to pauze a calculation after a preset amount of time.)

like image 770
Geoffrey De Smet Avatar asked Aug 26 '16 09:08

Geoffrey De Smet


4 Answers

The task of distributing ten jobs between four threads and the task of utilizing only four CPUs (I use CPU here as a synonym of core for simplicity) by your ten jobs are a bit different.

Four threads

Limiting the threads number to four will not guarantee that they will stick to four CPUs and won't use the others. OS is allowed to shuffle your threads between all available CPUs as it likes. The only thing you can guarantee, is that your program won't be able to utilize more than 50% of all the CPU resources (given the fact that you have eight CPUs).

But it's unlikely that you'll manage to utilize those 50%. Despite the fact that your jobs are primarily CPU-oriented, chances are that they still need read from and write to memory from time to time. When a thread misses cache on such readings/writings and waits for data to be delivered to the processor, this processor put the thread on hold and can do some work in another thread. In your case, it will have nothing to do and just sit idle until the data arrives. So, it's likely that your CPUs will be underutilized.

If you decide to go with this approach, you need to break your jobs into small tasks and feed them to the executors, as @James Large said. You can either use WorkStealingPool with four threads (as @Alexey Soshin proposed), or create a pool with ten threads and use a Semaphore with four permits and fairness set to true. In the latter case, your threads must use loops, acquire permits at the beginning of every iteration and release them at the end. Each iteration will represent a small chunk of work.

Four CPUs

There're mechanisms to designate particular CPUs to work on your tasks.

On the process level in Linux you can use special commands to bind your process to particular CPUs. That will allow you to create ten threads and let the OS do all the balancing on four CPUs.

On the threads level, you can try Java Affinity library from OpenHFT. It allows to bind threads to CPUs right in your Java code. The problem is that ten threads cannot be devided between four CPUs without reminder, so it will be hard to balance them.

like image 118
Andrew Lygin Avatar answered Nov 13 '22 10:11

Andrew Lygin


I think you're looking for WorkStealingPool:

static ExecutorService executor = Executors.newWorkStealingPool(4);
private static Map<Integer, AtomicInteger> map = new ConcurrentHashMap<>();

public static void main(String[] args) throws InterruptedException {


    for (int i = 0; i < 10; i++) {
        executor.submit(new Worker(i))  ;
    }

    Thread.sleep(10000);
    System.out.println(map);
}

private static class Worker implements Runnable {
    private final int k;

    public Worker(int k) {
        this.k = k;
    }

    @Override
    public void run() {
        map.putIfAbsent(k, new AtomicInteger(0));
        map.get(k).getAndIncrement();
        executor.submit(new Worker(this.k));

       // Also possible to resubmit current job
       //executor.submit(this);
    }
}
like image 40
Alexey Soshin Avatar answered Nov 13 '22 11:11

Alexey Soshin


If you need to execute 10 jobs in parallel - simply run 10 threads.

Change Executors.newFixedThreadPool(4) to Executors.newFixedThreadPool(10).

like image 1
Anton Malyshev Avatar answered Nov 13 '22 09:11

Anton Malyshev


I am somewhat bothered by the idea of a "Job that never finishes." I would call it something else, like "long-running computation" or,...

If you've got ten of them, and you can only afford four threads to work on them, then your only choice is to break them into finite "sub-jobs" that do finish, and then write a scheduler that keeps feeding sub-jobs to the four available threads.

But that would be replicating most of what the threading system is supposed to do for you.

I would just make ten threads. If you're running on a machine that only has four available cores to run the ten threads, the OS will automatically break your long running jobs into "sub-jobs" (i.e., time slices), and schedule them fairly on the four cores.

like image 1
Solomon Slow Avatar answered Nov 13 '22 11:11

Solomon Slow