My Mac has 16 cores.
System.out.println(Runtime.getRuntime().availableProcessors()); //16
I'm running the code below to see how effectively it uses my cores. The thread 'CountFileLineThread' simply counts the number of lines in a file (there are 133 files in a folder).
I'm taking notes on this line:
ExecutorService es = Executors.newFixedThreadPool(NUM_CORES);
Where NUM_CORES is between 1 and 16.
You will note from the result below that above 5 cores the performance starts to degrade. I wouldn't expect this kind of diminishing return for 6 cores and above (by the way, with 7 cores it takes over 22 minutes, hello?!?!). My question is: why?
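import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Util.getFiles and CountFileLineThread (a Callable<Integer>) are not shown in the question.
public class TestCores
{
    public static void main(String args[]) throws Exception
    {
        long start = System.currentTimeMillis();
        System.out.println("START");
        int NUM_CORES = 1; // thread pool size, varied between 1 and 16 across runs
        List<File> files = Util.getFiles("/Users/adhg/Desktop/DEST/");
        System.out.println("total files: "+files.size());
        ExecutorService es = Executors.newFixedThreadPool(NUM_CORES);
        List<Future<Integer>> futures = new ArrayList<Future<Integer>>();
        // Submit one line-counting task per file.
        for (File file : files)
        {
            Future<Integer> future = es.submit(new CountFileLineThread(file));
            futures.add(future);
        }
        // Collect the results in submission order; get() blocks until each task is done.
        Integer total = 0;
        for (Future<Integer> future : futures)
        {
            Integer result = future.get();
            total+=result;
            System.out.println("result :"+result);
        }
        es.shutdown(); // release the pool threads so the JVM can exit
        System.out.println("----->"+total);
        long end = System.currentTimeMillis();
        System.out.println("END. "+(end-start)/1000.0);
    }
}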
More cores let a computer carry out more tasks at the same time. A computer with one processor core can carry out only a single task at a time: however fast it is, it must finish (or pause) that task before it can do something else.
CPU cores have to communicate with each other through shared channels, and this uses up some of the extra speed. So increasing the number of cores in a processor generally increases system performance, but by less than the core count alone would suggest.
Generally, yes. Ignore the coding part for a moment. Modern multi-core processors have a boost mode that raises the clock frequency a little when only a small number of cores are in use. As such, using all cores makes each individual core slower.
A faster CPU speed typically helps you to load applications faster, while having more cores allows you to have more programs running at the same time and to switch from one program to the other with more ease.
I added this as a comment, but I'm going to throw it in here as an answer too. Because your test is doing file I/O, you have probably hit a point with that 6th thread where you are now doing too much I/O and thus slowing everything down. If you really want to see the benefit of the 16 cores you have, you should rewrite your file-reading thread to use non-blocking I/O.
My hunch is that you may have put so much burden on the disk I/O that you slowed everything down! See the I/O performance in "Activity Monitor" (if you are on OSX). On Linux, use the vmstat command to get an idea of what is going on. If you see lots of swapping or a high rate of reads/s and writes/s, then there you go.
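One way to check this hunch is to time the same job at several pool sizes in a single run. The sketch below is only an illustration under stated assumptions: countLines is a hypothetical stand-in for the CountFileLineThread that the question does not show, and the class name PoolSizeSweep and the command-line directory argument are made up for the example. Keep in mind that the OS file cache can make later passes faster than the first, so treat the numbers as a rough guide.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PoolSizeSweep {

    // Hypothetical stand-in for CountFileLineThread: counts '\n' bytes in one file
    // (a last line without a trailing newline is not counted).
    static long countLines(Path file) {
        long count = 0;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int b;
            while ((b = in.read()) != -1) {
                if (b == '\n') {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Counts the lines of all files using a fixed pool with the given number of threads.
    static long countAll(List<Path> files, int threads) {
        ExecutorService es = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (Path file : files) {
                Callable<Long> task = () -> countLines(file);
                futures.add(es.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            es.shutdown();
        }
    }

    public static void main(String[] args) throws IOException {
        // The directory to scan is passed as the first argument (an assumption of this sketch).
        List<Path> files;
        try (Stream<Path> paths = Files.list(Paths.get(args[0]))) {
            files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        int maxThreads = Runtime.getRuntime().availableProcessors();
        for (int threads = 1; threads <= maxThreads; threads++) {
            long start = System.nanoTime();
            long total = countAll(files, threads);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " threads: " + total + " lines in " + elapsedMs + " ms");
        }
    }
}
```

If the elapsed time stops improving, or gets worse, while the disk statistics are saturated, the disk rather than the CPU is the likely limit.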
A few things I noticed:
CountFileLineThread is not in the code. Please post it so we can see exactly what's going on.
Next,
for (Future<Integer> future : futures)
{
    Integer result = future.get();
    total+=result;
    System.out.println("result :"+result);
}
Here, note that you are blocked on the result of the first task (future.get()). Meanwhile, the other results may already be available, but you can't see them until the first completes. Use CompletionService instead to get the results in the order they finish, for better measurement. It doesn't change the total time, though, since you want all threads to be done before stopping the timer.
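As a rough sketch of the CompletionService suggestion, the example below uses ExecutorCompletionService from java.util.concurrent. The class name CompletionOrderDemo, the helper collectAsCompleted, and the two toy tasks are made up for illustration; in the question's program the tasks would be the CountFileLineThread instances.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionOrderDemo {

    // Submits all tasks, then collects the results in the order the tasks finish.
    static List<Integer> collectAsCompleted(List<Callable<Integer>> tasks, int threads) {
        ExecutorService es = Executors.newFixedThreadPool(threads);
        try {
            CompletionService<Integer> cs = new ExecutorCompletionService<>(es);
            for (Callable<Integer> task : tasks) {
                cs.submit(task);
            }
            List<Integer> results = new ArrayList<>();
            for (int i = 0; i < tasks.size(); i++) {
                // take() blocks until any task has finished, not one specific task.
                results.add(cs.take().get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            es.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        // Slow task, submitted first.
        tasks.add(() -> { Thread.sleep(300); return 1; });
        // Fast task, submitted second; it will normally be reported first.
        tasks.add(() -> 2);
        System.out.println(collectAsCompleted(tasks, 2));
    }
}
```

With a plain list of futures, the fast result would sit unseen until the slow first task returned; here each result is handed over as soon as its task completes.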
Another point: blocking I/O is the key. It doesn't matter, per se, how many cores you have if the tasks are blocked on I/O, network, etc. Modern processors have what's called Hyper-Threading, and they can run a thread that is waiting to run if the currently executing thread blocks.
So for example, if I have 16 cores and I spawn 16 Threads asking them to read 1 GB files, I will not get any performance improvements just by having more cores. The bottleneck is the disk and memory.
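The effect can be imitated without touching a real disk. The following is a toy simulation, not a measurement: it assumes a "disk" that serves only one request at a time, modelled with a Semaphore and Thread.sleep, and the names SharedBottleneckDemo, simulatedRead and run are invented for the example. Real disks and file systems behave in more complicated ways.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

public class SharedBottleneckDemo {

    // Simulated disk: only one task may use it at a time (an assumption for illustration).
    private static final Semaphore DISK = new Semaphore(1);

    // Pretends to read from the shared disk for the given number of milliseconds.
    static void simulatedRead(long millis) throws InterruptedException {
        DISK.acquire();
        try {
            Thread.sleep(millis);
        } finally {
            DISK.release();
        }
    }

    // Runs `tasks` simulated reads on a pool of `threads` threads; returns elapsed milliseconds.
    static long run(int tasks, int threads, long millisPerTask) {
        ExecutorService es = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<Void>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Callable<Void> task = () -> {
                    simulatedRead(millisPerTask);
                    return null;
                };
                futures.add(es.submit(task));
            }
            for (Future<Void> future : futures) {
                future.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            es.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Both runs take roughly tasks * millisPerTask, because the disk is the limit.
        System.out.println("1 thread:  " + run(8, 1, 50) + " ms");
        System.out.println("8 threads: " + run(8, 8, 50) + " ms");
    }
}
```

Because every task has to pass through the same single resource, adding threads only adds waiting; the total time stays close to the sum of the individual reads.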
Adding processors causes all sorts of problems, but mostly they have to do with synchronization between processors. Task-level locking within the file system, etc., can become a problem, but even more of a problem is the synchronization between cores that must occur just to maintain cache coherence, keep track of changed pages, etc. I don't know how many cores per chip you have (I gave up tracking that stuff about 10 years ago), but generally, once you begin synchronizing off-chip, performance goes down the tubes.
I'll add that the JVM can make a major difference here. Careful JVM design is required to minimize the number of shared (and frequently updated) cache lines, and incredible effort is required to make GC work efficiently in a multi-core environment.