I would like to know the optimal number of threads I can run. Normally, this equals Runtime.getRuntime().availableProcessors().
However, the returned number is twice as high on a CPU supporting hyper-threading. Now, for some tasks hyper-threading is good, but for others it does nothing. In my case, I suspect it does nothing, so I wish to know whether I have to divide the number returned by Runtime.getRuntime().availableProcessors() by two.
For that I have to deduce whether the CPU has hyper-threading enabled. Hence my question: how can I do it in Java?
Thanks.
EDIT
OK, I have benchmarked my code. Here is my environment:
Long, which is then stored in a shared hash set. Thus the worker threads do not read anything from the HD, but they do occupy themselves with unzipping and parsing the contents (using the opencsv library).
Below is the code, w/o the boring details:
    public void work(File dir) throws IOException, InterruptedException {
        Set<Long> allCoordinates = Collections.newSetFromMap(new ConcurrentHashMap<Long, Boolean>());
        int n = 6;
        // NO WAITING QUEUE !
        ThreadPoolExecutor exec = new ThreadPoolExecutor(n, n, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
        StopWatch sw1 = new StopWatch();
        StopWatch sw2 = new StopWatch();
        sw1.start();
        sw2.start();
        sw2.suspend();
        for (WorkItem wi : m_workItems) {
            for (File file : dir.listFiles(wi.fileNameFilter)) {
                MyTask task;
                try {
                    sw2.resume();
                    // The only reading from the HD occurs here:
                    task = new MyTask(file, m_coordinateCollector, allCoordinates, wi.headerClass, wi.rowClass);
                    sw2.suspend();
                } catch (IOException exc) {
                    System.err.println(String.format("Failed to read %s - %s", file.getName(), exc.getMessage()));
                    continue;
                }
                boolean retry = true;
                while (retry) {
                    int count = exec.getActiveCount();
                    try {
                        // Fails if the maximum of the worker threads was created and all are busy.
                        // This prevents us from loading all the files in memory and getting the OOM exception.
                        exec.submit(task);
                        retry = false;
                    } catch (RejectedExecutionException exc) {
                        // Wait for any worker thread to finish
                        while (exec.getActiveCount() == count) {
                            Thread.sleep(100);
                        }
                    }
                }
            }
        }
        exec.shutdown();
        exec.awaitTermination(1, TimeUnit.HOURS);
        sw1.stop();
        sw2.stop();
        System.out.println(String.format("Max concurrent threads = %d", n));
        System.out.println(String.format("Total file count = %d", m_stats.getFileCount()));
        System.out.println(String.format("Total lines = %d", m_stats.getTotalLineCount()));
        System.out.println(String.format("Total good lines = %d", m_stats.getGoodLineCount()));
        System.out.println(String.format("Total coordinates = %d", allCoordinates.size()));
        System.out.println(String.format("Overall elapsed time = %d sec, excluding I/O = %d sec",
                sw1.getTime() / 1000, (sw1.getTime() - sw2.getTime()) / 1000));
    }

    public class MyTask<H extends CsvFileHeader, R extends CsvFileRow<H>> implements Runnable {
        private final byte[] m_buffer;
        private final String m_name;
        private final CoordinateCollector m_coordinateCollector;
        private final Set<Long> m_allCoordinates;
        private final Class<H> m_headerClass;
        private final Class<R> m_rowClass;

        public MyTask(File file, CoordinateCollector coordinateCollector, Set<Long> allCoordinates,
                      Class<H> headerClass, Class<R> rowClass) throws IOException {
            m_coordinateCollector = coordinateCollector;
            m_allCoordinates = allCoordinates;
            m_headerClass = headerClass;
            m_rowClass = rowClass;
            m_name = file.getName();
            m_buffer = Files.toByteArray(file);
        }

        @Override
        public void run() {
            try {
                m_coordinateCollector.collect(m_name, m_buffer, m_allCoordinates, m_headerClass, m_rowClass);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
Please find the results below (I have slightly changed the output to omit the repeating parts):
    Max concurrent threads = 4
    Total file count = 84
    Total lines = 56395333
    Total good lines = 35119231
    Total coordinates = 987045
    Overall elapsed time = 274 sec, excluding I/O = 266 sec

    Max concurrent threads = 6
    Overall elapsed time = 218 sec, excluding I/O = 209 sec

    Max concurrent threads = 7
    Overall elapsed time = 209 sec, excluding I/O = 199 sec

    Max concurrent threads = 8
    Overall elapsed time = 201 sec, excluding I/O = 192 sec

    Max concurrent threads = 9
    Overall elapsed time = 198 sec, excluding I/O = 186 sec
You are free to draw your own conclusions, but mine is that hyperthreading does improve the performance in my concrete case. Also, having 6 worker threads seems to be the right choice for this task and my machine.
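If you want to repeat this kind of measurement without the file-processing details, a sweep over pool sizes is enough. The following is only a minimal sketch, not the code that produced the numbers above: the class ThreadCountSweep, the burn method and the task counts are hypothetical stand-ins for the real unzip-and-parse work, and a purely CPU-bound loop may scale differently from real parsing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadCountSweep {

    // Hypothetical CPU-bound stand-in for the real work (assumption, not the original task).
    public static long burn(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += (acc * 31 + i) % 7;
        }
        return acc;
    }

    // Runs `tasks` identical jobs on a fixed pool of `threads` workers; returns elapsed milliseconds.
    public static long timeRun(int threads, int tasks, final int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> jobs = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                jobs.add(() -> burn(iterations));
            }
            long start = System.nanoTime();
            for (Future<Long> f : pool.invokeAll(jobs)) {
                f.get(); // wait for completion and surface any failure
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Measures every pool size from 1 to maxThreads, keeping insertion order for printing.
    public static Map<Integer, Long> sweep(int maxThreads, int tasks, int iterations) {
        Map<Integer, Long> result = new LinkedHashMap<>();
        for (int n = 1; n <= maxThreads; n++) {
            result.put(n, timeRun(n, tasks, iterations));
        }
        return result;
    }

    public static void main(String[] args) {
        int logical = Runtime.getRuntime().availableProcessors();
        // Go one past the logical count to see whether oversubscription still helps.
        sweep(logical + 1, 64, 5_000_000).forEach((n, ms) ->
                System.out.println(String.format("Max concurrent threads = %d, elapsed = %d ms", n, ms)));
    }
}
```

Running each pool size several times and discarding the first run (JIT warm-up) should give steadier numbers than a single pass.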
On Windows, click the "Performance" tab in the Task Manager. This shows current CPU and memory usage. The Task Manager displays a separate graph for each logical processor on your system. If your CPU supports Hyper-Threading, you should see twice as many graphs as you have processor cores.
Java uses native threads, so it benefits from Hyperthreading whenever it is enabled; whether it is enabled is a matter of OS/CPU configuration. However, Hyperthreading does not give you extra cores; it lets threads time-share the four CPUs that you have. If you have maxed out the four CPUs with four threads, that can happen with Hyperthreading turned on or off.
Hyper-threading is a process by which a CPU divides up its physical cores into virtual cores that are treated as if they are actually physical cores by the operating system. These virtual cores are also called threads [1]. Most of Intel's CPUs with 2 cores use this process to create 4 threads or 4 virtual cores.
It's enabled by default, but it can be switched on and off from the BIOS environment by setting “Hyper-Threading Technology” to “Enable” or “Disable”. Note that Intel® Hyper-Threading Technology is only available on some enthusiast CPUs.
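To relate the "2 cores, 4 threads" arithmetic to the JVM: Runtime.getRuntime().availableProcessors() reports logical processors, so the physical core count can only be estimated from it. The sketch below assumes two hardware threads per core, which is a guess the JVM cannot confirm; the class and method names are made up for illustration.

```java
public class WorkerCount {

    // Estimates physical cores from the logical count, assuming a fixed number of
    // hardware threads per core (2 is typical for Hyper-Threading, but the JVM does
    // not report it, so this is an assumption supplied by the caller).
    public static int estimatePhysicalCores(int logicalProcessors, int threadsPerCore) {
        if (logicalProcessors < 1 || threadsPerCore < 1) {
            throw new IllegalArgumentException("counts must be positive");
        }
        return Math.max(1, logicalProcessors / threadsPerCore);
    }

    public static void main(String[] args) {
        int logical = Runtime.getRuntime().availableProcessors();
        System.out.println("Logical processors: " + logical);
        System.out.println("Estimated physical cores (assuming HT): "
                + estimatePhysicalCores(logical, 2));
    }
}
```

If Hyper-Threading is disabled or unsupported, the correct factor is 1 and the estimate equals the logical count.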
Unfortunately, this is not possible from Java directly. If you know that the app will run on a modern Linux variant, you can read the file /proc/cpuinfo and infer whether HT is enabled.
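To infer HT, compare the "siblings" and "cpu cores" fields of that file: HT is enabled when siblings is larger. The number of physical CPU packages (sockets) can be read with this command: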
grep -i "physical id" /proc/cpuinfo | sort -u | wc -l
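To do the check from Java rather than the shell, one option is to parse /proc/cpuinfo yourself. The sketch below assumes an x86 Linux system where the file exposes the "siblings" (logical CPUs per package) and "cpu cores" (physical cores per package) fields; on other architectures or inside some virtual machines these fields may be missing or misleading, in which case the method simply returns false. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class HyperThreadingCheck {

    // Returns the integer value of the first "key : value" line whose key matches, or -1 if absent.
    public static int firstIntField(List<String> lines, String key) {
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            if (line.substring(0, colon).trim().equals(key)) {
                try {
                    return Integer.parseInt(line.substring(colon + 1).trim());
                } catch (NumberFormatException e) {
                    return -1;
                }
            }
        }
        return -1;
    }

    // True when a package exposes more logical CPUs ("siblings") than physical cores ("cpu cores").
    public static boolean looksHyperThreaded(List<String> cpuinfoLines) {
        int siblings = firstIntField(cpuinfoLines, "siblings");
        int cores = firstIntField(cpuinfoLines, "cpu cores");
        return siblings > 0 && cores > 0 && siblings > cores;
    }

    public static void main(String[] args) throws IOException {
        // Linux only: this path does not exist on Windows or macOS.
        List<String> lines = Files.readAllLines(Paths.get("/proc/cpuinfo"));
        System.out.println("Hyper-threading appears enabled: " + looksHyperThreaded(lines));
    }
}
```

Keeping the parsing separate from the file access makes it easy to try the logic on sample text copied from another machine.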