
Will 8 logical threads on 4 cores run at most 4 times faster in parallel?

I'm benchmarking software that, using all 8 of my 'logical' threads on an Intel i7-2670QM, executes 4x faster than my serial version. I would like some community feedback on my interpretation of the benchmarking results.

When I use 4 threads on 4 cores I get a speedup of 4x; the entire algorithm is executed in parallel. This seems logical to me, since Amdahl's law predicts it. Windows Task Manager tells me I'm using 50% of the CPU.
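For reference, a minimal sketch of the Amdahl's law arithmetic behind that expectation. The function and parameter names are illustrative and not part of the benchmarked program; the formula itself is the standard one, speedup = 1 / ((1 - p) + p / n), where p is the parallel fraction and n the number of execution units.

```cpp
#include <cassert>

// Amdahl's law: predicted speedup when a fraction `parallel_fraction` of the
// serial run time is spread evenly over `workers` execution units.
// Both names are illustrative assumptions, not taken from the question's code.
double amdahl_speedup(double parallel_fraction, int workers) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(workers >= 1);
    const double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / workers);
}
```

With a fully parallel algorithm, `amdahl_speedup(1.0, 4)` gives 4, which matches the 4-thread measurement. The formula only counts independent execution units, so it says nothing by itself about whether 8 logical threads behave like 8 such units.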

However, if I execute the same software on all 8 threads, I once again get a speedup of 4x, not 8x.

If I have understood this correctly: my CPU has 4 cores, each with a frequency of 2.2 GHz, but the frequency is divided down to 1.1 GHz when applied to 8 'logical' threads, and the same goes for the other components such as the cache memory? If this is true, then why does the task manager claim only 50% of my CPU is being used?

#define NumberOfFiles 8
...
char startLetter ='a';
#pragma omp parallel for shared(startLetter)
for(int f=0; f<NumberOfFiles; f++){
    ...
}

I am not including the time spent on disk I/O. I am only interested in the time an STL call (STL sort) takes.
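One way to time only the sort, sketched under the assumption that the data is already in memory as a `std::vector<int>` (the element type and the helper name `time_sort_ms` are hypothetical; the question does not show what is being sorted):

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Returns the wall-clock time, in milliseconds, spent inside std::sort only.
// `data` is assumed to be fully loaded beforehand, so no disk I/O is timed.
double time_sort_ms(std::vector<int>& data) {
    const auto start = std::chrono::steady_clock::now();
    std::sort(data.begin(), data.end());
    const auto stop = std::chrono::steady_clock::now();

    // Sanity check, deliberately placed outside the timed region.
    assert(std::is_sorted(data.begin(), data.end()));

    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

`steady_clock` is used because it is monotonic, so the measurement is not disturbed by system clock adjustments.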

Cisum Inas asked May 01 '12 19:05



2 Answers

An i7-2670QM processor has 4 cores, but it can run 8 threads in parallel. This means that it has only 4 processing units (cores), with hardware support for keeping 8 threads resident at once. The two threads on a core share that core's execution resources, so if one of them stalls, for example on a memory access, the other can use the otherwise idle core with very little penalty. Read more on hyperthreading. In reality there are few scenarios where hyperthreading gives a large performance gain. More modern processors handle hyperthreading better than older processors.

Your benchmark showed that your code is CPU bound, i.e. there were few stalls in the pipeline that would have given hyperthreading an advantage. The 50% CPU reading for the 4-thread run is correct, as 4 of the 8 logical processors are working and the other 4 are not doing anything. Turn off hyperthreading in the BIOS and you will see 100% CPU.
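The 50% figure can be reproduced with simple arithmetic, assuming (this is an assumption about how the monitor aggregates) that the reported value is the busy share averaged over all logical processors. The function name is illustrative:

```cpp
#include <cassert>

// Percentage a monitor would report if it averages load over every logical
// processor. Assumes each busy logical processor is 100% busy and the rest
// are idle, which is a simplification.
double reported_cpu_percent(int busy_logical, int total_logical) {
    assert(total_logical > 0);
    assert(busy_logical >= 0 && busy_logical <= total_logical);
    return 100.0 * busy_logical / total_logical;
}
```

With hyperthreading on, 4 busy threads out of 8 logical processors give `reported_cpu_percent(4, 8)`, i.e. 50. With hyperthreading off, the same 4 threads give `reported_cpu_percent(4, 4)`, i.e. 100.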

Nys answered Sep 19 '22 00:09


This is a quick summary of hyperthreading:

Switching threads in software is slow: the CPU has to stop execution, copy a bunch of values into memory, copy a bunch of values out of memory into the CPU, and then start things going again with the new thread.

This is where your 4 virtual cores come in. You have 4 physical cores and that is it, but hyperthreading allows the CPU to keep 2 threads on a single core.

The 2 threads share the core's execution units, so they cannot both run at full speed. However, when 1 thread has to wait for a memory access or anything else that is going to take some time, the core can run the other thread in the meantime. Older processors without hyperthreading basically sat idle during that time.

So your quad core has 4 cores, each of which can do one thread's worth of work at full speed, but each can have a 2nd job ready to use the core as soon as the first needs to wait on another part of the computer.
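The idea can be put into a toy model. This is a deliberate simplification and not a description of how the hardware actually schedules work: assume each thread keeps the core busy for a fixed fraction of the time and is stalled otherwise, and that a second thread can perfectly fill those stalls up to 100% utilisation. The name `smt_gain` is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Toy model of the throughput gain from putting 2 threads on 1 core.
// busy_fraction: share of time a single thread keeps the core busy
//                (1.0 = never stalls, i.e. purely CPU bound).
// Returns the throughput with 2 threads relative to 1 thread on that core.
double smt_gain(double busy_fraction) {
    assert(busy_fraction > 0.0 && busy_fraction <= 1.0);
    const double one_thread = busy_fraction;
    // Two threads can overlap their stalls, but the core caps out at 100%.
    const double two_threads = std::min(1.0, 2.0 * busy_fraction);
    return two_threads / one_thread;
}
```

In this model a thread that never stalls gives `smt_gain(1.0) == 1.0`, i.e. no benefit from the second logical thread, while a thread that is stalled half the time gives `smt_gain(0.5) == 2.0`. Real workloads fall somewhere in between.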

If your task has a lot of memory usage and a lot of CPU usage, you should see a slight decrease in total execution time with 8 threads, but if you are almost entirely CPU bound you will be better off sticking with just 4 threads.
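That rule of thumb could be written down roughly as follows. This is only a sketch: the function and parameter names are hypothetical, and whether a workload counts as CPU bound is something you would have to determine by measuring.

```cpp
#include <cassert>

// Picks a worker-thread count following the rule of thumb above.
// logical_processors: what the OS reports (8 on the i7-2670QM)
// threads_per_core:   hardware threads per physical core (2 with hyperthreading)
// cpu_bound:          true if the work rarely stalls on memory
int choose_thread_count(int logical_processors, int threads_per_core, bool cpu_bound) {
    assert(threads_per_core >= 1);
    assert(logical_processors >= threads_per_core);
    const int physical_cores = logical_processors / threads_per_core;
    // CPU-bound work gains little from the extra logical threads.
    return cpu_bound ? physical_cores : logical_processors;
}
```

For the processor in the question, `choose_thread_count(8, 2, true)` returns 4. With OpenMP, the result could then be passed to `omp_set_num_threads` or a `num_threads` clause on the parallel loop.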

Andrew Brock answered Sep 18 '22 00:09