In-depth analysis of the difference between the CPU and GPU [closed]

I've been searching for the major differences between a CPU and a GPU, more precisely the fine line that separates the CPU and the GPU. For example, why not use multiple CPUs instead of a GPU, and vice versa? Why is the GPU "faster" at crunching calculations than the CPU? What are some kinds of things that one of them can do that the other can't do, or can't do efficiently, and why? Please don't reply with answers like "central processing unit" and "graphics processing unit"; I'm looking for an in-depth technical answer.

asked Oct 07 '11 by Mike G

2 Answers

GPUs are basically massively parallel computers. They work well on problems that can use large-scale data decomposition, and they offer orders-of-magnitude speedups on those problems.

However, the individual processing units in a GPU cannot match a CPU for general-purpose performance. They are much simpler and lack optimizations like long pipelines, out-of-order execution, and aggressive instruction-level parallelism.

They also have other drawbacks. Firstly, your users have to have one, which you cannot rely on unless you control the hardware. There are also overheads in transferring data from main memory to GPU memory and back.

So it depends on your requirements: in some cases GPUs or dedicated processing units like Tesla are the clear winners, but in other cases, your work cannot be decomposed to make full use of a GPU and the overheads then make CPUs the better choice.
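
To make both points concrete, here is a minimal CUDA sketch (assuming a CUDA-capable device and the CUDA runtime; the kernel and sizes are illustrative, not taken from the answer above). One thread per element is the data decomposition GPUs reward, and the cudaMemcpy calls are exactly the main-memory-to-GPU-memory overhead just mentioned:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// One thread per element: the large-scale data decomposition GPUs reward.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                      // 1M elements
    const size_t bytes = n * sizeof(float);

    float* hA = (float*)malloc(bytes);
    float* hB = (float*)malloc(bytes);
    float* hC = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; }

    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes); cudaMalloc(&dB, bytes); cudaMalloc(&dC, bytes);

    // The overhead: data must cross to GPU memory before the GPU can
    // touch it, and the result must cross back afterwards.
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(dA, dB, dC, n);

    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", hC[0]);               // expect 3.000000

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}
```

For a computation this cheap, the two transfers can easily cost more than the kernel itself, which is exactly the "overheads make CPUs the better choice" case.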

answered Jan 12 '23 by Nick Butler


First, watch this demonstration:

http://www.nvidia.com/object/nvision08_gpu_v_cpu.html

That was fun!

So what's important here is that the CPU can be directed to perform basically any calculation on command. For calculations that are unrelated to each other, or where each computation depends strongly on its neighbors (rather than being merely the same operation), you usually need a full CPU. As an example, consider compiling a large C/C++ project. The compiler has to read each token of each source file in sequence before it can understand the meaning of the next. Even though there are lots of source files to process, they all have different structure, so the same calculations don't apply across the source files.

You could speed that up by having several independent CPUs, each working on separate files, as sketched below. Improving the speed by a factor of X means you need X CPUs, which will cost X times as much as one CPU.
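
As a rough illustration of that task-level parallelism, here is a host-side sketch in the same C++/CUDA source language, one thread per file (compileFile is a hypothetical stand-in for real compiler work, not an actual API):

```cuda
// Host-side only: plain C++ that also compiles unchanged as the host
// half of a CUDA program.
#include <cstdio>
#include <string>
#include <thread>
#include <vector>

void compileFile(const std::string& path) {
    // Stand-in: each file is processed sequentially, token by token,
    // but the files are independent of one another.
    std::printf("compiling %s\n", path.c_str());
}

int main() {
    std::vector<std::string> files = {"a.cpp", "b.cpp", "c.cpp", "d.cpp"};
    std::vector<std::thread> workers;
    for (const auto& f : files)
        workers.emplace_back(compileFile, f);   // one full core per file
    for (auto& w : workers) w.join();
    return 0;
}
```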


Some kinds of tasks involve doing exactly the same calculation on every item in a dataset. Some physics simulations look like this: in each step, each 'element' in the simulation moves a little bit, according to the 'sum' of the forces applied to it by its immediate neighbors.

Since you're doing the same calculation on a big set of data, you can duplicate some of the parts of a CPU but share others (in the linked demonstration, the air system, valves, and aiming are shared; only the barrels are duplicated for each paintball). Doing X calculations requires less than X times the cost in hardware.
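
Here is a sketch of that pattern as a CUDA kernel, assuming a toy 1-D spring model where each element reads only its two immediate neighbors (it would be launched with host code like the vector-add example in the first answer):

```cuda
// One simulation step: every element applies the same update rule,
// reading only its immediate neighbours. Because all threads execute
// the same instruction stream, the hardware can share the control
// logic (the "air system, valves and aiming") and duplicate only the
// arithmetic units (the "barrels").
__global__ void step(const float* cur, float* next, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1) {
        // Toy spring model: force is the sum of pulls from both neighbours.
        float force = (cur[i - 1] - cur[i]) + (cur[i + 1] - cur[i]);
        next[i] = cur[i] + 0.1f * force;        // move a little bit
    }
}
```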

The obvious disadvantage is that the shared hardware means you can't tell one subset of the parallel processors to do one thing while another subset does something unrelated; the extra parallel capacity would go to waste while the GPU performs one task and then another, different task.
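
Inside a GPU this shows up concretely as branch divergence: threads in the same warp share one instruction stream, so if they take different branches the hardware runs both paths serially, masking off the inactive threads each time. A hypothetical sketch (expensiveA and expensiveB are stand-ins for real per-element work, not library functions):

```cuda
// Hypothetical stand-ins for two unrelated pieces of per-element work.
__device__ float expensiveA(float x) { return x * x; }
__device__ float expensiveB(float x) { return -x * x * x; }

__global__ void divergent(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    // Threads in a warp share one instruction stream. If they disagree
    // on this branch, the warp runs BOTH paths one after the other,
    // with the threads on the "wrong" side masked off each time;
    // that is the wasted parallel capacity.
    if (data[i] > 0.0f)
        data[i] = expensiveA(data[i]);
    else
        data[i] = expensiveB(data[i]);
}
```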

answered Jan 12 '23 by SingleNegationElimination