I'm using the phrases "parallel processing" and "multithreading" interchangeably because I feel there is no difference between them. If I'm wrong, please correct me.
I'm not a pro at parallel processing/multithreading. I'm familiar with and have used .NET threads and POSIX threads, but nothing more than that.
I was just browsing through the SO archives on multithreading and was surprised to see how many multithreading libraries there are.
http://en.wikipedia.org/wiki/Template:Parallel_computing lists the APIs of the well-known multithreading libraries (I'm not sure whether any others exist):
- POSIX Threads
- OpenMP
- PVM
- MPI
- UPC
- Intel Threading Building Blocks
- Boost.Thread
- Global Arrays
- Charm++
- Cilk
- Co-array Fortran
- CUDA
Also, I'm surprised to see that a page like http://en.wikipedia.org/wiki/Comparison_of_Parallel_Computing_Libraries_(API) is missing.
Till now, I've never been in a situation where I needed to choose between these libraries. But if I run into one such situation, how do I choose?
[1] The right choice of parallel library depends on the type of the target parallel machine: (1) a shared memory machine (e.g., multicores) or (2) a distributed memory machine (e.g., Cell, grid computing, CUDA). You also need to consider what kind of parallel programming model you want: (1) general-purpose multithreaded applications, (2) loop-level parallelism, (3) advanced parallelism such as pipelines, or (4) data-level parallelism.
First, the shared memory model is just multithreaded programming: the address space is shared across all computation cores (e.g., chip multiprocessors and symmetric multiprocessors), so there is no need to exchange data explicitly between threads or processes. OpenMP, Cilk, and TBB all target this domain.
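To make this concrete, here is a minimal sketch of shared-memory, loop-level parallelism with OpenMP, assuming a compiler with OpenMP support (e.g., `gcc -fopenmp`); the vector-addition example itself is just illustrative:

```c
#include <stdio.h>
#include <omp.h>

#define N 1000000

static double a[N], b[N], c[N];

int main(void) {
    // Initialize input arrays (sequentially).
    for (int i = 0; i < N; i++) {
        a[i] = i * 0.5;
        b[i] = i * 2.0;
    }

    printf("running with up to %d threads\n", omp_get_max_threads());

    // One pragma splits the loop iterations across the available cores.
    // a, b, and c live in the single shared address space, so no
    // explicit data exchange between threads is needed.
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        c[i] = a[i] + b[i];
    }

    printf("c[N-1] = %f\n", c[N - 1]);
    return 0;
}
```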
The distributed memory model used to be the main parallel programming model for supercomputers, where separate machines (i.e., address spaces are not shared) are connected via a tight network. MPI is the most famous programming model for it. This model still exists, though, especially in CUDA and Cell-based programming, where the memory address space is not shared. For example, CUDA separates CPU memory from GPU memory, and you explicitly need to copy data between the two.
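Here is a minimal MPI sketch of that explicit data exchange, assuming an MPI installation (compile with `mpicc`, run with `mpirun`); the value being sent is just a placeholder:

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int value = 0;
    if (rank == 0) {
        // Rank 0 owns the data. Other ranks cannot see its memory,
        // so the value must be sent over the network explicitly.
        value = 42;
        for (int dest = 1; dest < size; dest++)
            MPI_Send(&value, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);
    } else {
        // Every other rank receives its own copy into its own
        // private address space.
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }
    printf("rank %d has value %d\n", rank, value);

    MPI_Finalize();
    return 0;
}
```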
Next, you need to consider the parallel programming model. POSIX threads are for general-purpose multithreaded programming (e.g., highly multithreaded web servers). OpenMP, by contrast, is much more specialized for loop-level parallelism than the general POSIX/Win32 thread APIs; it simplifies thread fork and join. Intel TBB supports various kinds of task-level parallelism, including loops and pipelines. There is another kind of parallelism you could exploit: data-level parallelism. For that, a GPGPU would be better than a CPU, as GPGPUs are specialized for data-parallel workloads. There is also a programming model called stream processing.
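For contrast with the OpenMP sketch above, here is a minimal general-purpose POSIX threads sketch: the programmer creates and joins threads by hand instead of relying on structured fork/join (the worker function and thread count are illustrative):

```c
#include <stdio.h>
#include <pthread.h>

#define NUM_THREADS 4

// Each thread runs an independent, general-purpose task; nothing
// forces the tasks to be iterations of the same loop.
static void *worker(void *arg) {
    long id = (long)arg;
    printf("worker %ld running\n", id);
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];

    // Manual fork: create each thread explicitly.
    for (long i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);

    // Manual join: wait for each thread to finish.
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    return 0;
}
```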
[2] I already answered this above.
[3] Simple: there are many different parallel/concurrent programming models and many different parallel machines. So it isn't a single problem; parallel/concurrent programming involves so many subproblems that, as of now, no single programming model can solve them all.
[4] It depends. Seriously.