openacc vs openmp & mpi differences ?

Tags:

I was wondering what are the major differences between openacc and openmp. What about MPI, cuda and opencl ? I understand the differences between openmp and mpi, especially the part about shared and distributed memory Do any of them allow for a hybrid gpu-cpu processing setup ?

430

asked Oct 21 '13 12:10

Sid5427

1 Answers

OpenMP and OpenACC enable directive-based parallel programming.

OpenMP enables parallel programming on shared-memory computing platforms, as for example multi-core CPUs. It is very easy to use, since it is sufficient to tell the compiler some directives (code annotations, or pragmas) on how to extract the parallelism which triggers the synthesis of a parallel version of the input source code.

An example of OpenMP "Hello World" program with pragmas is the following

Click to copy

#include <omp.h> #include <stdio.h> #include <stdlib.h>  int main (int argc, char *argv[])  {   int nthreads, tid;    /* Fork a team of threads giving them their own copies of variables */   #pragma omp parallel private(nthreads, tid)    {      /* Obtain thread number */      tid = omp_get_thread_num();      printf("Hello World from thread = %d\n", tid);       /* Only master thread does this */      if (tid == 0)       {         nthreads = omp_get_num_threads();         printf("Number of threads = %d\n", nthreads);      }    }  /* All threads join master thread and disband */  }

The source of the above code is OpenMP Exercise from where you will find many other examples. In this "Hello World" example, the master thread will output the number of involved threads, while each thread will print Hello World from thread = xxx.

OpenACC is a collection of compiler directives to specify parts of a C/C++ or Fortran code to be accelerated by an attached accelerator, as a GPU. It follows pretty much the same philosophy of OpenMP and enables creating high-level host+accelerator programs, again without the need of managing the accelerator programming language. For example, OpenACC will let you simply accelerate existing C/C++ codes without needing to learn CUDA (with some performance penalty, of course).

A typical OpenACC code will resemble the following

Click to copy

#pragma acc kernels loop gang(32), vector(16) for (int j=1; j<n-1; j++) { #pragma acc loop gang(16), vector(32)     for (int i=1; i<m-1; i++)     {        Anew[j][i] = 0.25f * (A[j][i+1] + A[j-1][i]);        ...     } }

The above source code is taken from the blog An OpenACC Example (Part 1), where you could find some more useful material to understand the difference between OpenMP and OpenACC.

Other sources are the following

How does the OpenACC API relate to the OpenMP API?.

OpenACC and OpenMP directives

Shane Cook, CUDA Programming, Morgan Kaufmann (Chapter 10)

Due to its very nature, OpenACC enables hybrid CPU+GPU programming. You can also mix OpenMP and OpenACC directives. For example, in a 4-GPU system, you can create 4 CPU threads to offload computing work to the 4 available GPUs. This is described in the Shane Cook book. However, it should be mentioned that OpenMP 4.0 foresees also directives for offloading work to attached accelerators, see

OpenMP Technical Report 1 on Directives for Attached Accelerators

answered Sep 24 '22 00:09

Vitality

Related questions
                            
                                How to calculate Gflops of a kernel
                            
                                Why is "a =(b>0)?1:0" better than "if-else" version in CUDA?
                            
                                Gustafson's law vs Amdahl's law
                            
                                NVIDIA CUDA Video Encoder (NVCUVENC) input from device texture array
                            
                                Nvcc missing when installing cudatoolkit?
                            
                                CUDA vs OpenCL performance comparison
                            
                                CUDA: Tiled matrix-matrix multiplication with shared memory and matrix size which is non-multiple of the block size
                            
                                From thrust::device_vector to raw pointer and back?
                            
                                NVidia CUDA toolkit 7.5.27 failing to install on OS X
                            
                                Difference with CUDA Hardware Quadro 4000 Vs. GeForce 480
                            
                                Have you successfully used a GPGPU? [closed]
                            
                                help me understand cuda
                            
                                How is CUDA memory managed?
                            
                                Is there a maximum number of streams in CUDA?
                            
                                How to find cuda version in ubuntu?
                            
                                Explanation of CUDA C and C++
                            
                                Using CUDA with Visual Studio 2017
                            
                                Why can't libcudart.so.4 be found when compiling the CUDA samples under Ubuntu?
                            
                                CUDA / OpenCL within a Virtual Machine / Hypervisor [closed]
                            
                                CUDA for .net?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

openacc vs openmp & mpi differences ?

Tags:

cuda

mpi

openmp

opencl

openacc

Sid5427

People also ask

1 Answers

Vitality

Recent Activity

Donate For Us