
Difference between "CPU OpenCL Project" and "GPU OpenCL Project"

Tags: c++, opencl

I installed the Intel OpenCL SDK and wanted to create a project. Visual Studio 2017 offers these two templates plus a third, "Empty OpenCL Project". I don't know what the difference between the first two is. I looked through the template code, but since I don't (yet) know anything about OpenCL, I couldn't work out how they differ.

License header:

/*****************************************************************************
 * Copyright (c) 2013-2016 Intel Corporation
 * All rights reserved.
 *
 * WARRANTY DISCLAIMER
 *
 * THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
 * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
 * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
 * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
 * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
 * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE
 * MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 * Intel Corporation is the author of the Materials, and requests that all
 * problem reports or change requests be submitted to it directly
 *****************************************************************************/

I ran a diff as suggested:

625,629c625,626
<     // Create new OpenCL buffer objects
<     // As these buffer are used only for read by the kernel, you are recommended to create it with flag CL_MEM_READ_ONLY.
<     // Always set minimal read/write flags for buffers, it may lead to better performance because it allows runtime
<     // to better organize data copying.
<     // You use CL_MEM_COPY_HOST_PTR here, because the buffers should be populated with bytes at inputA and inputB.
---
>     cl_image_format format;
>     cl_image_desc desc;
631c628,650
<     ocl->srcA = clCreateBuffer(ocl->context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, sizeof(cl_uint) * arrayWidth * arrayHeight, inputA, &err);
---
>     // Define the image data-type and order -
>     // one channel (R) with unit values
>     format.image_channel_data_type = CL_UNSIGNED_INT32;
>     format.image_channel_order     = CL_R;
> 
>     // Define the image properties (descriptor)
>     desc.image_type        = CL_MEM_OBJECT_IMAGE2D;
>     desc.image_width       = arrayWidth;
>     desc.image_height      = arrayHeight;
>     desc.image_depth       = 0;
>     desc.image_array_size  = 1;
>     desc.image_row_pitch   = 0;
>     desc.image_slice_pitch = 0;
>     desc.num_mip_levels    = 0;
>     desc.num_samples       = 0;
> #ifdef CL_VERSION_2_0
>     desc.mem_object        = NULL;
> #else
>     desc.buffer            = NULL;
> #endif
> 
>     // Create first image based on host memory inputA
>     ocl->srcA = clCreateImage(ocl->context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, &format, &desc, inputA, &err);
634c653
<         LogError("Error: clCreateBuffer for srcA returned %s\n", TranslateOpenCLError(err));
---
>         LogError("Error: clCreateImage for srcA returned %s\n", TranslateOpenCLError(err));
638c657,658
<     ocl->srcB = clCreateBuffer(ocl->context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, sizeof(cl_uint) * arrayWidth * arrayHeight, inputB, &err);
---
>     // Create second image based on host memory inputB
>     ocl->srcB = clCreateImage(ocl->context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, &format, &desc, inputB, &err);
641c661
<         LogError("Error: clCreateBuffer for srcB returned %s\n", TranslateOpenCLError(err));
---
>         LogError("Error: clCreateImage for srcB returned %s\n", TranslateOpenCLError(err));
645,649c665,666
<     // If the output buffer is created directly on top of output buffer using CL_MEM_USE_HOST_PTR,
<     // then, depending on the OpenCL runtime implementation and hardware capabilities, 
<     // it may save you not necessary data copying.
<     // As it is known that output buffer will be write only, you explicitly declare it using CL_MEM_WRITE_ONLY.
<     ocl->dstMem = clCreateBuffer(ocl->context, CL_MEM_WRITE_ONLY | CL_MEM_USE_HOST_PTR, sizeof(cl_uint) * arrayWidth * arrayHeight, outputC, &err);
---
>     // Create third (output) image based on host memory outputC
>     ocl->dstMem = clCreateImage(ocl->context, CL_MEM_WRITE_ONLY | CL_MEM_USE_HOST_PTR, &format, &desc, outputC, &err);
652c669
<         LogError("Error: clCreateBuffer for dstMem returned %s\n", TranslateOpenCLError(err));
---
>         LogError("Error: clCreateImage for dstMem returned %s\n", TranslateOpenCLError(err));
734c751,755
<     cl_int *resultPtr = (cl_int *)clEnqueueMapBuffer(ocl->commandQueue, ocl->dstMem, true, CL_MAP_READ, 0, sizeof(cl_uint) * width * height, 0, NULL, NULL, &err);
---
>     size_t origin[] = {0, 0, 0};
>     size_t region[] = {width, height, 1};
>     size_t image_row_pitch;
>     size_t image_slice_pitch;
>     cl_int *resultPtr = (cl_int *)clEnqueueMapImage(ocl->commandQueue, ocl->dstMem, true, CL_MAP_READ, origin, region, &image_row_pitch, &image_slice_pitch, 0, NULL, NULL, &err);
783c804
<     cl_device_type deviceType = CL_DEVICE_TYPE_CPU;
---
>     cl_device_type deviceType = CL_DEVICE_TYPE_GPU;

I could also paste in the two complete source files, but they are long (900 lines).

raldone01 asked Jul 02 '18 12:07




1 Answer

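You've more or less answered it yourself with the diff. The CPU project stores its data in OpenCL buffer objects (clCreateBuffer, clEnqueueMapBuffer), while the GPU project uses 2D image objects (clCreateImage, clEnqueueMapImage). The last hunk shows the only other difference: the CPU template selects CL_DEVICE_TYPE_CPU and the GPU template selects CL_DEVICE_TYPE_GPU.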
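One practical consequence of the buffer/image split shows up when you read the result back. clEnqueueMapBuffer hands you a tightly packed array, whereas clEnqueueMapImage also returns image_row_pitch (in bytes), and consecutive rows are that many bytes apart, which may be more than width * sizeof(cl_uint). Here is a minimal sketch of the two indexing schemes in plain C, with no OpenCL calls. The function names read_buffer_elem and read_image_elem are made up for illustration and are not part of the Intel template.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Buffer case: the mapped region is one tightly packed row-major array. */
uint32_t read_buffer_elem(const uint32_t *mapped, size_t width,
                          size_t x, size_t y)
{
    return mapped[y * width + x];
}

/* Image case: rows start row_pitch_bytes apart. The pitch is whatever the
 * runtime reported when mapping, so it must not be assumed to equal
 * width * sizeof(uint32_t). */
uint32_t read_image_elem(const void *mapped, size_t row_pitch_bytes,
                         size_t width, size_t x, size_t y)
{
    assert(x < width && row_pitch_bytes >= width * sizeof(uint32_t));
    const unsigned char *row =
        (const unsigned char *)mapped + y * row_pitch_bytes;
    uint32_t value;
    /* memcpy avoids alignment assumptions about the pitch. */
    memcpy(&value, row + x * sizeof value, sizeof value);
    return value;
}
```

With the GPU template you would pass the pointer returned by clEnqueueMapImage and the image_row_pitch it filled in. With the CPU template the flat index y * width + x is enough.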

Image support is optional in the OpenCL standard, so whether it is available depends on the device and driver (a device reports it through CL_DEVICE_IMAGE_SUPPORT). GPUs may perform better with image types, and most if not all Intel integrated GPUs support them (AFAIK).

Both versions create their memory objects with CL_MEM_USE_HOST_PTR, which works well on Intel devices because the iGPU and CPU can address the same memory, or at least behave that way. For discrete GPUs, however, this may not always be optimal.
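Whether CL_MEM_USE_HOST_PTR actually avoids a copy depends on the runtime. As far as I recall, Intel's guidance for its integrated graphics is that the host allocation should start on a 4096-byte boundary and have a size that is a multiple of 64 bytes. Treat both numbers as assumptions and check the documentation for your driver. The helper below only sketches that check. Its name and constants are hypothetical, and it does not call OpenCL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, taken from memory of Intel's optimization notes;
 * verify them for your own runtime before relying on them. */
enum { ASSUMED_PAGE_ALIGN = 4096, ASSUMED_SIZE_GRANULE = 64 };

/* Returns true if host_ptr/size_bytes satisfy the assumed alignment and
 * size rules. This is a heuristic only: the runtime may still copy. */
bool looks_zero_copy_friendly(const void *host_ptr, size_t size_bytes)
{
    return host_ptr != NULL
        && ((uintptr_t)host_ptr % ASSUMED_PAGE_ALIGN) == 0
        && size_bytes != 0
        && (size_bytes % ASSUMED_SIZE_GRANULE) == 0;
}
```

You could call it on inputA, inputB and outputC before creating the memory objects, and log a warning if it returns false.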

Andreas Gravgaard Andersen answered Oct 17 '22 00:10