
How to use GCC 5.1 and OpenMP to offload work to Xeon Phi

Background

We have been trying unsuccessfully to use the new GCC 5.1 release to offload OpenMP blocks to the Intel MIC (i.e. the Xeon Phi). Following the GCC Offloading page, we've put together the build.sh script to build the "accel" target compiler for "intelmic" and the host compiler. The compilation appears to complete successfully.

Using the env.sh script we then attempt to compile the simple hello.c program listed below. However, this program seems to only run on the host and not the target device.

As we are new to offloading in general, as well as compiling GCC, there are multiple things we could be doing incorrectly. However, we've investigated the resources already mentioned plus the following (I do not have enough rep to post the links):

  • Offloading for Xeon Phi
  • Xeon Phi Tutorial
  • Intel Xeon Phi Offload Programming Models

The biggest problem is they usually reference the Intel compiler. While we plan to purchase a copy, we do NOT currently have a copy. In addition, the majority of our development pipeline is already integrated with GCC and we'd prefer to keep it that way (if possible).

We have installed the latest MPSS 3.5 distribution, making the necessary modifications to work under Ubuntu. We can successfully communicate and check the status of the Xeon Phis in our system.

In our efforts, we never saw any indication that the code was running in the mic emulation mode either.

Questions

  1. Has anyone successfully built a host/target GCC compiler combination that actually offloads to the Xeon Phi? If so, what resources did you use?
  2. Are we missing anything in the build script?
  3. Is there anything wrong with the test source code? Both versions compile with no errors (except what is mentioned below) and run with 48 threads (i.e. the number of logical threads in the host system).
  4. Since Google search does not reveal much, does anyone have suggestions for the next step (besides giving up on GCC offloading)? Is this a bug?

Thanks!

build.sh

#!/usr/bin/env bash                                                                                                                                           

set -e -x
unset LIBRARY_PATH

GCC_DIST=$PWD/gcc-5.1.0

# Modify these to control where the compilers are installed                                                                                                   
TARGET_PREFIX=$HOME/gcc
HOST_PREFIX=$HOME/gcc

TARGET_BUILD=/tmp/gcc-build-mic
HOST_BUILD=/tmp/gcc-build-host

# Dropped the "emul" suffix from the target triple since we are not planning to emulate
TARGET=x86_64-intelmic-linux-gnu
# Should this be a full quadruple (i.e. with "pc")? The default Ubuntu build seems to be x86_64-linux-gnu
HOST=x86_64-pc-linux-gnu

# check for the GCC distribution                                                                                                                              
if [ ! -d "$GCC_DIST" ]; then
    echo "gcc-5.1.0 distribution should be in $PWD" >&2
    # exit non-zero so callers can tell the build did not run
    exit 1
fi

#sudo apt-get install -y libmpfr-dev libgmp-dev libmpc-dev libisl-dev dejagnu autogen sysvbanner                                                              

# prepare and configure the target compiler                                                                                                                   
mkdir -p $TARGET_BUILD
pushd $TARGET_BUILD
$GCC_DIST/configure \
    --prefix=$TARGET_PREFIX \
    --enable-languages=c,c++,fortran,lto \
    --enable-liboffloadmic=target \
    --disable-multilib \
    --build=$TARGET \
    --host=$TARGET \
    --target=$TARGET \
    --enable-as-accelerator-for=$HOST \
    --program-prefix="${TARGET}-"
    #--program-prefix="$HOST-accel-$TARGET-" \
# Try adding the program prefix as hinted in https://gcc.gnu.org/wiki/Offloading
# Do we need to specify a sysroot? The wiki says we don't need one, but it also says it is better to configure as a cross compiler.

# build and install                                                                                                                                           
make -j48 && make install
popd

# prepare and build the host compiler                                                                                                                         
mkdir -p $HOST_BUILD
pushd $HOST_BUILD
$GCC_DIST/configure \
    --prefix=$HOST_PREFIX \
    --enable-languages=c,c++,fortran,lto \
    --enable-liboffloadmic=host \
    --disable-multilib \
    --build=$HOST \
    --host=$HOST \
    --target=$HOST \
    --enable-offload-targets=$TARGET=$TARGET_PREFIX

make -j48 && make install
popd

env.sh

#!/usr/bin/env bash

TARGET_PREFIX=$HOME/gcc
HOST_PREFIX=$HOME/gcc
HOST=x86_64-pc-linux-gnu
VERSION=5.1.0

export LD_LIBRARY_PATH=/opt/intel/mic/coi/host-linux-release/lib:/opt/mpss/3.4.3/sysroots/k1om-mpss-linux/usr/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOST_PREFIX/lib:$HOST_PREFIX/lib64:$HOST_PREFIX/lib/gcc/$HOST/$VERSION:$LD_LIBRARY_PATH
export PATH=$HOST_PREFIX/bin:$PATH

hello.c (version 1)

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[]) 
{
  int nthreads, tid;
  /* Fork a team of threads giving them their own copies of variables */

#pragma offload target (mic)
  {
#pragma omp parallel private(nthreads,tid)
    {
      /* Obtain thread number */
      tid = omp_get_thread_num();
      printf("Hello World from thread = %d\n", tid);
      
      /* Only master thread does this */
      if (tid == 0) {
        nthreads = omp_get_num_threads();
        printf("Number of threads = %d\n", nthreads);
      }    
#ifdef __MIC__
      printf("on target...\n");
#else
      printf("on host...\n");
#endif    
    }
  }    
}

We compiled this code with:

gcc -fopenmp -foffload=x86_64-intelmic-linux-gnu hello.c -o hello

hello_omp.c (version 2)

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[]) 
{
  int nthreads, tid;
  /* Fork a team of threads giving them their own copies of variables */

#pragma omp target device(mic)
  {
#pragma omp parallel private(nthreads,tid)
    {
      /* Obtain thread number */
      tid = omp_get_thread_num();
      printf("Hello World from thread = %d\n", tid);
      
      /* Only master thread does this */
      if (tid == 0) {
        nthreads = omp_get_num_threads();
        printf("Number of threads = %d\n", nthreads);
      }
#ifdef __MIC__
      printf("on target...\n");
#else
      printf("on host...\n");
#endif    
    }
  }    
}

Version 2 is almost the same, but uses the

#pragma omp target device

syntax instead. With mic as the device the compiler complains, but with a device number (e.g. 0) it compiles and runs on the host. This code was compiled in the same manner.
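For reference, the OpenMP 4.0 device clause takes an integer device number rather than a name, which would explain why device(mic) is rejected while device(0) is accepted. The sketch below is one possible way to check at run time where a target region actually executes, using the standard omp_is_initial_device() and omp_get_num_devices() routines. The function names target_ran_on_host and report_offload are made up for illustration, and the stubs in the #else branch are only an assumption to keep the file buildable without -fopenmp. A run-time check may also be more telling than the #ifdef __MIC__ test: as far as I understand GCC's offloading design, the source is preprocessed once by the host compiler, so that test is probably always resolved in favour of the host branch.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed fallbacks so the file still builds without -fopenmp (host only). */
static int omp_get_num_devices(void) { return 0; }
static int omp_is_initial_device(void) { return 1; }
#endif

/* Run a trivial target region on device number `dev` (an integer, not a name).
   Returns 1 if the region executed on the host, 0 if it was offloaded. */
int target_ran_on_host(int dev)
{
    int on_host = 1;
#pragma omp target device(dev) map(tofrom: on_host)
    {
        on_host = omp_is_initial_device();
    }
    return on_host;
}

/* Print where each device number ends up executing.
   Returns how many device numbers were really offloaded. */
int report_offload(void)
{
    int ndev = omp_get_num_devices();
    int limit = ndev > 0 ? ndev : 1; /* probe device 0 even if none are reported */
    int offloaded = 0;
    int dev;

    printf("OpenMP reports %d offload device(s)\n", ndev);
    for (dev = 0; dev < limit; dev++) {
        int on_host = target_ran_on_host(dev);
        printf("device(%d): ran on %s\n", dev, on_host ? "host" : "target");
        if (!on_host)
            offloaded++;
    }
    return offloaded;
}
```

Calling report_offload() from main and building with the same command line as above (gcc -fopenmp -foffload=x86_64-intelmic-linux-gnu) should show whether any region leaves the host. If every line says "host", the runtime is falling back silently.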

grumpy_robot asked Apr 24 '15 15:04

1 Answer

Offloading to Xeon Phi with GCC 5 is possible. To get it to work, liboffloadmic must be compiled for the native MIC target, similarly to how it is done here. The problem with your setup is that it builds the host emulation libraries (libcoi_host.so, libcoi_device.so) and sticks with emulated offloading even though a physical Xeon Phi is present.

Dmitry Mikushin answered Dec 30 '22 10:12