I'm learning OpenMP using the example of computing the value of pi via quadrature. In serial, I run the following C code:
double serial() {
    double step;
    double x, pi, sum = 0.0;
    step = 1.0 / (double) num_steps;
    for (int i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step; // midpoint rule
        sum += 4.0 / (1.0 + x*x);
    }
    pi = step * sum;
    return pi;
}
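(For reference, the loop is just the midpoint rule applied to the integral that defines pi, with N = num_steps:

    \pi = \int_0^1 \frac{4}{1+x^2}\,dx \;\approx\; \frac{1}{N}\sum_{i=0}^{N-1} \frac{4}{1+x_i^2}, \qquad x_i = \frac{i+0.5}{N}. )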
I'm comparing this to an OpenMP implementation using a parallel for with a reduction:
double SPMD_for_reduction() {
    double step;
    double pi, sum = 0.0;
    step = 1.0 / (double) num_steps;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < num_steps; i++) {
        double x = (i + 0.5) * step;
        sum += 4.0 / (1.0 + x*x);
    }
    pi = step * sum;
    return pi;
}
For num_steps = 1,000,000,000, and 6 threads in the case of omp, I compile and time:
double start_time = omp_get_wtime();
serial();
double end_time = omp_get_wtime();
start_time = omp_get_wtime();
SPMD_for_reduction();
end_time = omp_get_wtime();
With no compiler optimizations, the runtimes are around 4 s (serial) and 0.66 s (OpenMP). With the -O3 flag, the serial runtime drops to ".000001s" while the OpenMP runtime is mostly unchanged. What's going on here? Is it vector instructions being used, or is it poor code or a flawed timing method? If it's vectorization, why isn't the OpenMP function benefiting?
It may be of interest that the machine I'm using has a modern 6-core Xeon processor.
Thanks!
The compiler outsmarts you. For the serial version it is able to detect that the result of your computation is never used, so it throws the computation out completely (dead-code elimination).
double start_time = omp_get_wtime();
serial(); //<-- Computations not used.
double end_time = omp_get_wtime();
In the OpenMP case the compiler cannot prove that everything inside the function body is free of side effects (the parallel region is outlined into calls to the OpenMP runtime, which it cannot see through), so to stay on the safe side it keeps the function call.
You can of course write something like double serial_pi = serial(); and, outside of the time measurement, do some dummy work with the variable serial_pi (print it, for example). This way the compiler will keep the function call and perform the optimizations you are actually looking for.
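As an illustration, here is a minimal sketch of such a timing harness. It assumes serial() and SPMD_for_reduction() are defined as shown in the question, in the same file, and that they read the global num_steps:

#include <stdio.h>
#include <omp.h>

long num_steps = 1000000000;      /* 1e9 steps, as in the question */

double serial(void);              /* definitions as shown in the question */
double SPMD_for_reduction(void);

int main(void) {
    double start = omp_get_wtime();
    double serial_pi = serial();
    double mid = omp_get_wtime();
    double omp_pi = SPMD_for_reduction();
    double end = omp_get_wtime();

    /* Printing the results makes them observable, so the compiler
       cannot discard either computation as dead code. */
    printf("serial: pi = %.12f  (%.3f s)\n", serial_pi, mid - start);
    printf("omp:    pi = %.12f  (%.3f s)\n", omp_pi, end - mid);
    return 0;
}

Compiled with something like gcc -O3 -fopenmp, both calls now survive optimization and the timing comparison becomes meaningful.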