I'd like to measure the time that each thread spends executing a section of code, to see whether my load-balancing strategy divides the chunks evenly among the workers. Typically, my code looks like the following:
#pragma omp parallel for schedule(dynamic,chunk) private(i)
for(i=0;i<n;i++){
//loop code here
}
UPDATE: I am using OpenMP 3.1 with GCC.
How do you calculate the execution time of a thread? Log the time intervals from when the thread starts and at every swap-in and swap-out. Add up all of the intervals during which the thread was actually running and you'll have the execution time of your thread.
omp_get_num_threads(): The omp_get_num_threads function returns the number of threads in the team currently executing the parallel region from which it is called. The function binds to the closest enclosing parallel directive.
The obvious drawback of the baseline implementation that we have is that it only uses one thread, and hence only one CPU core. To exploit all CPU cores, we must somehow create multiple threads of execution.
You can just print the per-thread time this way (not tested, not even compiled):
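// Needs <omp.h> and <stdio.h>; build with gcc -fopenmp.
#pragma omp parallel
{
    double wtime = omp_get_wtime();   // per-thread start time
    #pragma omp for schedule( dynamic, 1 ) nowait
    for ( int i=0; i<n; i++ ) {
        // whatever
    }
    // No barrier after the loop, so each thread stops its own clock as soon as it runs out of work.
    wtime = omp_get_wtime() - wtime;
    printf( "Time taken by thread %d is %f\n", omp_get_thread_num(), wtime );
}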
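NB: the nowait clause removes the implicit barrier at the end of the for loop. Without it, every thread would wait for the slowest one before stopping its timer, all the reported times would be nearly identical, and the measurement would be of no interest.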
And of course, using a proper profiling tool is a much better approach...