GPU thread count not showing in OpenMP runtime

I am trying to offload an array calculation to a GPU (GTX 1080 Ti) using OpenMP and C++, with this dummy code that I have written:

#include <omp.h>
#include <cstdio>   // printf
#include <iostream>

using namespace std;

int main(){

        //int totalSum, ompSum;
        int totalSum=0, ompSum=0;
        const int N = 1000;
        int array[N];
        for (int i=0; i<N; i++){
                array[i]=i;
        }
        #pragma omp target
        {
                #pragma omp parallel private(ompSum) shared(totalSum)
                {
                        ompSum=0;
                        omp_set_num_threads(100);
                        printf ( "Total number of threads are %d!\n", omp_get_num_threads() );
                        #pragma omp for
                        for (int i=0; i<N; i++){
                                ompSum += array[i];
                        }

                        #pragma omp critical
                        totalSum += ompSum;

                }

                printf ( "Caculated sum should be %d but is %d\n", N*(N-1)/2, totalSum );
        }
        return 0;


}

Upon running the code, this is the output I get:

Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Caculated sum should be 499500 but is 499500

The calculated sum is correct, but I am curious why it reports only 8 threads rather than the 100 threads that I set in the code.

When I move the omp_set_num_threads call to right below the #pragma omp target, the runtime reports:

libgomp: cuCtxSynchronize error: an illegal memory access was encountered

I am new to OpenMP and would greatly appreciate it if someone could help explain this issue.

asked Aug 6, 2021 at 5:19

OMEGOSH01

It's invalid to set the thread count inside a parallel region.
– paddy
Commented Aug 6, 2021 at 5:31
@paddy I tried setting it outside the parallel region; no change either.
– OMEGOSH01
Commented Aug 6, 2021 at 5:32
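For readers following this exchange: one way to ask for a thread count before the parallel region starts is the num_threads clause, and omp_get_num_threads() can then be read once from inside the region. The sketch below is only an illustration. The function name threads_granted is hypothetical, the stub for builds without OpenMP is an assumption, and the runtime is free to grant fewer threads than requested, so this is not guaranteed to change the count seen on the GPU.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Assumption: serial fallback so the sketch also builds without -fopenmp.
static int omp_get_num_threads() { return 1; }
#endif

// Hypothetical helper: requests `requested` threads for a parallel region
// inside a target region and returns how many the runtime actually granted.
int threads_granted(int requested) {
    int granted = 1;
    #pragma omp target map(tofrom: granted)
    #pragma omp parallel num_threads(requested)
    {
        // Only one thread of the team records the team size.
        #pragma omp single
        granted = omp_get_num_threads();
    }
    return granted;
}
```

Calling threads_granted(100) and printing the result once on the host avoids the repeated per-thread printf lines shown in the question.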
Are you sure that you have set up the GPU correctly (and that OpenMP is actually using it)? Please check the return value of int omp_is_initial_device(void); inside the target region. If it returns 0, the code is running on the GPU; otherwise it is running on the CPU.
– Laci
Commented Aug 6, 2021 at 6:05
It is returning a value of 0, which I presume is the GPU @Laci
– OMEGOSH01
Commented Aug 6, 2021 at 6:17
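The check discussed in these comments could look roughly like the sketch below. It uses omp_is_initial_device() and omp_get_num_devices() from the OpenMP API; the helper names target_ran_on_host and device_count are hypothetical, as are the stubs used when the file is built without OpenMP.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Assumption: serial fallbacks so the sketch also builds without -fopenmp.
static int omp_is_initial_device() { return 1; }
static int omp_get_num_devices() { return 0; }
#endif

// Hypothetical helper: number of offload devices the runtime can see.
int device_count() {
    return omp_get_num_devices();
}

// Hypothetical helper: true if the target region executed on the host
// (no offload happened), false if it ran on a device such as a GPU.
bool target_ran_on_host() {
    int on_host = 1;  // stays 1 unless the region reports a non-host device
    #pragma omp target map(tofrom: on_host)
    {
        on_host = omp_is_initial_device();
    }
    return on_host != 0;
}
```

The result is mapped back with map(tofrom: ...) because a scalar written inside a target region is otherwise not copied back to the host.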
Why is there a need to set the number of GPU threads? AFAIK, GCC will use one thread per warp, so the correct way to write the offload directive would be #pragma omp target teams distribute parallel for simd reduction(+:totalSum) map(from:totalSum) map(to:array[:N])
– Michael Klemm
Commented Aug 6, 2021 at 6:37
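Applied to the sum from the question, the combined directive suggested in this comment might be used as in the following sketch. The function name offload_sum is hypothetical. One deliberate difference from the comment, made as an assumption here: the sum is mapped with tofrom rather than from, so that its initial value of 0 is defined on the device before the reduction runs.

```cpp
// Hypothetical helper: sums n ints with a single combined offload directive.
// If the file is compiled without OpenMP the pragma is ignored and the loop
// simply runs serially on the host.
int offload_sum(const int* data, int n) {
    int total = 0;
    #pragma omp target teams distribute parallel for simd \
        reduction(+:total) map(tofrom: total) map(to: data[0:n])
    for (int i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

With this form the runtime chooses the number of teams and threads itself, so no call to omp_set_num_threads is needed and no per-thread partial sum or critical section has to be written by hand.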
