NVCC -G device debug flag breaks CUDA code with critical section


I am seeing odd behavior in CUDA code that runs a critical section, depending on the compiler flags passed to nvcc.

Specifically, if the code is compiled with the -G (--device-debug) flag, the critical section doesn't execute for all threads. In fact, I believe there would be a deadlock if it weren't for the counter in the while loop.

The strange thing is that without the -G flag, the critical section does execute for all threads, with no deadlock or any other issue.

The code is:

//can_write points to an int in device memory (device RAM); 1 = free, 0 = taken
__device__ void critical_section(int *can_write)
{
    int t=0,v=0;
    while(t<4096)
    {
        //can_write is already a pointer, so it is passed to the atomics directly
        v=atomicCAS(can_write,1,0);
        if(v!=0)
        {
            //critical section, without synchronization points
            atomicExch(can_write,1);
            return;
        }
        ++t;
    }
}

My platform is:

  • CUDA toolkit 10.0
  • GCC 7.4.0
  • CUDA driver 410.73
  • Compute capability 5.0, device 960M
  • Debian GNU/Linux buster/sid
  • kernel 4.19.0-1-amd64

Edit

The counter is only there to keep the program compiled with -G from entering the alleged deadlock. Without the counter I had to kill the program, because it ran for 20 s without terminating. The kernel calling the function is:

__global__ void kernel(int *cwrite)
{
    critical_section(cwrite);
}

I believe there is a deadlock, because the kernel gets stuck even when launched with <<<1,256>>>. Without the -G flag, the program takes just a few milliseconds.
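
To make the report easier to reproduce, here is a minimal host-side sketch built around the bounded-spin lock described above. It is not my original program: the names critical_section_bounded, repro_kernel and count_entered, and the extra entered counter, are hypothetical additions for illustration. It assumes the lock starts at 1 (free) and simply counts how many threads got into the critical section before giving up.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Bounded-spin lock as in the question: *can_write == 1 means free, 0 means taken.
// 'entered' is a hypothetical counter, incremented only while holding the lock.
__device__ void critical_section_bounded(int *can_write, volatile int *entered)
{
    for (int t = 0; t < 4096; ++t) {
        if (atomicCAS(can_write, 1, 0) != 0) {  // old value was 1: lock acquired
            *entered = *entered + 1;            // plain increment, protected by the lock
            __threadfence();                    // make the write visible before releasing
            atomicExch(can_write, 1);           // release the lock
            return;
        }
    }
    // Falling through means this thread gave up after 4096 failed attempts.
}

__global__ void repro_kernel(int *can_write, int *entered)
{
    critical_section_bounded(can_write, entered);
}

// Hypothetical helper: launches the kernel and returns how many threads
// entered the critical section, or -1 if any CUDA call failed.
int count_entered(int blocks, int threads)
{
    int *d_lock = nullptr, *d_entered = nullptr;
    int h_lock = 1, h_entered = 0;  // lock starts free, counter starts at zero

    if (cudaMalloc(&d_lock, sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_entered, sizeof(int)) != cudaSuccess) {
        cudaFree(d_lock);
        return -1;
    }
    cudaMemcpy(d_lock, &h_lock, sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_entered, &h_entered, sizeof(int), cudaMemcpyHostToDevice);

    repro_kernel<<<blocks, threads>>>(d_lock, d_entered);
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        err = cudaMemcpy(&h_entered, d_entered, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_lock);
    cudaFree(d_entered);
    return err == cudaSuccess ? h_entered : -1;
}
```

Calling count_entered(1, 256) from main and printing the result, once for a build with -G and once without, gives a direct comparison: a value below 256 means some threads exhausted their 4096 attempts without ever acquiring the lock.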


Solution to avoid the odd behavior:

__device__ void critical_section(int *can_write)
{
    int v=0,executed=0,ready=0;
    while(ready==0)
    {
        //the warp leaves the loop only once every thread has executed
        ready=__all_sync(0xFFFFFFFF,executed);
        if(executed==0)
        {
            v=atomicCAS(can_write,1,0);
            if(v!=0)
            {
                //critical section, without synchronization points
                atomicExch(can_write,1);
                executed=1;
            }
        }
    }
}
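
A quick way to check the warp-vote version is to increment a plain counter inside the critical section and read it back on the host. The sketch below does that under a few assumptions: the block size is a multiple of 32 (so the full 0xFFFFFFFF mask is valid for every warp), the lock starts at 1 (free), and the names critical_section_warp, warp_lock_kernel and count_with_warp_lock are hypothetical, introduced only for this example.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Warp-vote lock: every thread keeps looping until all 32 threads of its warp
// have executed the critical section. *can_write == 1 means free, 0 means taken.
// 'counter' is a hypothetical shared counter used to check mutual exclusion.
__device__ void critical_section_warp(int *can_write, volatile int *counter)
{
    int executed = 0, ready = 0;
    while (ready == 0) {
        // Warp-uniform vote: non-zero only when every thread has executed.
        ready = __all_sync(0xFFFFFFFF, executed);
        if (executed == 0) {
            if (atomicCAS(can_write, 1, 0) != 0) {  // old value was 1: lock acquired
                *counter = *counter + 1;            // plain increment, protected by the lock
                __threadfence();                    // make the write visible before releasing
                atomicExch(can_write, 1);           // release the lock
                executed = 1;
            }
        }
    }
}

__global__ void warp_lock_kernel(int *can_write, int *counter)
{
    critical_section_warp(can_write, counter);
}

// Hypothetical helper: returns the final counter value, or -1 on a CUDA error.
// 'threads' is assumed to be a multiple of 32.
int count_with_warp_lock(int blocks, int threads)
{
    int *d_lock = nullptr, *d_counter = nullptr;
    int h_lock = 1, h_counter = 0;  // lock starts free, counter starts at zero

    if (cudaMalloc(&d_lock, sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_counter, sizeof(int)) != cudaSuccess) {
        cudaFree(d_lock);
        return -1;
    }
    cudaMemcpy(d_lock, &h_lock, sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_counter, &h_counter, sizeof(int), cudaMemcpyHostToDevice);

    warp_lock_kernel<<<blocks, threads>>>(d_lock, d_counter);
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        err = cudaMemcpy(&h_counter, d_counter, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_lock);
    cudaFree(d_counter);
    return err == cudaSuccess ? h_counter : -1;
}
```

With a <<<1,256>>> launch, count_with_warp_lock(1, 256) should return 256 if every thread passed through the critical section exactly once. The thread that wins the atomicCAS releases the lock inside the same branch, so no thread spins while a lock holder from its own warp is waiting to be scheduled.
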
cuda
critical-section
gpu-atomics
asked on Stack Overflow Jan 4, 2019 by fabian_mc • edited Jan 4, 2019 by fabian_mc



User contributions licensed under CC BY-SA 3.0