Wait for kernel to finish OpenCL

My OpenCL program doesn't always finish before further host (C++) code is executed. The OpenCL code is only executed up to a certain point (which appears to be random). The code is shortened a bit, so there may be a few things missing.

cl::Program::Sources sources;
string code = ResourceLoader::loadFile(filename);
sources.push_back({ code.c_str(),code.length() });

program = cl::Program(OpenCL::context, sources);

if (program.build({ OpenCL::default_device }) != CL_SUCCESS)
{
    exit(-1);
}
queue = CommandQueue(OpenCL::context, OpenCL::default_device);
kernel = Kernel(program, "main");
Buffer b(OpenCL::context, CL_MEM_READ_WRITE, size);
queue.enqueueWriteBuffer(b, CL_TRUE, 0, size, arg);
buffers.push_back(b);
kernel.setArg(0, this->buffers[0]);

vector<Event> wait{ Event() };

Version 1:

queue.enqueueNDRangeKernel(kernel, NDRange(), range, NullRange, NULL, &wait[0]);

Version 2:

queue.enqueueNDRangeKernel(kernel, NDRange(), range, NullRange, &wait, NULL);

.

wait[0].wait();

queue.finish();

Version 1 just does not wait for the OpenCL program. Version 2 crashes the program (at queue.enqueueNDRangeKernel):

Exception thrown at 0x51D99D09 (nvopencl.dll) in foo.exe: 0xC0000005: Access violation reading location 0x0000002C.

How would one make the host wait for the GPU to finish here?

EDIT: queue.enqueueNDRangeKernel returns -1000 for this kernel, while it returns 0 for a rather small kernel.

c++
events
opencl
wait
asked on Stack Overflow Mar 5, 2016 by Addi • edited Mar 5, 2016 by Addi

2 Answers

Version 1 says to signal wait[0] when the kernel is finished - which is the right thing to do.

Version 2 is asking your clEnqueueNDRangeKernel() to wait for the events in wait before it starts that kernel [which clearly won't work].

On its own, queue.finish() [or clFinish()] should be enough to ensure that your kernel has completed.

Since you haven't called clCreateUserEvent, and you haven't passed the event into anything else that initializes it, the second variant doesn't work.
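As a rough sketch of the Version 1 pattern (no wait list, an output event, then a blocking wait), something like the following could be used. The helper name enqueueAndWait is hypothetical and not part of OpenCL. It is written as a template so that it does not assume a particular wrapper header; the only member functions it relies on are enqueueNDRangeKernel and wait, used the same way as in the question. It also assumes the wrapper reports errors through return codes; if exceptions are enabled in the wrapper, failures would be thrown instead.

```cpp
// Sketch only: Queue, Kernel, Range and Event are template parameters so the
// helper does not depend on a particular wrapper header. With the question's
// types they would be cl::CommandQueue, cl::Kernel, cl::NDRange and cl::Event.
template <typename Queue, typename Kernel, typename Range, typename Event>
int enqueueAndWait(Queue& queue, const Kernel& kernel,
                   const Range& offset, const Range& global,
                   const Range& local, Event& done)
{
    // Wait list is nullptr: the kernel does not depend on any earlier event.
    // &done is the output event, signalled when the kernel has finished.
    int err = queue.enqueueNDRangeKernel(kernel, offset, global, local,
                                         nullptr, &done);
    if (err != 0) {   // 0 is CL_SUCCESS
        return err;   // the enqueue failed, so 'done' must not be waited on
    }
    // Blocks the host until the kernel has completed.
    return done.wait();
}
```

With the question's variables this might be called as enqueueAndWait(queue, kernel, NDRange(), range, NullRange, wait[0]). Checking the return value of the enqueue before waiting matters here: if the enqueue fails, the event is never initialized and waiting on it proves nothing about the kernel.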

It is rather bad that it crashes [it should return "invalid event" or some such - but presumably the driver you are using doesn't have a way to check that the event hasn't been initialized]. I'm reasonably sure the driver I work with will issue an error for this case - but I try to avoid getting it wrong...

I have no idea where -1000 comes from - it is not one of the error codes that clEnqueueNDRangeKernel is documented to return, nor a reasonable return value from the CL C++ wrappers. Whether the kernel is small or large [and/or completes in a short or long time] shouldn't affect the return value from the enqueue, since all that SHOULD do is to enqueue the work [with no guarantee that it starts until a queue.flush() or clFlush is performed]. Waiting for it to finish should happen elsewhere.
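When chasing an unexpected return value like this, it can help to print a symbolic name next to the number. The helper below is a minimal sketch: clErrorName is a hypothetical name, and the table is deliberately incomplete. The numeric values are the ones defined in the Khronos headers (CL/cl.h and CL/cl_gl.h). For what it is worth, cl_gl.h does define -1000 as CL_INVALID_GL_SHAREGROUP_REFERENCE_KHR, an error belonging to the GL sharing extension; that is not an error a kernel enqueue is documented to produce, so it does not by itself explain the result in the question.

```cpp
#include <string>

// Maps a few OpenCL status codes to their names. Values are taken from the
// Khronos headers; this is an illustrative subset, not the full list.
std::string clErrorName(int code)
{
    switch (code) {
        case 0:     return "CL_SUCCESS";
        case -4:    return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
        case -5:    return "CL_OUT_OF_RESOURCES";
        case -36:   return "CL_INVALID_COMMAND_QUEUE";
        case -48:   return "CL_INVALID_KERNEL";
        case -52:   return "CL_INVALID_KERNEL_ARGS";
        case -54:   return "CL_INVALID_WORK_GROUP_SIZE";
        case -57:   return "CL_INVALID_EVENT_WAIT_LIST";
        case -58:   return "CL_INVALID_EVENT";
        case -1000: return "CL_INVALID_GL_SHAREGROUP_REFERENCE_KHR"; // cl_gl.h
        default:    return "unknown (" + std::to_string(code) + ")";
    }
}
```

The result of queue.enqueueNDRangeKernel(...) or queue.finish() can be passed straight to this function for logging, since cl_int is a 32-bit signed integer.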

I do most of my work via the raw OpenCL API, not the C++ wrappers, which is why I'm referring to what they do, rather than the C++ wrappers.

answered on Stack Overflow Mar 5, 2016 by Mats Petersson

I faced a similar problem with OpenCL, in which some packages of a data stream were not processed by OpenCL.

I realized it only happens while the notebook is plugged into a docking station.

Maybe this helps someone. (No clFlush or clFinish calls)

answered on Stack Overflow Feb 7, 2019 by Thomas Langer

User contributions licensed under CC BY-SA 3.0