TensorFlow C++ API: How to pass input parameters to a CUDA kernel

Question

I'm quite new to CUDA/C++ programming, and I'm stuck on passing input parameters from the TensorFlow C++ API to a CUDA kernel.

First, I register the following Op:

REGISTER_OP("Op")
.Attr("T: {float, int64}")
.Input("in: T")
.Input("angles: T")
.Output("out: T");

Next, I want to pass the second input (angles) through to the CPU/GPU kernel. The following implementation works fine on the CPU, but when I run it on my GPU the Python process crashes with this message:

Process finished with exit code -1073741819 (0xC0000005)

This is how I'm trying to access the value of the input. Note that the input for "angles" is always a single value (float or int64):

void Compute(OpKernelContext* context) override {
    ...
    const Tensor& input_angles = context->input(1);
    auto angles_flat = input_angles.flat<float>();
    const float N = angles_flat(0);
    ...
}

The CPU/GPU kernels are then called as follows:

...
Functor<Device, T>()(
            context->eigen_device<Device>(),
            static_cast<int>(input_tensor.NumElements()),
            input_tensor.flat<T>().data(),
            output_tensor->flat<T>().data(),
            N);
...

As I said before, running this Op on the CPU works just how I want it to, but when I run it on the GPU I always get the Python error shown above. Does anyone know how to fix this? I can only guess that angles_flat(0) is trying to access a wrong address on the GPU. Any help would be highly appreciated!
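
A possible explanation, offered as a sketch rather than a confirmed fix: exit code 0xC0000005 is the Windows access-violation status, and for a GPU kernel TensorFlow places input tensors in device memory by default, so angles_flat(0) would dereference a device pointer from host code. Marking the "angles" input with HostMemory in the GPU kernel registration keeps that input in host memory. The names ExampleOp, GPUDevice and REGISTER_GPU_KERNEL below are hypothetical, since the post shows neither the kernel class nor its registration; only the Op name, the input name and the type constraint come from the REGISTER_OP call above.

```cpp
// Sketch only. Assumptions (not shown in the post):
//   - ExampleOp<Device, T> is the OpKernel subclass containing Compute().
//   - GPUDevice is an alias for Eigen::GpuDevice.
//   - The file already has `using namespace tensorflow;`.
#include "tensorflow/core/framework/op_kernel.h"

#define REGISTER_GPU_KERNEL(T)                              \
  REGISTER_KERNEL_BUILDER(Name("Op")                        \
                              .Device(DEVICE_GPU)           \
                              .TypeConstraint<T>("T")       \
                              .HostMemory("angles"),        \
                          ExampleOp<GPUDevice, T>)

// "angles" now stays in host memory for the GPU kernel, so Compute()
// can read angles_flat(0) on the host and pass N to the functor by value.
REGISTER_GPU_KERNEL(float);  // repeat for the Op's int64 type
```

With the input pinned to host memory, the read in Compute() uses a host pointer on both devices. Separately, because T may be int64, reading the value with flat<T>() instead of flat<float>() should avoid a dtype check failure in the integer case.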

python

c++

tensorflow

custom-operator

asked on Stack Overflow Nov 11, 2020 by L. Brasi • edited Nov 12, 2020 by talonmies

0 Answers

Nobody has answered this question yet.

User contributions licensed under CC BY-SA 3.0