When I pass an array from Fortran to C, the array's address is incorrect in C. I've checked this by printing the address of the array in Fortran before the CALL, then stepping into the C function and printing the address of the argument.
0x9acd44c0
0xffffffff9acd44c0
The upper dword of the C pointer has been set to 0xffffffff. I'm trying to understand why this is happening, and why it only happens on the HPC cluster and not on a development machine.
I'm using a rather large scientific program written in Fortran/C++/CUDA. On some particular machine, I get a segfault when calling a C function from Fortran. I've found that a pointer is being passed to the C function with some bytes set incorrectly.
Every Fortran file in the program includes a common header file which sets up some options and declares the common blocks.
IMPLICIT REAL*8 (A-H,O-Z)
COMMON/NBODY/ X(3,NMAX), BODY(NMAX)
COMMON/GPU/ GPUPHI(NMAX)
The Fortran call site looks like this:
CALL GPUPOT(NN,BODY(IFIRST),X(1,IFIRST),GPUPHI)
And the C function, which is compiled by nvcc, is declared like so:
extern "C" void gpupot_(int *n,
double m[],
double x[][3],
double pot[]);
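For reference, here is a minimal sketch of how the Fortran arguments line up with this C declaration, assuming the usual gfortran convention of passing arguments by reference and storing arrays in column-major order. The helper names `fortran_offset` and `c_offset` are hypothetical and exist only for illustration; they are not part of the program.

```c
#include <stddef.h>

/* Element offset of Fortran X(i,j) from the start of an array declared
   as X(3,NMAX). Fortran indices are 1-based and the first index varies
   fastest in memory (column-major order). */
size_t fortran_offset(size_t i, size_t j)
{
    return (j - 1) * 3 + (i - 1);
}

/* Element offset of C x[row][col] from the start of an array declared
   as double x[][3]. C indices are 0-based and the last index varies
   fastest in memory (row-major order). */
size_t c_offset(size_t row, size_t col)
{
    return row * 3 + col;
}
```

Under these assumptions, Fortran `X(i,j)` and C `x[j-1][i-1]` refer to the same element. Because the call passes `X(1,IFIRST)` and `BODY(IFIRST)`, the C side sees `x[0]` and `m[0]` as the entries for body `IFIRST`, while `pot` receives the start of `GPUPHI`.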
I found from debugging that the value of the pointer to pot is incorrect, so any attempt to access that array will segfault.
When I ran the program with gdb, I put a breakpoint just before the call to gpupot and printed the address of the GPUPHI variable:
(gdb) p &GPUPHI
$1 = (PTR TO -> ( real(kind=8) (1050000))) 0x9acd44c0 <gpu_>
I then let the debugger step into the gpupot_ C function, and inspected the value of the pot argument:
(gdb) p pot
$2 = (double *) 0xffffffff9acd44c0
All of the other arguments have the correct pointer values.
The compiler options that are set for gfortran are:
-fPIC -O3 -ffast-math -Wall -fopenmp -mcmodel=medium -march=native -mavx -m64
And nvcc is using the following:
-ccbin=g++ -Xptxas -v -ftz=true -lineinfo -D_FORCE_INLINES \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_35,code=compute_35 -Xcompiler \
"-O3 -fPIC -Wall -fopenmp -std=c++11 -fPIE -m64 -mavx \
-march=native" -std=c++14 -lineinfo
For debugging, the -O3 is replaced with -g -O0 -fcheck=all -fstack-protector -fno-omit-frame-pointer, but the behaviour (crash) remains the same.
This is prefaced by my top comments [and yours].
It looks like you're getting an [unwanted] sign extension of the address.
The Fortran code is being built by gfortran with -mcmodel=medium, but the C code is not.
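With that option, larger symbols/arrays will be linked above 2GB [where bit 31 of the address, the sign bit of a 32-bit value, is set].
So, add the option to both (for nvcc, inside the -Xcompiler string that is passed to g++) or leave it off both to fix the problem.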
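To see that the two addresses in the question differ by exactly a 32-bit sign extension, here is a small sketch of the arithmetic. It only illustrates the bit pattern; it does not show where in the toolchain the extension is introduced. The function names `sign_extend_32` and `survives_sign_extension` are hypothetical.

```c
#include <stdint.h>

/* Keep the low 32 bits of an address and widen them to 64 bits the way a
   signed 32-bit value would be widened: bit 31 is copied into every bit
   of the upper dword. */
uint64_t sign_extend_32(uint64_t addr)
{
    uint64_t low = addr & 0xFFFFFFFFull;
    if (low & 0x80000000ull) {
        low |= 0xFFFFFFFF00000000ull;
    }
    return low;
}

/* Returns 1 if the address is unchanged by the operation above, which is
   the case for addresses below 2GB (0x80000000), and 0 otherwise. */
int survives_sign_extension(uint64_t addr)
{
    return sign_extend_32(addr) == addr;
}
```

Applied to the address gdb reported on the Fortran side, `sign_extend_32(0x9acd44c0)` yields `0xffffffff9acd44c0`, which is the value seen in `pot`. An address below 2GB passes through unchanged, which would be consistent with the other arguments arriving intact.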