How are pointers resolved across heaps in C++


I am calling a C++ DLL from my C++ application using LoadLibrary(). The application fails at random with 0xc0000005 (access violation) errors. I have read a lot about DLLs having their own heaps and the problems that causes.

Things I've made sure to do so far:

In the DLL:

  1. All allocations use C++ allocation (new), with no malloc or calloc.
  2. Every new has a reachable matching delete.
  3. No memory allocated inside the DLL is freed in the host exe, or vice versa.
  4. Data is transferred between the two as POD (char* specifically). No STL types.
  5. All exported functions use the __stdcall calling convention.
  6. The DLL is built with extern "C" and a DEF file.

In the Host Exe:

  1. Memory is allocated with HeapAlloc() on the heap returned by GetProcessHeap().
  2. The pointer is passed to the DLL, which copies bytes into it using memcpy().
  3. The DLL function typedefs are correct.
  4. The DLL and the exe are built with the same compiler (VS2010).
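
The buffer hand-off described in the two lists above can be sketched roughly as follows. This is a minimal, portable illustration and not the actual code: FillAppName and HostCall are hypothetical names, and a std::vector stands in for the HeapAlloc() buffer so the sketch compiles without the Windows headers. The point it shows is that the callee only writes into caller-owned memory and checks the capacity before calling memcpy().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an exported DLL function: it writes into a
// buffer owned by the caller and never allocates or frees on its behalf.
// Returns the number of bytes written, or 0 if the buffer is too small.
std::size_t FillAppName(char* dst, std::size_t capacity)
{
    static const char kName[] = "ExampleApp";   // sizeof includes the trailing '\0'
    if (dst == nullptr || capacity < sizeof(kName)) {
        return 0;                               // refuse to overrun the caller's buffer
    }
    std::memcpy(dst, kName, sizeof(kName));
    return sizeof(kName);
}

// Host side: owns the buffer for its whole lifetime.
// The vector is an assumption standing in for HeapAlloc(GetProcessHeap(), ...).
bool HostCall(std::size_t capacity)
{
    std::vector<char> buffer(capacity);
    return FillAppName(buffer.data(), buffer.size()) != 0;
}
```

Passing the capacity alongside the pointer is the design choice that matters here: a memcpy() that writes past the end of the host's buffer corrupts memory silently and tends to crash somewhere unrelated later.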

The crashes occur at random locations:

  1. While debugging, I observed that the exception occurs just as we step over the closing brace "}" of the function in the DLL.
  2. After the DLL call has returned successfully, a crash occurs at a random point.

All the event logs show the DLL as the "Faulting module name".

Given all of the points above, I would appreciate guidance on where to look for the cause of the exception.

Also, does the pointer I send to the DLL get resolved to the correct heap in memcpy()? The data is correct in the host exe, though. GetProcessHeaps() returns 4 heaps.

EDIT: I cannot post the full code due to policy. (Again, note that I have accounted for most of the common mistakes.)

Function where the error occurs (DLL)

extern "C"  void __stdcall BuildApplicationsList();

Typedef in exe

typedef void(__stdcall *buildAppsList)(void);

UPDATE

In response to @RalfFriedl: you were right! The program crashes at this location.

}

5822593F  mov         byte ptr [esp+7A0h],7  
58225947  cmp         dword ptr [esp+0A0h],0  
5822594F  jne         BuildApplicationsList+1CE2h (58225992h)  
58225951  mov         eax,dword ptr [esp+74h]  
58225955  test        eax,eax  
58225957  je          BuildApplicationsList+1CB1h (58225961h)  
58225959  mov         ecx,dword ptr [eax]  
5822595B  mov         edx,dword ptr [ecx+8]    // Crash Occurs here. 
5822595E  push        eax  
5822595F  call        edx  
58225961  mov         eax,dword ptr [esp+70h]  
58225965  test        eax,eax  
58225967  je          BuildApplicationsList+1CC1h (58225971h)  
58225969  mov         ecx,dword ptr [eax]  
5822596B  mov         edx,dword ptr [ecx+8]  
5822596E  push        eax  
5822596F  call        edx  
58225971  mov         eax,dword ptr [esp+6Ch]  
58225975  test        eax,eax  
58225977  je          BuildApplicationsList+1CD1h (58225981h)  
58225979  mov         ecx,dword ptr [eax]  
5822597B  mov         edx,dword ptr [ecx+8]  
5822597E  push        eax  
5822597F  call        edx  
58225981  call        dword ptr [__imp__CoUninitialize@0 (5823F2C8h)]  

edx and ecx are 0, and obviously reading address 0x00000008 is a violation. Where to next?

c++
memory
dll
asked on Stack Overflow Aug 17, 2018 by Suraj S • edited Aug 17, 2018 by Suraj S

1 Answer


When you step over the end of the function, the destructors of its local variables are called. Switch to debugging at the assembly level and find out which destructor causes the problem.

Your issue is probably not related to the separation between the main program and the DLL, but is simply a reference to an invalidated pointer, which could occur anyway.
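
As a small illustration of what happens at the closing brace, here is a sketch (Tracer and RunScope are made-up names for this example) that records the order in which the destructors of local objects run. Nothing in it is specific to DLLs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records its name when destroyed so the order can be inspected afterwards.
struct Tracer {
    std::string name;
    std::vector<std::string>* log;
    Tracer(const std::string& n, std::vector<std::string>* l) : name(n), log(l) {}
    ~Tracer() { log->push_back(name); }
};

std::vector<std::string> RunScope()
{
    std::vector<std::string> log;
    {
        Tracer first("first", &log);
        Tracer second("second", &log);
        Tracer third("third", &log);
    }   // all three destructors run here, in reverse order of construction
    return log;
}
```

The debugger attributes all of that destructor code to the single "}" line, which is why a fault inside one destructor looks like a crash on the brace.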

Edit

58225951  mov         eax,dword ptr [esp+74h]  
58225955  test        eax,eax  
58225957  je          BuildApplicationsList+1CB1h (58225961h)  
58225959  mov         ecx,dword ptr [eax]  
5822595B  mov         edx,dword ptr [ecx+8]    // Crash Occurs here. 

At esp+74h you have a local variable that seems to be a pointer to a class with virtual functions. The value of this pointer is nonzero. Because destructors are not called for the target of a plain pointer, you probably have a class that encapsulates a pointer and calls delete on it from its own destructor. The problem is probably that the target object has already been destroyed and its memory freed.
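
To make the pattern concrete, here is a sketch of such a wrapper. It assumes the wrapper releases a reference-counted object through its third virtual function, which would match the call through [ecx+8] in a 32-bit build, but that is a guess from the disassembly. IRefCounted, Counted, Holder and ReleaseTwice are hypothetical names, not the real classes involved.

```cpp
#include <cassert>

// Hypothetical reference-counted interface with three virtual functions.
struct IRefCounted {
    virtual void Query() = 0;
    virtual void AddRef() = 0;
    virtual void Release() = 0;            // third virtual function
    virtual ~IRefCounted() {}
};

struct Counted : IRefCounted {
    int refs;
    int* destroyed;                        // incremented when the object deletes itself
    explicit Counted(int* d) : refs(1), destroyed(d) {}
    void Query() override {}
    void AddRef() override { ++refs; }
    void Release() override {
        if (--refs == 0) { ++*destroyed; delete this; }
    }
};

// Wrapper of the kind described above: its destructor tests the pointer
// and, if it is non-null, makes a virtual call through it.
class Holder {
    IRefCounted* p_;
    Holder(const Holder&);                 // non-copyable in this sketch
    Holder& operator=(const Holder&);
public:
    explicit Holder(IRefCounted* p) : p_(p) {}
    ~Holder() { Reset(); }
    void Reset() {
        if (p_) {
            p_->Release();
            p_ = 0;                        // without this, the destructor would release again
        }
    }
};

int ReleaseTwice()
{
    int destroyed = 0;
    {
        Holder h(new Counted(&destroyed));
        h.Reset();                         // explicit early release
    }                                      // destructor sees a null pointer and does nothing
    return destroyed;                      // the object was destroyed exactly once
}
```

If the object is destroyed by some other path while the wrapper still holds a non-null pointer, the destructor passes the null test and then reads the vtable pointer from freed memory, which is the sequence the disassembly shows.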

Just before stepping over the end of the function, find the value of esp+74h, that is, the address esp+74h itself, not the value stored at that address. Check the addresses of all local variables. One of them must equal esp+74h. That variable's destructor is the one that causes the problem.

You could also try disabling inlining (the /Ob0 compiler option in Visual C++). The debugger will then probably stop at the correct place.
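
The distinction between the address of a variable and the value stored in it can be sketched like this (Widget and AddressDiffersFromValue are illustrative names only). In the debugger, esp+74h corresponds to the first number printed, not the second.

```cpp
#include <cassert>
#include <cstdio>

struct Widget { int id; };   // hypothetical pointee

// Prints where a local pointer variable lives and what it holds,
// then checks that the two are different things.
bool AddressDiffersFromValue()
{
    Widget target = { 42 };
    Widget* local = &target;

    void* where = static_cast<void*>(&local);   // address OF the variable (the stack slot)
    void* what  = static_cast<void*>(local);    // value stored IN the variable

    std::printf("&local = %p, local = %p\n", where, what);

    // Reading the stack slot yields the stored pointer value.
    return where != what && *static_cast<Widget**>(where) == local;
}
```

In a watch window the same comparison is usually done by evaluating &variable for each local and matching the result against the computed stack address.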

answered on Stack Overflow Aug 17, 2018 by RalfFriedl • edited Aug 18, 2018 by RalfFriedl

User contributions licensed under CC BY-SA 3.0