Inverse Heisenbug - Unit test fails only when debugger is attached

4

I recently fixed a defect in our product, the symptom of which was an access violation caused by accessing a dangling pointer.

As good practice, I added a unit test to ensure that the bug doesn't come back. Whenever I write such a test, I back out the defect fix and confirm that the test fails; otherwise I know it isn't doing its job properly.

After backing out the defect fix, I discovered that my unit test still passed (not good). When I attached a debugger to the unit test to see why it passed, the test failed (i.e. an exception was thrown), and I could break and observe that the call stack matched the one in the original defect that I had fixed.

I didn't modify the "Break on exception" settings in Visual Studio 2005, and this is indeed a critical Win32 exception which causes the test harness to terminate (i.e. there is no graceful exception handler).

The text of the exception is:

Unhandled exception at 0x0040fc59 in _testcase.exe: 0xC0000005:
Access violation reading location 0xcdcdcdcd.

Note: The location isn't always 0xcdcdcdcd (the pattern the MSVC debug heap writes into allocated but uninitialized memory). Sometimes it is 0x00000000, and sometimes it is another address.

This seems like the inverse of a traditional Heisenbug, where a problem goes away when observing it via a debugger. In my case, observing it via the debugger makes the problem appear!

My initial thought was that this was a race condition exposed by the timing differences in the debugger. However, when I added tracing to the code and ran it outside the debugger, the trace output indicated that the application should be aborting in the same manner as when running under the debugger. But it is not!

Any suggestions as to what could be causing this?


Update: I am narrowing in on the cause of this problem. See this question for more details. Will update this question with the answer if I find it.

c++
unit-testing
debugging
debug-build
asked on Stack Overflow Nov 25, 2010 by LeopardSkinPillBoxHat • edited Aug 17, 2019 by Joshua

4 Answers

3

Generally, a VC++ debug build fills heap memory with known values: one pattern for newly allocated memory and another for memory you have deleted. It's been quite a while since I've used Visual Studio, but 0xcdcdcdcd looks like one of these fill values (the debug CRT uses 0xCD for allocated but uninitialized memory and 0xDD for freed memory). It seems most likely to me that the application is crashing properly when running in the debugger. In Release mode the runtime doesn't spend time overwriting deallocated memory, so some of the time you get "lucky" and the data stored in that memory is still valid.

You can modify your build setting to turn on the option for filling deallocated memory with a known value in Release mode (don't forget to turn it off again when you're done). I'd guess if you did this your application would crash in Release mode.

I appreciate that the value isn't always 0xcdcdcdcd, which may mean I'm wrong or may mean you have more than one path to a dangling pointer.
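To illustrate why overwriting freed memory makes a stale read fail reliably, here is a minimal sketch. `PoisoningPool` and `byte_seen_after_release` are hypothetical names invented for this example, not part of the CRT or of the asker's code. The pool keeps ownership of its storage, so inspecting a released slot is well-defined here, unlike reading genuinely freed heap memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical one-slot pool. The storage outlives every "allocation",
// so looking at a released slot is well-defined in this sketch.
class PoisoningPool {
public:
    static constexpr std::size_t kSlotSize = 16;
    // 0xDD is the byte the MSVC debug CRT writes into freed blocks.
    static constexpr unsigned char kFreedFill = 0xDD;

    unsigned char* acquire() {
        assert(!in_use_);
        in_use_ = true;
        return slot_;
    }

    // With poison == true the old contents are overwritten on release,
    // which is roughly what a debug heap does.
    void release(unsigned char* p, bool poison) {
        assert(p == slot_ && in_use_);
        if (poison) {
            std::memset(slot_, kFreedFill, kSlotSize);
        }
        in_use_ = false;
    }

    const unsigned char* raw() const { return slot_; }

private:
    unsigned char slot_[kSlotSize] = {};
    bool in_use_ = false;
};

// Writes a marker, releases the slot, and returns the first byte that a
// stale reader would see afterwards.
unsigned char byte_seen_after_release(bool poison) {
    PoisoningPool pool;
    unsigned char* p = pool.acquire();
    std::memset(p, 0x42, PoisoningPool::kSlotSize);
    pool.release(p, poison);
    return pool.raw()[0];
}
```

Without poisoning, the stale reader still sees the old marker byte and the bug stays hidden; with poisoning it sees the fill byte, which is the situation in which a dangling pointer dereference tends to fail loudly.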

answered on Stack Overflow Nov 25, 2010 by Adam Milligan
2

I ran into this years ago in reverse: The problem was only occurring when the debugger was not attached.

It turned out the code was corrupting the stack-frame of the previous method activation and using the debugger introduced an intermediate stack-frame.

You possibly have a similar situation.

answered on Stack Overflow Nov 25, 2010 by Adrian Pronk
0

I don't know if this will help you any, but I once ran into a bug that manifested differently depending on whether the program was started under the Visual Studio debugger or was started externally and then had the debugger attached.

answered on Stack Overflow Nov 25, 2010 by Hasturkun
0

I have isolated the cause of this problem - see this question for details.

When running my test harness under the debugger, the memory consumed by the debugging environment meant that successive allocations of the same object (each following a deallocation) always landed in different parts of memory. This meant that when my test harness tried to access a dangling pointer, it crashed the test (technically this is undefined behaviour, but this is test code and it seems to do what I need it to do).

When running my test harness from the command line, successive allocations of the same object always re-used the same block of memory. This coincidental behaviour meant that when I accessed what was in actuality a dangling pointer in my test case, the dangling pointer happened to point to a valid object. That's why I didn't see a crash.
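The effect of block reuse can be sketched with a toy allocator. Everything below (`SlotPool`, `Reuse`, `stale_handle_looks_valid`) is hypothetical and only models the behaviour described above; it is not how the Win32 heap is implemented. Handles are slot indices into storage the pool owns, so the stale lookup is well-defined in this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>

struct Widget {
    int value = 0;
    bool alive = false;
};

// Which free slot the pool hands out next (an assumption of this model).
enum class Reuse { MostRecentlyFreed, LeastRecentlyFreed };

class SlotPool {
public:
    explicit SlotPool(Reuse policy) : policy_(policy) {
        for (std::size_t i = 0; i < kSlots; ++i) free_.push_back(i);
    }

    // Constructs a Widget in a free slot and returns its index as a handle.
    std::size_t create(int value) {
        assert(!free_.empty());
        std::size_t idx;
        if (policy_ == Reuse::MostRecentlyFreed) {
            idx = free_.back();
            free_.pop_back();
        } else {
            idx = free_.front();
            free_.pop_front();
        }
        slots_[idx].value = value;
        slots_[idx].alive = true;
        return idx;
    }

    // Marks the slot dead and returns it to the free list.
    void destroy(std::size_t idx) {
        slots_[idx].alive = false;
        free_.push_back(idx);
    }

    const Widget& at(std::size_t idx) const { return slots_[idx]; }

private:
    static constexpr std::size_t kSlots = 4;
    Reuse policy_;
    std::array<Widget, kSlots> slots_{};
    std::deque<std::size_t> free_;
};

// Creates an object, destroys it, creates a replacement, then checks
// whether the stale handle appears to refer to a live object.
bool stale_handle_looks_valid(Reuse policy) {
    SlotPool pool(policy);
    std::size_t stale = pool.create(1);
    pool.destroy(stale);
    pool.create(2);  // may or may not land in the slot just freed
    return pool.at(stale).alive;
}
```

With `Reuse::MostRecentlyFreed` the replacement lands in the slot that was just freed, so the stale handle seems to point at a valid object, as in the command-line run. With `Reuse::LeastRecentlyFreed` the replacement lands elsewhere and the stale slot stays dead, which corresponds to the run under the debugger.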

answered on Stack Overflow Nov 25, 2010 by LeopardSkinPillBoxHat • edited May 23, 2017 by Community

User contributions licensed under CC BY-SA 3.0