.NET application crash with 0x80131506 error (ExecutionEngineException)


Our .NET app crashes randomly with an ExecutionEngineException. The app targets .NET Framework 4.8 (x64).

After intensive reproduction attempts, I have collected the following facts.

  1. We used WinDbg Preview with the Time Travel Debugging feature to capture the execution history.
  2. We analyzed a few crashes of the app captured with ADPlus, with a breakpoint set at:

clr!EEPolicy::HandleFatalError

  3. The app runs with these environment variables set:
COMPlus_HeapVerify = 1
COMPlus_GCStress = 16
  4. As an additional stress test, we added a periodic LOH compaction (once every 10 seconds):
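// Requires: using System; using System.Runtime;
// First pass: request a one-time LOH compaction, then force a full GC.
// blocking: false lets the runtime perform a background GC where possible.
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, false);
GC.WaitForPendingFinalizers();

// Second pass: request compaction again and force a blocking full GC.
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, true);
GC.WaitForPendingFinalizers();

// Explicitly restore the default compaction mode.
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.Default;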
  5. The crash call stack looks like this in all cases:
000000d4`997b6718 00007ff8`9d82a290 clr!EEPolicy::HandleFatalError+0x0
000000d4`997b6720 00007ff8`9d9fa1dc clr!VerifyObjectAndAge+0x8c
000000d4`997b6750 00007ff8`9d9f1f1c clr!GCToEEInterface::WalkAsyncPinned+0x83
000000d4`997b6790 00007ff8`9d9fa07e clr!BlockVerifyAgeMapForBlocksWorker+0x76
000000d4`997b67d0 00007ff8`9d9f9fe5 clr!BlockVerifyAgeMapForBlocks+0x45
000000d4`997b6800 00007ff8`9d3cedcf clr!TableScanHandles+0x1ff
000000d4`997b68d0 00007ff8`9d9f2507 clr!HndVerifyTable+0xcb
000000d4`997b6970 00007ff8`9d94b2d4 clr!Ref_VerifyHandleTable+0xb4
000000d4`997b6a00 00007ff8`9d94a57f clr!WKS::gc_heap::verify_heap+0x7f9
000000d4`997b6b20 00007ff8`9d760f7e clr!WKS::gc_heap::garbage_collect+0x37231e
000000d4`997b6b60 00007ff8`9d3f0c37 clr!WKS::GCHeap::GarbageCollectGeneration+0xef
000000d4`997b6bb0 00007ff8`9d53ae61 clr!WKS::GCHeap::GarbageCollect+0x91
000000d4`997b6c00 00007ff8`9d536d2d clr!GCInterface::Collect+0x6a
000000d4`997b6c90 00007ff8`9c7d39db mscorlib_ni+0xdb39db
000000d4`997b6d40 00007ff8`76fcfa6d Company_App_Processor_ni+0x50fa6d
000000d4`997bc8a0 00007ff8`3de6d8c9 unknown!unknown+0x0
000000d4`997bc950 00007ff8`76f300bb Company_App_Processor_ni+0x4700bb
000000d4`997bc9e0 00007ff8`7718edec Company_App_Processor_ni+0x6cedec
000000d4`997bcf90 00007ff8`3de6d8c9 unknown!unknown+0x0
000000d4`997bd040 00007ff8`7718a3e0 Company_App_Processor_ni+0x6ca3e0
000000d4`997bd0d0 00007ff8`77189483 Company_App_Processor_ni+0x6c9483
000000d4`997bd170 00007ff8`9d396923 clr!CallDescrWorkerInternal+0x83
000000d4`997bd1b0 00007ff8`9d396838 clr!CallDescrWorkerWithHandler+0x4e
000000d4`997bd1f0 00007ff8`9d4d654c clr!CallDescrWithObjectArray+0x705
000000d4`997bd460 00007ff8`9d4d6050 clr!CStackBuilderSink::PrivateProcessMessage+0x26d
000000d4`997bd900 00007ff8`9bf626d3 mscorlib_ni+0x5426d3
000000d4`997bd9b0 00007ff8`9bf6243d mscorlib_ni+0x54243d
000000d4`997bda00 00007ff8`9bf6226a mscorlib_ni+0x54226a
000000d4`997bda70 00007ff8`9bf61bc2 mscorlib_ni+0x541bc2
000000d4`997bdaf0 00007ff8`9d396923 clr!CallDescrWorkerInternal+0x83
000000d4`997bdb30 00007ff8`9d396838 clr!CallDescrWorkerWithHandler+0x4e
000000d4`997bdb70 00007ff8`9d419067 clr!DispatchCallDebuggerWrapper+0x1f
000000d4`997bdbd0 00007ff8`9d419035 clr!DispatchCallSimple+0x93
000000d4`997bdc70 00007ff8`9d4d4d6c clr!ThreadNative::InternalCrossContextCallback+0x34c
000000d4`997be040 00007ff8`9bf61dfc mscorlib_ni+0x541dfc
000000d4`997be0b0 00007ff8`9bf6c06a mscorlib_ni+0x54c06a
000000d4`997be110 00007ff8`9bf60eee mscorlib_ni+0x540eee
000000d4`997be160 00007ff8`9bf6b95b mscorlib_ni+0x54b95b
000000d4`997be1c0 00007ff8`9d396923 clr!CallDescrWorkerInternal+0x83
000000d4`997be200 00007ff8`9d396838 clr!CallDescrWorkerWithHandler+0x4e
000000d4`997be240 00007ff8`9d419067 clr!DispatchCallDebuggerWrapper+0x1f
000000d4`997be2a0 00007ff8`9d419035 clr!DispatchCallSimple+0x93
000000d4`997be340 00007ff8`9d4d4d6c clr!ThreadNative::InternalCrossContextCallback+0x34c
000000d4`997be710 00007ff8`9bf60d48 mscorlib_ni+0x540d48
000000d4`997be770 00007ff8`9bf60764 mscorlib_ni+0x540764
000000d4`997be7f0 00007ff8`9bf60644 mscorlib_ni+0x540644
000000d4`997be860 00007ff8`9bf6015a mscorlib_ni+0x54015a
000000d4`997be920 00007ff8`9bf5fcef mscorlib_ni+0x53fcef
000000d4`997be9e0 00007ff8`9d394c12 clr!CTPMethodTable__CallTargetHelper3+0x12
000000d4`997bea10 00007ff8`9d4a9cac clr!CallTargetWorker2+0x85
000000d4`997bea70 00007ff8`9d4d5016 clr!TransparentProxyStubWorker+0x2a3b6
000000d4`997bec70 00007ff8`9d394b55 clr!TransparentProxyStub_CrossContext+0x55
000000d4`997bed30 00007ff8`3de91e43 unknown!noop+0x0
000000d4`997bf130 00007ff8`9bf79bd1 mscorlib_ni+0x559bd1
000000d4`997bf170 00007ff8`9bf78e46 mscorlib_ni+0x558e46
000000d4`997bf210 00007ff8`9d396923 clr!CallDescrWorkerInternal+0x83
000000d4`997bf250 00007ff8`9d396838 clr!CallDescrWorkerWithHandler+0x4e
000000d4`997bf290 00007ff8`9d3970e8 clr!MethodDescCallSite::CallTargetWorker+0x102
000000d4`997bf390 00007ff8`9d39c10a clr!QueueUserWorkItemManagedCallback+0x2a
000000d4`997bf480 00007ff8`9d397ce0 clr!ManagedThreadBase_DispatchInner+0x40
000000d4`997bf4c0 00007ff8`9d397c53 clr!ManagedThreadBase_DispatchMiddle+0x6c
000000d4`997bf5c0 00007ff8`9d397b92 clr!ManagedThreadBase_DispatchOuter+0x4c
000000d4`997bf630 00007ff8`9d397d77 clr!ManagedThreadBase_FullTransitionWithAD+0x2f
000000d4`997bf690 00007ff8`9d39c057 clr!ManagedPerAppDomainTPCount::DispatchWorkItem+0xa4
000000d4`997bf810 00007ff8`9d3978a7 clr!ThreadpoolMgr::ExecuteWorkRequest+0x64
000000d4`997bf840 00007ff8`9d39777f clr!ThreadpoolMgr::WorkerThreadStart+0xf6
000000d4`997bf8e0 00007ff8`9d39b5c5 clr!Thread::intermediateThreadProc+0x8b
000000d4`997bfda0 00007ff8`ac4b13d2 kernel32!BaseThreadInitThunk+0x22
000000d4`997bfdd0 00007ff8`ae7354f4 ntdll!RtlUserThreadStart+0x34
  6. After analyzing the crashes, we observed that the app crashes during object age verification.
  7. We investigated the available CLR VM and SSCLI sources (handletablescan.cpp).
  8. The application uses a WCF ServiceHost with a named pipes channel.
  9. It crashes for the following reason:
    • Age verification kicks in (VerifyObjectAndAge) and finds an OverlappedData object in gen 2. It also appears that the current 'clump' age is 2 (per the BlockVerifyAgeMapForBlocksWorker source), and that value is used as minAge.
    • It then reads the m_userObject field, which holds an object[] that is set in System.ServiceModel.Channels.OverlappedContext..ctor() by the call: this.nativeOverlapped = this.overlapped.UnsafePack(completeCallback, this.bufferHolder);
    • That object[] is initialized as: this.bufferHolder = new object[] { dummyBuffer };
    • dummyBuffer is declared in OverlappedContext as: private static byte[] dummyBuffer;
    • dummyBuffer is allocated on the ephemeral segment of the CLR heap.
    • Then the call: GCHeap::GetGCHeap()->WhichGeneration(obj);
      returns 0, because 'dummyBuffer' is not on GC generation 0, 1 or 2.
    • VerifyObjectAndAge calls EEPOLICY_HANDLE_FATAL_ERROR(COR_E_EXECUTIONENGINE) because minAge (taken from the clump) is 2, while the age reported for 'dummyBuffer' is 0.
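
To make the failing comparison concrete, here is a minimal model of the age check as described in the steps above. It is only a sketch: the class name AgeCheckModel, the method IsFatal and the exact form of the condition are assumptions based on our reading of the sources, not a copy of the CLR code.

```csharp
using System;

// Hypothetical model of the age verification described above.
// It is NOT the CLR implementation; it only mirrors the comparison
// between the clump age (minAge) and the generation reported for
// the referenced object (thisAge).
public static class AgeCheckModel
{
    // minAge:        age recorded for the handle clump
    // thisAge:       generation reported for the referenced object
    // maxGeneration: oldest generation (2 in our case)
    // Returns true when the model would treat the state as fatal.
    public static bool IsFatal(int minAge, int thisAge, int maxGeneration)
    {
        // Assumption: the state is fatal when the clump claims the object
        // is at least minAge generations old, but the object is reported
        // as younger and is not yet in the oldest generation.
        return minAge > thisAge && thisAge < maxGeneration;
    }
}
```

With the values from our dumps, AgeCheckModel.IsFatal(2, 0, 2) is true, which matches the crash. If the object were reported in gen 2, IsFatal(2, 2, 2) would be false and verification would pass.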

Question: Why is this happening, and how can we address it?

Side notes: the app uses a new AppDomain to load its modules. The crash happens more frequently when the app unloads and reloads the AppDomain that hosts the worker logic. However, we also see crashes without any domain reload operations. Also, the VerifyObjectAndAge logic should have run many times before the failure, yet it fails only after some time.
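
As a cross-check from managed code, the generation the GC currently reports for an object can be logged with GC.GetGeneration. The helper below is a minimal sketch: GenerationProbe and Describe are hypothetical names, and whether the managed result always matches what the native verifier sees during a GC is an assumption we have not confirmed.

```csharp
using System;

// Hypothetical diagnostic helper: reports the generation the GC
// currently assigns to an object, for logging next to a crash repro.
public static class GenerationProbe
{
    public static string Describe(string label, object obj)
    {
        if (obj == null) throw new ArgumentNullException(nameof(obj));

        // GC.GetGeneration returns the current generation number of the object.
        int generation = GC.GetGeneration(obj);
        return label + ": gen " + generation + " of max " + GC.MaxGeneration;
    }
}
```

For example, Console.WriteLine(GenerationProbe.Describe("buffer", someBuffer)); could be called before and after each forced collection to see when the reported generation changes.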

Thank you so much for any ideas.

.net
wcf
clr
executionengineexception
asked on Stack Overflow Apr 26, 2020 by LAT • edited Apr 26, 2020 by LAT



User contributions licensed under CC BY-SA 3.0