What steps are required to debug a memory.dmp? (walkthrough included)


I woke up to this in my event log today:

The computer has rebooted from a bugcheck.  
The bugcheck was: 0x000000ef (0xffffe0018668f080, 0x0000000000000000, 0x0000000000000000, 0x0000000000000000). 
A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: 082615-29515-01.

I'm using this MSFT article as a guide on how to debug it.

  1. First I search for the meaning of 0x000000ef, which is CRITICAL_PROCESS_DIED.

  2. I try Visual Studio, as the article suggests, but get the error "debugging older format crash dumps is not supported".

  3. I install WDK 8.1 for the 2012 R2 server running Exchange.

  4. I open WinDbg, located in: C:\Program Files (x86)\Windows Kits\8.1\Debuggers\x64

  5. I set the symbol path to srv*c:\cache*http://msdl.microsoft.com/download/symbols (the exact commands are recapped after the output below).

  6. I open the .dmp file and get this output:

Output

Executable search path is: 
Windows 8 Kernel Version 9600 MP (32 procs) Free x64
Product: Server, suite: TerminalServer SingleUserTS
Built by: 9600.17936.amd64fre.winblue_ltsb.150715-0840
Machine Name:
Kernel base = 0xfffff801`c307c000 PsLoadedModuleList = 0xfffff801`c33517b0
Debug session time: Wed Aug 26 08:58:08.719 2015 (UTC - 4:00)
System Uptime: 0 days 8:12:03.493
Loading Kernel Symbols
...............................................................
................................................................
...................
Loading User Symbols
................................................................
................................................................
..............................................
Loading unloaded module list
.....
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck EF, {ffffe0018668f080, 0, 0, 0}

*** WARNING: Unable to verify checksum for System.ni.dll
Probably caused by : wininit.exe

Followup: MachineOwner

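For reference, steps 4-6 boil down to roughly the following session. The windbg.exe path is from my WDK 8.1 install and c:\cache is just a local folder I picked for downloaded symbols; adjust both as needed (File > Open Crash Dump does the same as -z).

C:\> "C:\Program Files (x86)\Windows Kits\8.1\Debuggers\x64\windbg.exe" -z C:\Windows\MEMORY.DMP
kd> .sympath srv*c:\cache*http://msdl.microsoft.com/download/symbols
kd> .reload

.reload makes the debugger pull matching symbols through the path just set, so the next command has what it needs.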
Then I type !analyze -v and get:

23: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

CRITICAL_PROCESS_DIED (ef)
        A critical system process died
Arguments:
Arg1: ffffe0018668f080, Process object or thread object
Arg2: 0000000000000000, If this is 0, a process died. If this is 1, a thread died.
Arg3: 0000000000000000
Arg4: 0000000000000000

Debugging Details:
------------------


PROCESS_OBJECT: ffffe0018668f080

IMAGE_NAME:  wininit.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  0

MODULE_NAME: wininit

FAULTING_MODULE: 0000000000000000 

PROCESS_NAME:  msexchangerepl

BUGCHECK_STR:  0xEF_msexchangerepl

DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT

CURRENT_IRQL:  0

ANALYSIS_VERSION: 6.3.9600.17237 (debuggers(dbg).140716-0327) amd64fre

MANAGED_STACK: !dumpstack -EE
OS Thread Id: 0x0 (23)
TEB information is not available so a stack size of 0xFFFF is assumed
Current frame: 
Child-SP         RetAddr          Caller, Callee

LAST_CONTROL_TRANSFER:  from fffff801c368e160 to fffff801c31cb9a0

STACK_TEXT:  
**privacy** : nt!KeBugCheckEx
**privacy** : nt!PspCatchCriticalBreak+0xa4
**privacy** : nt! ?? ::NNGAKEGL::`string'+**privacy**
**privacy** : nt!PspTerminateProcess+0xe5
**privacy** : nt!NtTerminateProcess+0x9e
**privacy** : nt!KiSystemServiceCopyEnd+0x13
**privacy** : ntdll!NtTerminateProcess+0xa
**privacy**: KERNELBASE!TerminateProcess+0x25
**privacy** : System_ni+**privacy**


STACK_COMMAND:  kb

FOLLOWUP_NAME:  MachineOwner

IMAGE_VERSION:  

FAILURE_BUCKET_ID:  0xEF_msexchangerepl_IMAGE_wininit.exe

BUCKET_ID:  0xEF_msexchangerepl_IMAGE_wininit.exe

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:0xef_msexchangerepl_image_wininit.exe

FAILURE_ID_HASH:  {9cb4f9d6-5f45-6583-d4ab-0dae45299dee}

Followup: MachineOwner
---------

Question

  1. Should I run this on the Exchange Server itself?
  2. Did !analyze get the Exchange symbols from the public MSFT symbol server?
  3. Did !analyze figure out the meaning of 0xffffe0018668f080? Is that a memory address of a failing process? How do I locate that process?
  4. Was it necessary for me to redact the lines marked **privacy** before posting? I didn't recognize their contents.
  5. Does Visual Studio ever work in opening memory dumps?
  6. What should I have done differently in analyzing this?
  7. What should I do next?
exchange-2013 · debugging · dump · bsod · windbg
asked on Server Fault Aug 26, 2015 by halfbit

1 Answer

  1. No. Dumps can be analyzed offline (just like you did).
  2. Yes, assuming you've got the symbol server setting correct.
  3. Yes, that is the address of the process object (EPROCESS) of the failing process, and the process name is given in your analysis (Arg1 is listed as PROCESS_OBJECT in the output). You can inspect it with !process 0xffffe0018668f080 in WinDbg; see the sketch at the end of this answer. The image name and process name are confusing to me, though: the Exchange process has crashed the wininit process, but I would not expect both names in the same analysis. Perhaps someone with more knowledge can chime in to clear up (my) misunderstanding of things.
  4. I have no idea where that comes from. Never seen it before.
  5. Also no idea.
  6. Nothing, AFAIK. You have done all the steps needed to find the culprit.
  7. Use your favorite search engine to try to find similar events. Searching on msexchangerepl and wininit turns up the following possibly relevant link: Exchange and BugChecks. Exchange apparently crashes wininit intentionally when writing to the event log fails for a long period of time.

The hung IO detection feature in Exchange 2010 is designed to make recovery from hung IO or a hung controller fast, rather than re-trying or waiting until the storage stack raises an error that causes failover. It’s a great addition to the set of high availability features built into Exchange 2010.
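Regarding point 3, here is a rough sketch of how to poke at that address yourself in the same WinDbg session. The address is Arg1 from the bugcheck; the 7 is just a detail flag asking !process for a verbose listing.

kd> !process ffffe0018668f080 7
kd> .process /r /p ffffe0018668f080
kd> !peb

!process dumps the EPROCESS (image name, threads, and so on), while .process switches the debugger's context to that process so that !peb can then show its user-mode PEB.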

answered on Server Fault Aug 27, 2015 by Lieven Keersmaekers

User contributions licensed under CC BY-SA 3.0