Starting my kernel driver (StartService) results in 0x124 bluescreen due to a machine check exception due to L2 cache bank 6 error on processor 0


Starting my driver manually with StartService (or by other means), or letting the system load it automatically when it is set to auto_start, causes a bluescreen before DriverEntry is reached.

Microsoft (R) Windows Debugger Version 10.0.19041.685 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Windows\Minidump\041421-18486-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available


************* Path validation summary **************
Response                         Time (ms)     Location
OK                                             c:\symbols

************* Path validation summary **************
Response                         Time (ms)     Location
Deferred                                       srv*c:\symbols*https://msdl.microsoft.com/download/symbols
Symbol search path is: srv*c:\symbols*https://msdl.microsoft.com/download/symbols
Executable search path is: c:\symbols
Windows 7 Kernel Version 7601 (Service Pack 1) MP (8 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 7601.23864.amd64fre.win7sp1_ldr.170707-0600
Machine Name:
Kernel base = 0xfffff800`0364a000 PsLoadedModuleList = 0xfffff800`0388c750
Debug session time: Wed Apr 14 10:45:37.450 2021 (UTC + 1:00)
System Uptime: 0 days 0:00:03.199
Loading Kernel Symbols
......................................................
Loading User Symbols
Mini Kernel Dump does not contain unloaded driver list
For analysis of this file, run !analyze -v
4: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: fffffa801a70b8f8, Address of the WHEA_ERROR_RECORD structure.
Arg3: 0000000000000000, High order 32-bits of the MCi_STATUS value.
Arg4: 0000000000000000, Low order 32-bits of the MCi_STATUS value.

Debugging Details:
------------------

fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock

KEY_VALUES_STRING: 1

    Key  : Analysis.CPU.Sec
    Value: 1

    Key  : Analysis.DebugAnalysisProvider.CPP
    Value: Create: 8007007e on QWERTYUIOP

    Key  : Analysis.DebugData
    Value: CreateObject

    Key  : Analysis.DebugModel
    Value: CreateObject

    Key  : Analysis.Elapsed.Sec
    Value: 1

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 74

    Key  : Analysis.System
    Value: CreateObject


BUGCHECK_CODE:  124

BUGCHECK_P1: 0

BUGCHECK_P2: fffffa801a70b8f8

BUGCHECK_P3: 0

BUGCHECK_P4: 0

CUSTOMER_CRASH_COUNT:  1

PROCESS_NAME:  System

STACK_TEXT:  
fffff880`03fa25b0 fffff800`03909cd9 : fffffa80`1a70b8d0 fffffa80`19aa2660 fffff8a0`00413b10 00000000`00000000 : nt!WheapCreateLiveTriageDump+0x6c
fffff880`03fa2ad0 fffff800`037e96d7 : fffffa80`1a70b8d0 fffff800`038632f8 fffffa80`19aa2660 00000000`00000202 : nt!WheapCreateTriageDumpFromPreviousSession+0x49
fffff880`03fa2b00 fffff800`03750fd5 : fffff800`038c5bc0 00000000`00000001 fffff8a0`00413a88 fffffa80`19aa2660 : nt!WheapProcessWorkQueueItem+0x57
fffff880`03fa2b40 fffff800`036c3c85 : fffff800`03ae6200 fffff800`03750fb0 fffffa80`19aa2600 00000000`00000000 : nt!WheapWorkQueueWorkerRoutine+0x25
fffff880`03fa2b70 fffff800`03955152 : 00000000`00000000 fffffa80`19aa2660 00000000`00000080 fffffa80`19a8db10 : nt!ExpWorkerThread+0x111
fffff880`03fa2c00 fffff800`036ab926 : fffff880`03d89180 fffffa80`19aa2660 fffff880`03d940c0 00000000`00000000 : nt!PspSystemThreadStartup+0x5a
fffff880`03fa2c40 00000000`00000000 : fffff880`03fa3000 fffff880`03f9d000 fffff880`04782720 00000000`00000000 : nt!KiStartSystemThread+0x16


MODULE_NAME: GenuineIntel

IMAGE_NAME:  GenuineIntel.sys

STACK_COMMAND:  .thread ; .cxr ; kb

FAILURE_BUCKET_ID:  X64_0x124_GenuineIntel_PROCESSOR_CACHE

OS_VERSION:  7.1.7601.23864

BUILDLAB_STR:  win7sp1_ldr

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 7

FAILURE_ID_HASH:  {270f58cb-a20a-a72d-6d81-eb8c82f01f7a}

Followup:     MachineOwner
---------

4: kd> !errrec fffffa801a70b8f8
===============================================================================
Common Platform Error Record @ fffffa801a70b8f8
-------------------------------------------------------------------------------
Record Id     : 01d73112eb4e0c21
Severity      : Fatal (1)
Length        : 928
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Creator       : Microsoft
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Notify Type   : Machine Check Exception
Timestamp     : 4/14/2021 9:45:37 (UTC)
Flags         : 0x00000002 PreviousError

fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
===============================================================================
Section 0     : Processor Generic
-------------------------------------------------------------------------------
Descriptor    @ fffffa801a70b978
Section       @ fffffa801a70ba50
Offset        : 344
Length        : 192
Flags         : 0x00000001 Primary
Severity      : Fatal
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock

fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Proc. Type    : x86/x64
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Instr. Set    : x64
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Error Type    : Cache error
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Operation     : Generic
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Flags         : 0x00
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Level         : 2
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
CPU Version   : 0x00000000000906e9
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Processor ID  : 0x0000000000000000

fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
===============================================================================
Section 1     : x86/x64 Processor Specific
-------------------------------------------------------------------------------
Descriptor    @ fffffa801a70b9c0
Section       @ fffffa801a70bb10
Offset        : 536
Length        : 128
Flags         : 0x00000000
Severity      : Fatal
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock

fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Local APIC Id : 0x0000000000000000
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
CPU Id        : e9 06 09 00 00 08 10 00 - bf fb fa 7f ff fb eb bf
                00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00
                00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

Proc. Info 0  @ fffffa801a70bb10
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock

fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
===============================================================================
Section 2     : x86/x64 MCA
-------------------------------------------------------------------------------
Descriptor    @ fffffa801a70ba08
Section       @ fffffa801a70bb90
Offset        : 664
Length        : 264
Flags         : 0x00000000
Severity      : Fatal
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock

fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
Error         : GCACHEL2_ERR_ERR (Proc 0 Bank 6)
  Status      : 0xee2000000040110a
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
  Address     : 0x00000000fef1ffc0
fffff800038370e8: Unable to get Flags value from nt!KdVersionBlock
  Misc.       : 0x0000007880010086
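
For reference, the Status value in the MCA section above can be decoded by hand. Below is a minimal Python sketch, assuming the architectural IA32_MCi_STATUS layout from Intel's Software Developer's Manual (flag bits 63..57, model-specific code in bits 31..16, MCA error code in bits 15..0) and the compound "cache hierarchy" error-code format. The function and table names are illustrative and are not part of WinDbg.

```python
# Decode an IA32_MCi_STATUS value such as the one printed by !errrec.
# Assumption: architectural bit layout only; model-specific fields stay raw.

FLAG_BITS = {
    "VAL": 63,    # register holds a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # MCi_MISC is valid
    "ADDRV": 58,  # MCi_ADDR is valid
    "PCC": 57,    # processor context corrupt
}

# Sub-fields of a cache hierarchy error code: 000F 0001 RRRR TTLL
TYPES = {0: "I", 1: "D", 2: "G"}              # TT: instruction, data, generic
LEVELS = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}  # LL: cache level
REQUESTS = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
            5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}


def decode_mci_status(status):
    """Return flags, raw codes and (for cache errors) the mnemonic."""
    flags = [name for name, bit in FLAG_BITS.items() if (status >> bit) & 1]
    mcacod = status & 0xFFFF          # architectural MCA error code
    mscod = (status >> 16) & 0xFFFF   # model-specific error code

    mnemonic = None
    # Ignore the F bit (bit 12), then match the cache hierarchy pattern.
    if (mcacod & 0xEF00) == 0x0100:
        request = REQUESTS.get((mcacod >> 4) & 0xF, "?")
        ttype = TYPES.get((mcacod >> 2) & 0x3, "?")
        level = LEVELS[mcacod & 0x3]
        mnemonic = f"{ttype}CACHE{level}_{request}_ERR"

    return {"flags": flags, "mscod": mscod, "mcacod": mcacod,
            "mnemonic": mnemonic}


if __name__ == "__main__":
    info = decode_mci_status(0xEE2000000040110A)
    print(info["flags"])
    print(hex(info["mscod"]), hex(info["mcacod"]), info["mnemonic"])
```

For 0xee2000000040110a this yields the mnemonic GCACHEL2_ERR_ERR, the same string !errrec prints, with the UC and PCC flags set.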

The driver was correctly installed using SetupOpenInfFile and SetupInstallFileW:

λ sc qc MyDriver
[SC] QueryServiceConfig SUCCESS

SERVICE_NAME: MyDriver
TYPE : 1 KERNEL_DRIVER
START_TYPE : 3 DEMAND_START
ERROR_CONTROL : 1 NORMAL
BINARY_PATH_NAME : \??\C:\Windows\system32\drivers\MyDriver.sys
LOAD_ORDER_GROUP :
TAG : 0
DISPLAY_NAME : MyDriver
DEPENDENCIES :
SERVICE_START_NAME :

I am using a KMSpico-activated version of Windows 7. Debug mode is on, test signing is off and driver signature enforcement is off. MyDriver.sys is a basic WDM Driver.c compiled using Visual Studio 2017.

I just can't imagine why I am getting this bluescreen. On my build of Windows 7 it is the only kind of bluescreen I have ever had: 30 minidumps across 4 years, all 0x124, and all needing a specific software trigger, since all other drivers start correctly. Furthermore, every single bluescreen I've had back to 2018 shows:

Error         : GCACHEL2_ERR_ERR (Proc 0 Bank 6)
  Status      : 0xee2000000040110a
  Address     : 0x00000000fef1ffc0
  Misc.       : 0x0000007880010086

But once again, my PC could run for 150 days without a problem, and it only happens when I start this specific driver.

This is similar but it did not help.

The minidump says the exception occurred at ntoskrnl.exe+4acfac, which is nt!WheapCreateLiveTriageDump+0x6c, the instruction mov [rsp+518h+Size], rdi. But this appears to be a thread created for a WHEA live dump in the event of a 0x124, and it does not show the call stack that caused the exception, with KeBugCheckEx at the top, as a normal bluescreen would. Accordingly, I do not actually see the bluescreen, because KeBugCheckEx is what displays it and it isn't called in this scenario.

Every 0x124 dump has nt!WheapCreateLiveTriageDump+0x6c at the top of the stack and reports it as the address that caused the exception. I think that is because it is the instruction after the call to RtlCaptureContext, which is what appears as a return address on the stack, so that is the context that gets captured. Usually, with a KeBugCheckEx bluescreen, the faulting IP shown is the one that caused the exception in the first place and sits somewhere down the call stack; since that is not part of this call stack, the dump just uses the return address of the RtlCaptureContext frame. It somehow knows that the faulting module is GenuineIntel.sys, unless it always shows that for 0x124. Clearly I do not think that is the module that caused the MCA.

Do not answer that I have faulty hardware -- I don't. Well, this is my CPU, and I have a different BIOS to the one stated, but I updated to a 2019 version, 1.9.0, and it still bluescreens with the same MCi_STATUS value. That shouldn't be relevant anyway, because the condition is created by software and only happens when I load that particular driver, so it should be resolvable without changing the hardware or the firmware.

Actually, funnily enough, I even got a 7E bugcheck once. That was the first time a blue screen was actually displayed, but the minidump still showed 0x124...

windows
drivers
bsod
kernel
bluescreenview
asked on Super User Apr 14, 2021 by Lewis Kelsey • edited Apr 17, 2021 by Lewis Kelsey

1 Answer


I just ran it in a virtual machine with the WinDbg kernel debugger connected to the virtual COM1 port via a named pipe, and it broke into the debugger in, you guessed it, MyDriver!__security_init_cookie+0x2d. Indeed, there are a bunch of cc breakpoints after the fail branch in the image. As I said, it wasn't reaching DriverEntry, so I was expecting the error to be in GsDriverEntry. I will get back to you on what's causing this function to fail. As for this error, it was caused by a breakpoint exception raised while no debugger was attached. More importantly, it is not a hardware issue.

This was the original code:

[Image: original __security_init_cookie, not preserved]

When you stare at it, it's pretty obvious that it's going to fail.

The fix was changing the .vcxproj file to Windows7 where it said Windows10.
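
To locate the setting before editing it, the project file can be scanned for any element whose value is Windows10. The following is a rough Python sketch. The sample project text, including the TargetVersion element name and the Condition string, is a hypothetical stand-in and not my actual project file; the helper only reports whatever element names it finds.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down project file used only for illustration.
SAMPLE_PROJECT = """
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
    <TargetVersion>Windows10</TargetVersion>
    <ConfigurationType>Driver</ConfigurationType>
  </PropertyGroup>
</Project>
"""


def elements_with_value(xml_text, value):
    """Return the names of all elements whose text equals `value`."""
    root = ET.fromstring(xml_text.strip())
    names = []
    for element in root.iter():
        if (element.text or "").strip() == value:
            # Drop the "{namespace}" prefix ElementTree adds to tags.
            names.append(element.tag.split("}")[-1])
    return names


if __name__ == "__main__":
    # Lists the elements that would need changing to Windows7.
    print(elements_with_value(SAMPLE_PROJECT, "Windows10"))
```

On a real project, the same function can be given the contents of the .vcxproj file read from disk. Every configuration that still reports Windows10 needs the same change.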

Now I have this __security_init_cookie:

[Image: __security_init_cookie after the change, not preserved]

answered on Super User Apr 15, 2021 by Lewis Kelsey • edited Apr 25, 2021 by Lewis Kelsey

User contributions licensed under CC BY-SA 3.0