Random 0x0000007B BSOD, Windows Server 2008 R2 on HP Server

1

The server at my company has been acting strangely for as long as I have known it. Since it is a production server, we rarely do a complete shutdown or restart, but when we do, it sometimes hits a BSOD several times in a row before it finally boots back into Windows (nothing is changed in between, just normal resets).

I expected to get a dump file after each BSOD, but strangely enough I never got one. I have checked the startup and recovery configuration in the advanced settings many times to make sure it is set to create a dump file, but I still haven't got one so far.
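For reference, the dump settings shown in that dialog are stored in the registry under `HKLM\SYSTEM\CurrentControlSet\Control\CrashControl`, so they can be double-checked outside the GUI. The following is only a minimal sketch: the helper names `describe_dump_setting` and `read_crash_control` are made up for illustration, and the registry read only works when run on Windows.

```python
import sys

# Meaning of the CrashDumpEnabled value on Windows Server 2008 R2.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
}


def describe_dump_setting(value):
    """Translate a CrashDumpEnabled value into a readable description."""
    return DUMP_TYPES.get(value, f"unknown ({value})")


def read_crash_control():
    """Read the dump type and dump file path from the registry (Windows only)."""
    import winreg  # imported here so the module still loads on other platforms

    key_path = r"SYSTEM\CurrentControlSet\Control\CrashControl"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        enabled, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        dump_file, _ = winreg.QueryValueEx(key, "DumpFile")
    return enabled, dump_file


if __name__ == "__main__" and sys.platform == "win32":
    enabled, dump_file = read_crash_control()
    print("Dump type:", describe_dump_setting(enabled))
    print("Dump file:", dump_file)
```

A value of 0 would mean that no dump is configured at all, regardless of what the dialog appears to show.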

The error at the BSOD is specifically like this:

0x0000007B (0xFFFFF880009A9928, 0xFFFFFFFFC0000034, 0x0000000000000000, 0x0000000000000000)

and it is running Windows Server 2008 R2 Enterprise on an HP ProLiant DL120 G6 server.

I have installed the latest Windows updates, checked for hardware and configuration issues, and even got support from HP, who said it must be an OS error.

From some googling around, some people say that it's a filter driver error (because the second parameter ends in 0x34), so I tried removing all the filter driver instances, with no luck.
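To make the reading of the stop code explicit, here is a small sketch that splits the line above into the bug check code and its four parameters. It assumes the second parameter is a sign-extended NTSTATUS value (which is what the "0x34" interpretation implies), and the two lookup tables only contain the values relevant to this crash.

```python
import re

# Only the entries needed for this particular stop line.
BUGCHECKS = {0x7B: "INACCESSIBLE_BOOT_DEVICE"}
NTSTATUS = {0xC0000034: "STATUS_OBJECT_NAME_NOT_FOUND"}


def parse_stop_line(line):
    """Return (bug check code, [parameters]) from a STOP line."""
    values = [int(v, 16) for v in re.findall(r"0x[0-9A-Fa-f]+", line)]
    return values[0], values[1:]


def decode(line):
    """Name the bug check and the status carried in parameter 2."""
    code, params = parse_stop_line(line)
    # Assumption: parameter 2 is an NTSTATUS sign-extended to 64 bits,
    # so only the low 32 bits are meaningful.
    status = params[1] & 0xFFFFFFFF
    return {
        "bugcheck": BUGCHECKS.get(code, hex(code)),
        "status": NTSTATUS.get(status, hex(status)),
    }


line = ("0x0000007B (0xFFFFF880009A9928, 0xFFFFFFFFC0000034, "
        "0x0000000000000000, 0x0000000000000000)")
print(decode(line))
```

Under that assumption the stop is INACCESSIBLE_BOOT_DEVICE with status STATUS_OBJECT_NAME_NOT_FOUND, meaning something needed to reach the boot volume could not be found during startup.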

Any ideas how I could fix this or at least troubleshoot it?

Update: I forgot to mention that entering safe mode (any kind of safe mode) also triggers the BSOD, so it's not an option.

windows-server-2008-r2
hp
hp-proliant
drivers
bsod
asked on Server Fault Jun 27, 2014 by Syakur Rahman • edited Jun 27, 2014 by Syakur Rahman

2 Answers

1

I would look at the dump files and see if there is an obvious way to identify a driver issue.

http://blogs.technet.com/b/askcore/archive/2008/11/01/how-to-debug-kernel-mode-blue-screen-crashes-for-beginners.aspx#3476888

http://blogs.technet.com/b/juanand/archive/2011/03/20/analyzing-a-crash-dump-aka-bsod.aspx

These steps sometimes give an obvious answer quite quickly. If not, I would not spend much time looking further with this method, because that needs very specialised knowledge. Microsoft support would be able to pursue the investigation.

answered on Server Fault Jun 27, 2014 by John Auld
1

This is likely a firmware issue with the server hardware.

Many organizations and systems administrators don't take the time to update and maintain the firmware of their HP ProLiant servers. It requires a different mindset than a Dell or Supermicro system, which is less tightly integrated.

You have an HP ProLiant DL160 G6 server, so that places the deployment date to 2008-2010, when that server and processor architecture was in wide use. A quick check of the firmware revisions and release notes shows the September 2011 update:

Problems Fixed:

Resolved an issue that may result in any of the following conditions: operating system stops responding, unexpected system reset, Blue Screen when using a Microsoft Windows operating system, kernel panic when using a Linux operating system, or Purple Screen when using VMware ESX. A message may be displayed by the operating system or logged in the Event Log when this issue occurs indicating an "Uncorrectable Machine Check Exception." However, there are instances where the system resets before the operating system displays an error message and instances where the Event Log contains no log entry when this issue occurs. This issue does not occur if the Intel C-State tech is configured to "disabled" or the C State package limit setting is set to "C1" or "C3". The system is susceptible to this issue in the default Intel C-State tech and C State package limit setting configurations.

Sounds like your problem, doesn't it?

The best approach to updating all of the firmware and components in your system (iLO, NIC, RAID, BIOS, etc.) is to download the bootable HP Service Pack for ProLiant DVD image and allow it to update everything on the server.

answered on Server Fault Jul 1, 2014 by ewwhite

User contributions licensed under CC BY-SA 3.0