One of "my" DL165 G7 ProLiants has rebooted out of the blue for the second time this month. The reboot was accompanied by these system event log entries in LightsOut:
Event Type Date Time Source Description Direction
OEM -- -- -- 00 00 00 00 01 02 00 00 00 00 00 00 00 --
Generic 07/19/2013 16:40:38 NMI Detect State Asserted Assertion
Generic 07/19/2013 16:40:42 Gen ID 0x41 Run-time Stop Assertion
OEM 07/19/2013 16:40:42 000137 01 80 00 00 00 01 --
OEM 07/19/2013 16:40:42 000137 02 54 44 4f 00 01 --
OEM 07/19/2013 16:40:42 000137 02 00 00 00 00 01 --
OEM 07/19/2013 16:40:42 000137 03 00 00 00 00 01 --
OEM 07/19/2013 16:40:42 000137 03 00 00 00 00 01 --
OEM 07/19/2013 16:40:42 000137 04 00 00 00 00 01 --
OEM 07/19/2013 16:40:42 000137 04 00 00 00 00 01 --
OEM 07/19/2013 16:40:42 000137 05 00 00 00 00 01 --
OEM 07/19/2013 16:40:42 000137 05 00 00 00 00 01 --
Generic 07/19/2013 16:43:54 Gen ID 0x41 C: boot completed Assertion
OEM 07/19/2013 16:43:54 000137 00 b4 6c e9 51 00 --
I have contacted HP support to get help decoding the events, but unfortunately without any notable success - I have been told that there is no accessible documentation available. What is it trying to tell me and how do I find out what is broken here?
Edit: the system is running Hyper-V 2012. The only useful event concerning the reset is Kernel-Power/41 with a BugcheckCode of 128 / 0x00000080 and a BugcheckParameter1 of 0x4f4454, which match the first two OEM lines logged at 16:40:42 in the iLO event log (once the data bytes are read in little-endian order, at least). The bugcheck code led me to this MSDN article, which bluntly states that "the exact cause is difficult to determine".
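To make the byte swap explicit, here is a minimal Python sketch of how I matched the two OEM lines to the bugcheck values. The record layout is only my guess from the log above (one leading index byte, then a 32-bit little-endian value, then one trailing byte I ignore); it is not documented by HP, and `decode_oem_record` is just a helper name I made up.

```python
import struct


def decode_oem_record(data_hex):
    """Decode the six data bytes of one OEM log line.

    Assumed layout (a guess, not HP documentation):
      byte 0     record index
      bytes 1-4  32-bit value, little-endian
      byte 5     trailing byte, ignored here
    """
    raw = bytes.fromhex(data_hex)  # spaces between byte pairs are accepted
    if len(raw) != 6:
        raise ValueError("expected 6 data bytes, got %d" % len(raw))
    index = raw[0]
    (value,) = struct.unpack_from("<I", raw, 1)  # "<I" = little-endian uint32
    return index, value


# Data bytes copied from the first two OEM lines stamped 16:40:42.
for line in ("01 80 00 00 00 01", "02 54 44 4f 00 01"):
    index, value = decode_oem_record(line)
    print("record %d: 0x%08x (%d)" % (index, value, value))
```

Record 1 decodes to 0x00000080 (128), the BugcheckCode, and record 2 decodes to 0x004f4454, the BugcheckParameter1 from the Kernel-Power/41 event.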
In the HP support center, I found a seemingly similar problem description whose solution was to synchronize the clocks between cluster nodes. While my failing host does indeed run in a cluster, I have the clocks synchronized, and I cannot reproduce the issue when I let the clocks drift apart (apart from the obvious Kerberos authentication problems, nothing much happens if I desync the clocks).
The odd information I have been able to collect on the issue so far:
I had a similar problem with an HP ProLiant G380 G6 running Windows 2008 R2. Digging into the support and help forums got me nowhere. I eventually used the HP Smart Update Manager DVD to install all the latest updates on the server, and a year and a half has passed with no errors so far.
It might be a long shot, but try to use the latest updates, here's the latest HP SUM DVD
If you try to run that on a 2012 server, you might get an error saying that it is not compatible. According to HP, that is normal and you can simply ignore the error.
Hope this helps.