Our EC2 instance (Windows Server 2008) has crashed multiple times over the past 3 months (the last time was today at 1:05 EST). Upon reviewing the MEMORY.DMP file, we noticed that a possible cause of the crashes is rhelnet.sys (RedHat PV NIC Driver).
Server's Event Viewer has the following records right after the crash:
Critical - Kernel Power:
The system has rebooted without cleanly shutting down first.
This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
BugCheck:
The computer has rebooted from a bugcheck. The bugcheck was:
0x000000d1 (0x000000000000002d, 0x0000000000000002, 0x0000000000000000, 0xfffff88001402d14).
A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: 100113-35849-01.
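For anyone reading the BugCheck record above: bug check 0xD1 is DRIVER_IRQL_NOT_LESS_OR_EQUAL, and its four parameters have fixed meanings in Microsoft's bug check reference. Below is a minimal Python sketch that pulls the code and parameters out of the event message and labels them. The helper name `parse_bugcheck` and the `D1_PARAMS` table are my own illustration, not part of any Windows tool, and the labels are paraphrased from Microsoft's documentation.

```python
import re

# Parameter meanings for bug check 0xD1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL),
# paraphrased from Microsoft's bug check reference.
D1_PARAMS = (
    "memory address referenced",
    "IRQL at time of reference",
    "access type (0 = read, 1 = write)",
    "address of the instruction that referenced the memory",
)


def parse_bugcheck(message):
    """Return (code, [param1..param4]) as integers from an event message.

    Hypothetical helper: expects text shaped like
    '0x000000d1 (0x..., 0x..., 0x..., 0x...)'.
    """
    match = re.search(r"(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", message)
    if match is None:
        raise ValueError("no bugcheck code found")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


# Message text copied from the Event Viewer record quoted in the question.
message = (
    "The bugcheck was: 0x000000d1 (0x000000000000002d, 0x0000000000000002, "
    "0x0000000000000000, 0xfffff88001402d14)."
)
code, params = parse_bugcheck(message)
print(f"bug check 0x{code:X}")
for label, value in zip(D1_PARAMS, params):
    print(f"  {label}: 0x{value:x}")
```

Read this way, the record describes a read (third parameter 0) of address 0x2d at IRQL 2 (DISPATCH_LEVEL) by code at 0xfffff88001402d14. An invalid, near-null address accessed from kernel code at raised IRQL usually points at a driver bug rather than failing hardware, which fits the dump analysis naming rhelnet.sys.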
Could this be a hardware issue? Would it help if we stop and start the instance? Or is it more likely that this is caused by the software running on the system?
[Update 10.01.2013]
An Amazon rep suggested replacing the RedHat drivers with Citrix PV drivers on our instance.
[Update 10.08.2013]
We performed a driver upgrade on a cloned instance. Right after the upgrade, we noticed the following errors in Event Viewer:
Xennet6 errors in Event Viewer (Event ID# 5001)
After digging a bit more, I found this article, which suggests installing the latest Citrix drivers. Unfortunately, this didn't help us at all, and our cloned instance became unresponsive.
[Update 10.08.2013 2]
I recreated the instance and updated the PV drivers again. After searching the Internet, I found this article, where an Amazon rep explains that:
"Event ID 5001 from source Xennet6 cannot be found" message does not
indicate anything wrong, just that the PV driver is looking for a feature
that we have not implemented in our version of Xen.
I will keep my test system running for a while to see if there are any issues with it.
Upgrading the drivers as suggested by the Amazon rep fixed the issue.
Regarding the Event ID 5001 issue, below is the reply I got from Amazon:
Please ignore the Xennet 5001 error. This error occurs on every instance
that is launched with Citrix PV drivers and is due to the driver looking
for a feature that is not supported on EC2. It will have no other effect on the instance.
I had the same issue.
However, AWS Support answered me as below. They are not sure that the issue comes from the Citrix PV drivers.
Currently, we are unable to root cause the issue.
In my personal opinion, this might be a one-time only occurrence,
but as you are running Citrix PV Drivers, I highly encourage you to upgrade.
As the Citrix drivers show up in the logs,
they might had been related to the issue.