We have quite a large and complex MVC3 project running in .NET 4.0 in Azure.
The symptom we are experiencing is that the site becomes unresponsive and then crashes. When we go to the management portal, all our instances are in the 'Stopped' state.
From what I understand, this is IIS Rapid Fail Protection kicking in: it kills the app pool and does NOT restart it.
I have used Debug Diagnostic Tool to capture a memory dump for the instance of IIS in my Cloud Service and every time it crashes, the last messages are:
[4/2/2014 1:41:52 AM] First chance exception - 0X000006B5 caused by thread with System ID: 2856. DetailID = 3
Script Error
Error Code - 0x800A01CE
Error Source [Microsoft VBScript runtime error]
Error Description [The remote server machine does not exist or is unavailable: 'ServiceState']
Line 104, Column 2
Or
[4/2/2014 12:25:52 AM] First chance exception - 0XE06D7363 caused by thread with System ID: 3292
Script Error
Error Code - 0x80070013
Error Source [Unavailable]
Error Description [Unavailable]
Line 1103, Column 4
Also, the number of these exceptions is very close to, if not the same as, the Maximum Failures value defined for my application pool.
Things I have tried:
To me it seems that some exceptions are not getting caught and are crashing the IIS worker process, and once the crash count hits 5 (the Maximum Failures value in my app pool) the app pool is stopped.
If anyone could shed any light on this or suggest something else I can try, I would be most grateful.
You can also configure Rapid Fail Protection in a startup task, with something like this in a PowerShell script:
($env:windir + "\system32\inetsrv\appcmd.exe set config /section:system.applicationHost/applicationPools /applicationPoolDefaults.failure.rapidFailProtectionInterval:'00:03:00' /commit:apphost") | Invoke-Expression
($env:windir + "\system32\inetsrv\appcmd.exe set config /section:system.applicationHost/applicationPools /applicationPoolDefaults.failure.rapidFailProtectionMaxCrashes:'15' /commit:apphost") | Invoke-Expression
The error you are getting (0x80070013) is typically defined as "The media is write protected." (although a custom component could be throwing that HRESULT for something completely different). Collecting DebugDiag dumps is the right approach, but instead of taking dumps only when the process crashes, configure DebugDiag to write a dump on first chance exceptions of type 0XE06D7363. That gives you a dump at the moment your application throws the error, and then it should be a simple matter of opening the dump in WinDbg and dumping the call stack.
Depending on how often your app crashes, you may also want to run Process Monitor (procmon) to see which resource you are accessing that might be throwing a "The media is write protected." error.
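As a rough illustration of how the codes in the log break down, here is a minimal Python sketch. The function names are my own and not part of DebugDiag or any Windows tool. It assumes the standard 32-bit HRESULT layout (failure bit, 11-bit facility, 16-bit code), and the common convention that 0xE06D7363 is the exception code used by the Visual C++ runtime, with the low three bytes spelling an ASCII tag.

```python
def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility and code fields."""
    hr &= 0xFFFFFFFF  # treat the value as unsigned 32-bit
    return {
        "failure": bool(hr >> 31),        # top bit set means failure
        "facility": (hr >> 16) & 0x7FF,   # 11-bit facility (7 = Win32)
        "code": hr & 0xFFFF,              # low 16 bits: the error code itself
    }


def exception_tag(code: int) -> str:
    """Return the low three bytes of an exception code as ASCII text."""
    return (code & 0x00FFFFFF).to_bytes(3, "big").decode("ascii")


if __name__ == "__main__":
    # The two script error codes from the log in the question.
    print(decode_hresult(0x80070013))
    print(decode_hresult(0x800A01CE))
    # The first chance exception code from the second log entry.
    print(exception_tag(0xE06D7363))
```

For 0x80070013 the facility is 7 (Win32) and the code is 19, which is the Win32 error for write-protected media. That is why the message above is only "typically" what the error means: any component can return the same value for its own reasons.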
Also note that you can use AzureTools to quickly get these different debugging tools onto the VM.