WCF Service hard crashing Windows 2016


I have a web service running in IIS 10 on a Windows Server 2016 instance that is a virtual machine under a hypervisor. A separate scheduled task calls functions of that web service during off-peak times to retrieve status updates from a third-party system. The scheduled task breaks the items that need status updates into small batches and calls a function that retrieves and updates the records in parallel via Tasks, returning once all Tasks have completed.

Sometimes (roughly every third run), during this scheduled task, the app pool that the service is running on hangs. Log4Net stops logging, requests to the service get no response, and the IIS logs for the service stop recording requests. No errors are recorded in either my logs or the Windows event logs. When this happens, the app pool stays in this state indefinitely. If I recycle the app pool, the service responds normally for about 30 seconds, and then the server does a hard restart.

After the restart the event logs show the below error:

The computer has rebooted from a bugcheck. The bugcheck was: 0x00000139 (0x0000000000000003, 0xffffd60019506680, 0xffffd600195065d8, 0x0000000000000000).

The .dmp file that is generated shows the same error code and identifies the faulting file as ntoskrnl.exe.
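For anyone reading the bugcheck line above, here is a minimal sketch of how it can be decoded. It is written in Python purely for illustration and is not part of the service. The function names `parse_bugcheck` and `describe` are hypothetical, and the lookup tables hold only the single code and parameter value seen here, taken from Microsoft's bug check reference (0x139 is KERNEL_SECURITY_CHECK_FAILURE, and a first parameter of 3 indicates a corrupted LIST_ENTRY).

```python
import re

# Deliberately tiny lookup tables: only the values that appear in this
# question are filled in. Anything else is reported as not listed.
BUGCHECK_NAMES = {0x139: "KERNEL_SECURITY_CHECK_FAILURE"}
PARAM1_0X139 = {3: "a LIST_ENTRY was corrupted (for example, a double remove)"}

# The event log message quoted in the question.
EVENT_TEXT = (
    "The computer has rebooted from a bugcheck. The bugcheck was: "
    "0x00000139 (0x0000000000000003, 0xffffd60019506680, "
    "0xffffd600195065d8, 0x0000000000000000)."
)


def parse_bugcheck(message):
    """Extract the bug check code and its parameters from the event text."""
    match = re.search(r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", message)
    if match is None:
        raise ValueError("no bugcheck code found in message")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


def describe(message):
    """Return a short human-readable summary of the bug check."""
    code, params = parse_bugcheck(message)
    name = BUGCHECK_NAMES.get(code, "not listed in this sketch")
    lines = [f"Bug check 0x{code:X}: {name}"]
    if code == 0x139 and params:
        reason = PARAM1_0X139.get(params[0], "not listed in this sketch")
        lines.append(f"Parameter 1 = {params[0]}: {reason}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe(EVENT_TEXT))
```

A corrupted kernel list entry generally points to a kernel-mode component (a driver) rather than user-mode service code, which is why ntoskrnl.exe being named in the dump says little about the real culprit.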

All drivers are fully up to date. I have made sure all tasks and requests have timeouts. I have increased server resources past the point where that could be the cause. I have adjusted the batch size of items being processed.

I am out of troubleshooting ideas and would appreciate any help I can get.

c#
vmware
windows-server-2016
iis-10
asked on Stack Overflow Jun 2, 2017 by user942620

1 Answer


I figured I would close this out in case anyone else has this very specific issue.

Digging through the dump, BHDRVX64.SYS (Symantec Antivirus) was on the stack immediately before the crash.

Four days later, Symantec pushed an update (https://support.symantec.com/en_US/article.INFO4367.html) with a fix for the issue.

If you hit a similar issue, start by uninstalling your antivirus and checking whether the problem persists. After that, work through the list of kernel-level filter drivers shown by the 'fltmc' command in an elevated (administrator) command prompt.

answered on Stack Overflow Jun 9, 2017 by user942620

User contributions licensed under CC BY-SA 3.0