I've got a strange problem relating to basic Windows Scheduled Tasks that has baffled me for a few weeks now. These jobs fail to run on some servers, but work on others which are running on different hardware/VM platforms. Initially this was a problem we spotted deep within one of our production systems, but I have managed to simplify it down so that it reproduces with minimal changes from the 'out of box' configuration. I've actually created a 5 line batch file that makes those minimal changes to a clean installation, so that each test machine is identical.
The hardware that works
We have built VMs on VMware, XenServer, and on physical HP servers and other models of Dell server (Poweredge R730, R720, R430). Most of these use spinning 10k/15k SAS disks, though one of the HP servers I built using SATA SSDs in RAID 10 as a test.
The hardware that doesn't work
Our new servers have problems however. These are new Dell Poweredge R540s. They have built in BOSS RAID 1 controllers (M.2 RAID 1 SSDs basically), with SAS SSDs as additional fast storage via a PERC controller.
On the older hardware, you can see the scheduled task running if you manually trigger it, though obviously you don't see notepad actually open if it's running as a different user.
On the Poweredge R540s however the task fails to start, giving error code 2147943726 (0x8007052e). I believe this is an 'unknown username or bad password' error, despite the credentials being correct, and the user account having been freshly created.
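As a quick sanity check on that reading of the error code, here is a minimal Python sketch that splits the value into its standard HRESULT fields. The function name `decode_hresult` is just an illustrative helper, not part of any Windows tooling.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its severity, facility and code fields."""
    value &= 0xFFFFFFFF  # treat the input as an unsigned 32-bit number
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),        # bit 31: severity (1 = failure)
        "facility": (value >> 16) & 0x7FF,   # bits 16-26: facility
        "code": value & 0xFFFF,              # bits 0-15: error code
    }


result = decode_hresult(2147943726)
print(result)
```

Facility 7 is FACILITY_WIN32, and Win32 error 1326 is ERROR_LOGON_FAILURE ("The user name or password is incorrect"), which matches what Task Scheduler reports.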
The task fails to run manually, and the following security event is audited in the Security Event Log:
Log Name: Security
Source: Microsoft-Windows-Security-Auditing
Date: 12/10/2018 17:30:00
Event ID: 4625
Task Category: Logon
Level: Information
Keywords: Audit Failure
User: N/A
Computer: <computername>
Description:
An account failed to log on.

Subject:
    Security ID: SYSTEM
    Account Name: <computername>$
    Account Domain: <domainname>
    Logon ID: 0x3E7

Logon Type: 4

Account For Which Logon Failed:
    Security ID: NULL SID
    Account Name: @@CyBAAA.....<this is a long Base 62 ID, so I've removed it in case it contains sensitive information>
    Account Domain:

Failure Information:
    Failure Reason: Unknown user name or bad password.
    Status: 0xC000006D
    Sub Status: 0xC0000064

Process Information:
    Caller Process ID: 0x590
    Caller Process Name: C:\Windows\System32\svchost.exe

Network Information:
    Workstation Name: <computername>
    Source Network Address: -
    Source Port: -

Detailed Authentication Information:
    Logon Process: Advapi
    Authentication Package: Negotiate
    Transited Services: -
    Package Name (NTLM only): -
    Key Length: 0
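The status codes and logon type in that event can be read off with a small lookup. The sketch below covers only a handful of well-known values (it is deliberately not a complete table), and `explain_4625` is a hypothetical helper name.

```python
# Partial lookup tables: only a few common values, not an exhaustive list.
NTSTATUS = {
    0xC000006D: "STATUS_LOGON_FAILURE (generic logon failure)",
    0xC0000064: "STATUS_NO_SUCH_USER (the account name does not exist)",
    0xC000006A: "STATUS_WRONG_PASSWORD (account exists, password is wrong)",
}

LOGON_TYPE = {
    2: "Interactive",
    3: "Network",
    4: "Batch (used by scheduled tasks)",
    5: "Service",
}


def explain_4625(status: int, sub_status: int, logon_type: int) -> dict:
    """Translate the numeric fields of a 4625 event into readable text."""
    return {
        "status": NTSTATUS.get(status, f"unknown 0x{status:08X}"),
        "sub_status": NTSTATUS.get(sub_status, f"unknown 0x{sub_status:08X}"),
        "logon_type": LOGON_TYPE.get(logon_type, f"unknown {logon_type}"),
    }


for field, meaning in explain_4625(0xC000006D, 0xC0000064, 4).items():
    print(f"{field}: {meaning}")
```

Note that the sub status here is "no such user" rather than "wrong password". That fits the mangled "@@CyBAAA....." account name in the event: the logon appears to be attempted with something other than the real user name, so the password itself may never be checked.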
Yesterday, I rebuilt 2x R540s, 2x VMs and 1x HP server, and only the R540s had this fault. This is reproducible: we've tried re-imaging all our test machines and each time the result is the same.
Other relevant findings
I can make the R540s work correctly if they are built directly into the 'computers' OU on AD, meaning they don't get our Security Baseline policy. The tasks install and run perfectly. If I then move the computer object into the same OU as all the rest of the machines we're testing with, the tasks stop working. Moving the object back out of the OU into the computers OU does not make the tasks work again. Clearly something is being changed, but I can't see what, and I don't know why it would only impact R540s in this way and not VMs or other models of hardware.
We have exported and compared the local security policy on working and non-working machines to check for differences. Those which exist are minor, and when the working policy set is imported to a broken machine, it stays broken. Similarly, importing the policy set from a broken machine to a working machine does not break the working machine.
If I change the scheduled task so that 'Do not store Password' is ticked, the task does run, but this won't work for us in production as the task needs access to non-local resources.
If I change the scheduled task to run in the context of my domain admin account (while giving myself 'logon as batch' and 'logon as service' rights), it works even with 'Do not store password' unticked. So whatever is breaking, it seems to be related only to local user accounts.
Other things I have tried which made no difference:
I think our group policy must be changing something that stops this from working, probably the security baseline in some way. What I'm at a complete loss to explain, however, is why this only seems to break scheduled tasks on one particular hardware model, which should have an identical configuration to working machines created at the same time, in the same way. I can't see what would cause that.
Does anyone know if scheduled tasks or basic security authentication use hardware features in any way? Does the TPM have anything to do with this? Is there anything else I can do to trace back what is causing the user account to fail at the point the task runs? I've run out of ideas.
Also, in case anyone asks, I have tried doing a 'runas' from a command line to prove the local user account I've created works, and that the password I've used is correct.
The issue appears to have been caused by Device Guard/Credential Guard being enabled via our security baseline policy. Device Guard is only set to run if Secure Boot is enabled in the BIOS, meaning we didn't see it on VMs or older servers which didn't support Secure Boot to begin with.
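For anyone wanting to confirm whether a given machine is in this state, the `SecurityServicesRunning` property of the `Win32_DeviceGuard` WMI class lists which virtualization-based security services are actually running. Below is a minimal Python sketch that interprets those values. It assumes you have already fetched the list yourself (for example from PowerShell, as in the comment), and the helper names are my own.

```python
# Assumption: the values come from something like this PowerShell query,
# run on the server being checked:
#   Get-CimInstance -ClassName Win32_DeviceGuard `
#       -Namespace root\Microsoft\Windows\DeviceGuard
# and reading its SecurityServicesRunning property.

# Only the two long-standing documented values are named here;
# newer Windows builds may report additional ones.
SERVICE_NAMES = {
    1: "Credential Guard",
    2: "Hypervisor-enforced Code Integrity (HVCI)",
}


def describe_running_services(values):
    """Return readable names for the non-zero SecurityServicesRunning values."""
    return [
        SERVICE_NAMES.get(v, f"unrecognised value {v}")
        for v in values
        if v != 0  # 0 means no services running
    ]


def credential_guard_running(values):
    """True if Credential Guard (value 1) is among the running services."""
    return 1 in values


# Hypothetical readings: one from an affected server, one from a VM.
for label, reading in (("R540", [1, 2]), ("VM", [0])):
    print(label, describe_running_services(reading),
          "Credential Guard running:", credential_guard_running(reading))
```

If the R540s report 1 in that list and the VMs or older servers report only 0, that would be consistent with the Secure Boot dependency described above.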
There's an article showing the exact same issue, here:
As mentioned in the article above, the solution is to have the task run as the System user if that will suffice, or to disable Device Guard if that is not an option.
We have tested with Device Guard disabled, and once we recreated our scheduled task, it ran successfully with Secure Boot still enabled in the BIOS.
Now that we know why it wasn't working, we can read up on Device Guard/Credential Guard to see how scheduled tasks should be set up with these enabled, and what the best practice is going forward.
Thanks to Greg Askew for his input which ultimately led to this discovery!