Ansible Windows Update - Fails Unless Interactive Login Performed


Long time/first time...Using the official AWS Windows 2019 AMI ("ami-0229f7666f517b31e" in "us-east-1"), we spin up a new instance and perform a few basic tasks (using the "user_data" option) via PowerShell:

  • Start logging
Start-Transcript -Path "c:\user-data.txt"
  • Change local "Administrator" password:
$admin = [adsi]("WinNT://./administrator, user")
$admin.PSBase.Invoke("SetPassword", "${password}")
  • Enable/Configure WinRM for Ansible:
$WinRM = Invoke-WebRequest -Proxy http://"${proxy}" -UseBasicParsing https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1 | Select-Object -ExpandProperty Content
Invoke-Expression $WinRM
  • Enable RDP:
Set-ItemProperty -Path "HKLM:\System\CurrentControlSet\Control\Terminal Server" -name "fDenyTSConnections" -value 0
Enable-NetFirewallRule -DisplayGroup "Remote Desktop"
  • Configure hostname:
Rename-Computer -NewName "${name}" -Force
  • Stop logging:
Stop-Transcript
  • Reboot Computer (So hostname change is applied):
Restart-Computer

Pretty basic stuff really, nothing we can see being an issue here, and we have verified each task is working as expected.

At this point we would run an Ansible playbook that does a few things (Configure services, join domain, install chocolatey, etc.) but the task that always fails is the Windows Update part.

In the troubleshooting process, we've removed everything from the Ansible playbook other than calling Windows Update using the ansible.windows collection. So at this point, the instance is deployed, the basic PowerShell tasks are run via "user_data", and then Ansible connects and tries to run Windows Update.

The Ansible task itself is pretty basic as well:

- name: os - perform windows updates
  win_updates:
    state: installed
    category_names:
    - Application
    - Connectors
    - CriticalUpdates
    - DefinitionUpdates
    - DeveloperKits
    - FeaturePacks
    - Guidance
    - SecurityUpdates
    - ServicePacks
    - Tools
    - UpdateRollups
    log_path: c:\wu-install.log
    reboot: yes
    reboot_timeout: 600

So this is where it gets weird/annoying...When we run that we're seeing this response:

Failed to search for updates: Exception from HRESULT: 0x80240438

That happens EVERY time we try it using our desired workflow...However, what we've found is that if we log in once interactively before the Windows Update task is performed, it works! EVERY time!!!

My last job was 99.9% Linux, so I'm a little rusty with this, but I know there are crazy/undocumented things that happen upon first (interactive) login. And I've tried a ton of different things (disabled IPv6, ran wsreset.exe, deleted Windows Store registry entries, restarted Windows Update services, etc.) based on researching that error code...But the only thing I can see is that as long as I log in interactively once BEFORE the Windows Update attempt runs, everything is fine.

But obviously we don't want to do that ;)

Again, to be clear:

  • Deploy Instance (With "user_data" tweaks - which includes a reboot), run Ansible playbook that runs Windows Update task:
Failed to search for updates: Exception from HRESULT: 0x80240438
  • Deploy Instance (With "user_data" tweaks - which includes a reboot), log in interactively (Using RDP), run Ansible playbook that runs Windows Update task:
Security Intelligence Update for Microsoft Defender Antivirus - KB2267602 (Version 1.329.1737.0)

NOTE: That's just the current response, as the AMI is only missing that update...So could/should change in the future

Also, this is not including any domain joining or anything...Although I noticed a similar pattern there. If I used the full playbook, as long as I logged in interactively using a domain user prior to the Windows Update task running, it worked.

Should mention that when I log in interactively, I'm doing absolutely nothing...Just log in via RDP, and then I don't accept any prompts or click anything.

Obviously the goal is to make it so there is no manual login needed, as that defeats the main purpose of this project. So my questions are:

  • Does anyone know WTF is happening?!? Is there a way to simulate/replicate whatever changes an interactive login performs? I've scoured logs but am just not seeing anything obvious.
  • Assuming I can't fix it the "right" way, any ideas on how to use Ansible to perform an interactive login? I'm thinking have a task that interactively logs in, then another task that logs it out...Messy, but can't keep wasting time on this. A "user_data" tweak would work too.

Willing to try anything at this point, about the only thing I can say is we can't change the AMI being used (Other than updating it to the latest release), and the end result of the "user_data" process needs to be achieved (But can be done other ways if that's the issue). There is a proxy involved, but I don't see how that is a factor since the source of the problem seems to be interactively logging in.

amazon-web-services
powershell
ansible
windows-update
windows-server-2019
asked on Server Fault Jan 6, 2021 by Lee Wiscovitch

2 Answers


I have had this issue before. Not sure about a fix, but a workaround was to configure auto login for the Ansible user, reboot the instance, run the update, and then remove the auto login.

It's a very bad workaround. Mine was part of a Packer image creation, so I could just add the auto login and then remove it after the updates.
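
For anyone wanting to try the same sequence from a playbook, here is a minimal sketch of those steps as Ansible tasks. It is not the exact setup described above, just one way it could look. It assumes the community.windows collection is installed alongside ansible.windows, and that "ansible_user" and "ansible_password" are the connection variables already defined in the inventory. The shortened category list is only for illustration.

```yaml
# Sketch only: temporary auto logon wrapped around the update run.
# Assumption: community.windows and ansible.windows collections are installed.
# Assumption: ansible_user / ansible_password are set in the inventory.
- name: os - enable auto logon for the connecting user
  community.windows.win_auto_logon:
    username: "{{ ansible_user }}"
    password: "{{ ansible_password }}"

- name: os - reboot so an interactive session is created at startup
  ansible.windows.win_reboot:

- name: os - perform windows updates
  ansible.windows.win_updates:
    state: installed
    category_names:
      - CriticalUpdates
      - SecurityUpdates
    log_path: c:\wu-install.log
    reboot: yes
    reboot_timeout: 600

- name: os - remove auto logon again
  community.windows.win_auto_logon:
    state: absent
```

Note that the reboot task returns once WinRM responds again, which may be slightly before the auto logon session has finished starting, so a short pause before the update task might be needed. The stored credentials stay on the machine until the last task runs, so it should not be skipped on failure.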

answered on Server Fault Feb 16, 2021 by Windowsbegone

User contributions licensed under CC BY-SA 3.0