SCCM PXE reboots after loading drivers - VM has IP address but can't ping gateway

The background

I'm trying to deploy a desktop image using an SCCM task sequence and PXE. The PXE server is a Server 2016 Hyper-V VM with the SCCM Site Server and DP roles installed. It's a remote branch server connected to the primary site server by a site-to-site VPN. This is a new site (i.e. an office site, not an SCCM site) which was brought online less than a month ago. When this SCCM server was first set up on site, imaging worked fine for both virtual and physical machines.

The primary site server SCCM version is 5.00.8692.1000 build 8692.

The SCCM DP is on the server VLAN, which has an SVI on the core switch. The client I am trying to image is a Hyper-V VM on the build VLAN, which also has an SVI on the same switch. There are no ACLs on the switch, and all traffic can route between the two VLANs without traversing the firewall, so it's a pretty basic setup. The site-to-site VPN is configured so that the whole subnet on each side (10.20.0.0/16 and 10.21.0.0/16) is marked as interesting traffic, and the VPN tunnel successfully establishes between these two subnets (as well as some other subnets which have legacy servers).

On the SCCM build VLAN, I have 3 IP Helpers - the first two are for the two DHCP servers, the third IP helper points to the PXE server - the SCCM DP.

The problem

Now this is where things get weird. I have two Windows 10 test VMs, one on each of the two Hyper-V hosts. The one on the opposite Hyper-V host to the PXE server isn't getting an IP address from DHCP; let's ignore that for now. The one that shares a Hyper-V host with the PXE server gets DHCP and PXE boots. However, once Windows PE has loaded and it says 'starting network connections' or something similar, it reboots. So I get to see the Win PE splash screen with my custom background image, then it reboots before the task sequence selection. I've managed to press F8 at this point, and I can see it has an IP address on the correct subnet, with the correct gateway and DNS settings. However, if I try to ping anything, even just the gateway, I just get 'Request timed out' (see image). The switch cannot ping the VM either.

Command prompt with IP details and ping

When I look in smsts.log I see the error 'Unable to download PXE variable file. Exit code=14. Will retry', then finally 'PxeGetPxeData failed with 0x8004016c'. Unfortunately I can't copy the logs off the VM as I don't have any network connectivity, so I've taken a screenshot to share here.

smsts.log

I've mirrored the port and captured the packets using Wireshark. After everything has gone kaputt, I get this when trying to ping:

    No.     Time           Source                Destination           Protocol Length Info
     52 23.358377      10.21.6.16            10.21.6.1             ICMP     74     Echo (ping) request  id=0x0001, seq=24/6144, ttl=128 (no response found!)

Frame 52: 74 bytes on wire (592 bits), 74 bytes captured (592 bits) on interface 0
    Interface id: 0 (\Device\NPF_{6DE772D9-E6E3-4680-AF4F-E95F725E7CFD})
        Interface name: \Device\NPF_{6DE772D9-E6E3-4680-AF4F-E95F725E7CFD}
        Interface description: Ethernet 3
    Encapsulation type: Ethernet (1)
    Arrival Time: Aug 16, 2019 09:35:16.573093000 GMT Daylight Time
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1565944516.573093000 seconds
    [Time delta from previous captured frame: 1.300275000 seconds]
    [Time delta from previous displayed frame: 1.300275000 seconds]
    [Time since reference or first frame: 23.358377000 seconds]
    Frame Number: 52
    Frame Length: 74 bytes (592 bits)
    Capture Length: 74 bytes (592 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ethertype:ip:icmp:data]
    [Coloring Rule Name: ICMP]
    [Coloring Rule String: icmp || icmpv6]
Ethernet II, Src: Microsof_00:8e:06 (00:15:5d:00:8e:06), Dst: Netgear_c4:65:28 (b0:b9:8a:c4:65:28)
    Destination: Netgear_c4:65:28 (b0:b9:8a:c4:65:28)
        Address: Netgear_c4:65:28 (b0:b9:8a:c4:65:28)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: Microsof_00:8e:06 (00:15:5d:00:8e:06)
        Address: Microsof_00:8e:06 (00:15:5d:00:8e:06)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.21.6.16, Dst: 10.21.6.1
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
        0000 00.. = Differentiated Services Codepoint: Default (0)
        .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    Total Length: 60
    Identification: 0x7daa (32170)
    Flags: 0x0000
        0... .... .... .... = Reserved bit: Not set
        .0.. .... .... .... = Don't fragment: Not set
        ..0. .... .... .... = More fragments: Not set
        ...0 0000 0000 0000 = Fragment offset: 0
    Time to live: 128
    Protocol: ICMP (1)
    Header checksum: 0x9cdc [validation disabled]
    [Header checksum status: Unverified]
    Source: 10.21.6.16
    Destination: 10.21.6.1
Internet Control Message Protocol
    Type: 8 (Echo (ping) request)
    Code: 0
    Checksum: 0x4d43 [correct]
    [Checksum Status: Good]
    Identifier (BE): 1 (0x0001)
    Identifier (LE): 256 (0x0100)
    Sequence number (BE): 24 (0x0018)
    Sequence number (LE): 6144 (0x1800)
    [No response seen]
        [Expert Info (Warning/Sequence): No response seen to ICMP request]
            [No response seen to ICMP request]
            [Severity level: Warning]
            [Group: Sequence]
    Data (32 bytes)

0000  61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70   abcdefghijklmnop
0010  71 72 73 74 75 76 77 61 62 63 64 65 66 67 68 69   qrstuvwabcdefghi
        Data: 6162636465666768696a6b6c6d6e6f707172737475767761…
        [Length: 32]

I have more Wireshark captures but don't want to dump them all here. Thanks in advance to anyone who is able to help!

pxe-boot
sccm
asked on Server Fault Aug 16, 2019 by iamkl00t

1 Answer

Just in case anyone else has this problem - I've finally managed to fix it.

Someone had patched our 4G backup router into the SCCM build VLAN. This meant that it was assigning external IP addresses to the clients via DHCP in the middle of the deployment. This wasn't easy to pick up because (a) it was intermittent and (b) it didn't happen at the boot screen; it happened in the middle of the boot sequence, when Windows PE reloaded the network drivers just before showing the task sequence menu. I actually only found out because I saw that the machine I was using to mirror the port with Wireshark had picked up one of these external IPs. Frustrating, as this whole issue was caused by someone's carelessness when patching 😡
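
For anyone who wants to catch this sort of thing sooner, here is a minimal Python sketch of the check I ended up doing by eye: compare the addresses machines on the build VLAN actually hold against the subnets they are supposed to be in. The two /16 networks are the ones from the question; the function names, the client names and the 203.0.113.25 address (a documentation-range address standing in for an "external" lease) are made up for illustration, and this is not part of SCCM or any other tool.

```python
import ipaddress

# Site subnets taken from the question; adjust for your own network.
EXPECTED_NETWORKS = [
    ipaddress.ip_network("10.20.0.0/16"),
    ipaddress.ip_network("10.21.0.0/16"),
]


def is_expected_lease(address, networks=EXPECTED_NETWORKS):
    """Return True if the leased address falls inside one of the expected networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)


def suspicious_leases(observed):
    """Given (client_name, leased_ip) pairs, return the pairs outside the expected networks."""
    return [(name, ip) for name, ip in observed if not is_expected_lease(ip)]


if __name__ == "__main__":
    # Hypothetical observations, e.g. copied by hand from ipconfig output.
    observed = [
        ("build-vm", "10.21.6.16"),
        ("capture-laptop", "203.0.113.25"),
    ]
    for name, ip in suspicious_leases(observed):
        print(f"{name} holds unexpected address {ip}: possible rogue DHCP server")
```

A lease outside the expected ranges doesn't prove there is a second DHCP server, but it is a strong hint to go and look at what else is patched into the VLAN.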

answered on Server Fault Aug 19, 2019 by iamkl00t

User contributions licensed under CC BY-SA 3.0