How to debug intermittent network outages on Citrix XenServer


I have two Citrix XenServers in a pool. Both have a NIC for management, a NIC for the SAN, and a NIC for the VMs to connect to the network. The VM-facing NIC (eth3) carries multiple VM networks with different VLAN IDs.
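
For reference, the VLAN networks carried on that NIC can be listed from dom0 along these lines (a rough sketch; eth3 and the chosen parameters are just what matches my setup):

# List the PIFs (and their VLAN tags) defined on the VM-facing NIC.
xe pif-list device=eth3 params=uuid,device,VLAN,network-uuid
# Show how those map onto OVS bridges and ports in dom0.
ovs-vsctl show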

This has been working well for two years. Recently, I started getting dropped pings to the various VMs I'm monitoring.

When this happened, I immediately checked the VMs and couldn't see anything wrong. For example, one of those VMs runs Windows 2008 and I had an RDP session open. I got the alert, alt-tabbed to my RDP session, and it was fine.

One time the outage lasted several minutes, but the other 5-6 times in the last three weeks it was so brief that I couldn't check anything while it was down.

I do have messages from the software running on some of those VMs indicating a network outage - e.g. the software running on VM1 was unable to connect to the database on VM2.

I looked at VM1's network adapter in Windows and it reports having been up for 500+ days, so it never saw a "link down" event.

I checked the switch the XenServers are connected to and I don't see anything there.
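
In case it is useful to others with the same symptoms, the physical link can also be checked from the dom0 side roughly like this (eth3 being the VM-facing NIC here; which counters matter depends on the driver):

# Link state and negotiated speed of the VM-facing NIC.
ethtool eth3
# Driver statistics - look for rx/tx errors, drops or carrier changes growing over time.
ethtool -S eth3
# Any kernel-reported link up/down events for that NIC.
grep -i eth3 /var/log/messages | grep -i link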

I have the DVS Controller installed (I may have done some of the setup wrong, but it has been like this for two years) and today I noticed that this message shows up just before I get the alerts:

Established control channel for network on server 'xen1'

Then I get multiple events such as:

'server1' now using IP 10.5.4.1 with interface 'mac-address'. 'server1' added interface 'mac-address' to network 'Vlan640' on server 'xen1'

Those two entries repeat for every VM on xen1.
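
From dom0, the way each OVS bridge is wired to that controller, and what it does when the controller is unreachable, can be inspected with something like this (a sketch; it assumes the ovs-vsctl build shipped with the host supports these subcommands):

# For each bridge: which controller it points at and its fail mode
# ("standalone" keeps forwarding locally if the controller goes away, "secure" does not).
for br in $(ovs-vsctl list-br); do
    echo "== $br =="
    ovs-vsctl get-controller "$br"
    ovs-vsctl get-fail-mode "$br"
done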

On the XenServer host, in /var/log/messages, I find the following lines (there are more, but these seem to be the relevant ones):

Feb 11 19:37:49 vm2 xapi: [ warn|vm2|45688150 unix-RPC|SR.set_virtual_allocation D:cc84effe10bb|redo_log] Could not write database to redo log: unexpected exception Sys_error("Broken pipe")
Feb 11 19:37:49 vm2 fe: 20727 (/opt/xensource/libexec/block_device_io -device /dev/VG_XenStorage-2f64491e-f3...) exitted with signal -7
Feb 11 19:38:33 vm2 block_device_io: [ info|vm2|0||block_device_io] Opened block device.
Feb 11 19:38:33 vm2 block_device_io: [ info|vm2|0||block_device_io] Accepted connection on data socket
Feb 11 19:38:34 vm2 block_device_io: [ info|vm2|0||block_device_io] Closing connection on data socket
Feb 11 19:39:20 vm2 xapi: [ warn|vm2|45688239 unix-RPC|SR.set_physical_size D:b6a4bc10ab7d|redo_log] Could not write delta to redo log: Timeout.
Feb 11 19:39:21 vm2 fe: 20845 (/opt/xensource/libexec/block_device_io -device /dev/VG_XenStorage-2f64491e-f3...) exitted with signal -7
Feb 11 19:39:53 vm2 xapi: [ warn|vm2|45688261 unix-RPC|SR.set_virtual_allocation D:499087254aa2|redo_log] Timed out waiting to connect
Feb 11 19:39:53 vm2 fe: 20972 (/opt/xensource/libexec/block_device_io -device /dev/VG_XenStorage-2f64491e-f3...) exitted with signal -7
Feb 11 19:40:12 vm2 ovsdb-server: 2283737|reconnect|ERR|ssl:10.64.240.65:6632: no response to inactivity probe after 5 seconds, disconnecting
Feb 11 19:40:12 vm2 ovsdb-server: 2283738|reconnect|INFO|ssl:10.64.240.65:6632: connection dropped
Feb 11 19:40:13 vm2 ovsdb-server: 2283739|reconnect|INFO|ssl:10.64.240.65:6632: connecting...
Feb 11 19:40:14 vm2 ovsdb-server: 2283740|reconnect|INFO|ssl:10.64.240.65:6632: connection attempt timed out
Feb 11 19:40:14 vm2 ovsdb-server: 2283741|reconnect|INFO|ssl:10.64.240.65:6632: waiting 2 seconds before reconnect
Feb 11 19:40:15 vm2 ovs-vswitchd: 106096433|rconn|ERR|xenbr0<->ssl:10.64.240.65:6633: no response to inactivity probe after 5 seconds, disconnecting
Feb 11 19:40:15 vm2 ovs-vswitchd: 106096434|rconn|ERR|xenbr1<->ssl:10.64.240.65:6633: no response to inactivity probe after 5 seconds, disconnecting
Feb 11 19:40:16 vm2 ovsdb-server: 2283742|reconnect|INFO|ssl:10.64.240.65:6632: connecting...
Feb 11 19:40:16 vm2 ovs-vswitchd: 106096441|rconn|INFO|xenbr0<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:16 vm2 ovs-vswitchd: 106096442|rconn|INFO|xenbr1<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:16 vm2 ovs-vswitchd: 106096443|rconn|ERR|xenbr2<->ssl:10.64.240.65:6633: no response to inactivity probe after 5 seconds, disconnecting
Feb 11 19:40:16 vm2 ovs-vswitchd: 106096444|rconn|ERR|xenbr3<->ssl:10.64.240.65:6633: no response to inactivity probe after 5 seconds, disconnecting
Feb 11 19:40:17 vm2 ovs-vswitchd: 106096448|rconn|INFO|xenbr0<->ssl:10.64.240.65:6633: connected
Feb 11 19:40:17 vm2 ovs-vswitchd: 106096449|rconn|INFO|xenbr1<->ssl:10.64.240.65:6633: connected
Feb 11 19:40:17 vm2 ovs-vswitchd: 106096450|rconn|INFO|xenbr2<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:17 vm2 ovs-vswitchd: 106096451|rconn|INFO|xenbr3<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:18 vm2 ovsdb-server: 2283746|reconnect|INFO|ssl:10.64.240.65:6632: connected
Feb 11 19:40:18 vm2 ovs-vswitchd: 106096455|rconn|INFO|xenbr2<->ssl:10.64.240.65:6633: connected
Feb 11 19:40:18 vm2 ovs-vswitchd: 106096456|rconn|INFO|xenbr3<->ssl:10.64.240.65:6633: connected
Feb 11 19:40:18 vm2 ovs-vswitchd: 106096459|ofp_util|INFO|Dropped 3 log messages in last 46234 seconds due to excessive rate
Feb 11 19:40:18 vm2 ovs-vswitchd: 106096460|ofp_util|INFO|normalization changed ofp_match, details:
Feb 11 19:40:18 vm2 ovs-vswitchd: 106096461|ofp_util|INFO| pre: wildcards=0xffffffff  in_port=    0  dl_src=00:00:90:06:82:03  dl_dst=00:00:00:00:00:00  dl_vlan=    0  dl_vlan_pcp=  0  dl_type=     0  nw_tos=0x30  nw_proto= 0x7  nw_src=         0  nw_dst=         0  tp_src=    0  tp_dst=    0
Feb 11 19:40:18 vm2 ovs-vswitchd: 106096462|ofp_util|INFO|post: wildcards= 0x23fffff  in_port=    0  dl_src=00:00:00:00:00:00  dl_dst=00:00:00:00:00:00  dl_vlan=    0  dl_vlan_pcp=  0  dl_type=     0  nw_tos=   0  nw_proto=   0  nw_src=         0  nw_dst=         0  tp_src=    0  tp_dst=    0
Feb 11 19:40:18 vm2 ovs-vswitchd: 106096463|ofp_util|INFO|normalization changed ofp_match, details:
Feb 11 19:40:18 vm2 ovs-vswitchd: 106096464|ofp_util|INFO| pre: wildcards=0xffffffff  in_port=    0  dl_src=00:00:ff:ff:ff:ff  dl_dst=00:00:00:00:43:6f  dl_vlan=28262  dl_vlan_pcp=105  dl_type=0x5f73  nw_tos=0x77  nw_proto=0x69  nw_src=0x685f6d67  nw_dst=0x725f6a6f  tp_src=26990  tp_dst=24421
Feb 11 19:40:18 vm2 ovs-vswitchd: 106096465|ofp_util|INFO|post: wildcards= 0x23fffff  in_port=    0  dl_src=00:00:00:00:00:00  dl_dst=00:00:00:00:00:00  dl_vlan=28262  dl_vlan_pcp=105  dl_type=     0  nw_tos=   0  nw_proto=   0  nw_src=         0  nw_dst=         0  tp_src=    0  tp_dst=    0
Feb 11 19:40:21 vm2 ovs-vswitchd: 106096473|fail_open|WARN|Could not connect to controller (or switch failed controller's post-connection admission control policy) for 15 seconds, failing open
Feb 11 19:40:21 vm2 ovs-vswitchd: 106096474|fail_open|WARN|Could not connect to controller (or switch failed controller's post-connection admission control policy) for 15 seconds, failing open
Feb 11 19:40:23 vm2 xapi: [ warn|vm2|45688290 unix-RPC|SR.set_virtual_allocation D:cf744931c44f|redo_log] Timed out waiting to connect
Feb 11 19:40:27 vm2 ovs-vswitchd: 106096508|rconn|ERR|xenbr0<->ssl:10.64.240.65:6633: no response to inactivity probe after 5 seconds, disconnecting
Feb 11 19:40:27 vm2 ovs-vswitchd: 106096509|rconn|ERR|xenbr1<->ssl:10.64.240.65:6633: no response to inactivity probe after 5 seconds, disconnecting
Feb 11 19:40:27 vm2 ovs-vswitchd: 106096510|rconn|ERR|xenbr2<->ssl:10.64.240.65:6633: no response to inactivity probe after 5 seconds, disconnecting
Feb 11 19:40:27 vm2 ovs-vswitchd: 106096511|rconn|ERR|xenbr3<->ssl:10.64.240.65:6633: no response to inactivity probe after 5 seconds, disconnecting
Feb 11 19:40:28 vm2 ovs-vswitchd: 106096512|rconn|INFO|xenbr0<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:28 vm2 ovs-vswitchd: 106096513|rconn|INFO|xenbr1<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:28 vm2 ovs-vswitchd: 106096514|rconn|INFO|xenbr2<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:28 vm2 ovs-vswitchd: 106096515|rconn|INFO|xenbr3<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:28 vm2 ovsdb-server: 2283747|reconnect|ERR|ssl:10.64.240.65:6632: no response to inactivity probe after 5.03 seconds, disconnecting
Feb 11 19:40:28 vm2 ovsdb-server: 2283748|reconnect|INFO|ssl:10.64.240.65:6632: connection dropped
Feb 11 19:40:28 vm2 ovsdb-server: 2283749|reconnect|INFO|ssl:10.64.240.65:6632: waiting 4 seconds before reconnect
Feb 11 19:40:29 vm2 ovs-vswitchd: 106096516|rconn|INFO|xenbr0<->ssl:10.64.240.65:6633: connection timed out
Feb 11 19:40:29 vm2 ovs-vswitchd: 106096517|rconn|INFO|xenbr0<->ssl:10.64.240.65:6633: waiting 2 seconds before reconnect
Feb 11 19:40:29 vm2 ovs-vswitchd: 106096518|rconn|INFO|xenbr1<->ssl:10.64.240.65:6633: connection timed out
Feb 11 19:40:29 vm2 ovs-vswitchd: 106096519|rconn|INFO|xenbr1<->ssl:10.64.240.65:6633: waiting 2 seconds before reconnect
Feb 11 19:40:29 vm2 ovs-vswitchd: 106096520|rconn|INFO|xenbr2<->ssl:10.64.240.65:6633: connection timed out
Feb 11 19:40:29 vm2 ovs-vswitchd: 106096521|rconn|INFO|xenbr2<->ssl:10.64.240.65:6633: waiting 2 seconds before reconnect
Feb 11 19:40:29 vm2 ovs-vswitchd: 106096522|rconn|INFO|xenbr3<->ssl:10.64.240.65:6633: connection timed out
Feb 11 19:40:29 vm2 ovs-vswitchd: 106096523|rconn|INFO|xenbr3<->ssl:10.64.240.65:6633: waiting 2 seconds before reconnect
Feb 11 19:40:31 vm2 ovs-vswitchd: 106096528|rconn|INFO|xenbr0<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:31 vm2 ovs-vswitchd: 106096529|rconn|INFO|xenbr1<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:31 vm2 ovs-vswitchd: 106096530|rconn|INFO|xenbr2<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:31 vm2 ovs-vswitchd: 106096531|rconn|INFO|xenbr3<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:32 vm2 ovs-vswitchd: 106096532|fail_open|WARN|Could not connect to controller (or switch failed controller's post-connection admission control policy) for 15 seconds, failing open
Feb 11 19:40:32 vm2 ovs-vswitchd: 106096533|fail_open|WARN|Could not connect to controller (or switch failed controller's post-connection admission control policy) for 15 seconds, failing open
Feb 11 19:40:32 vm2 ovsdb-server: 2283750|reconnect|INFO|ssl:10.64.240.65:6632: connecting...
Feb 11 19:40:32 vm2 kernel:  connection1:0: detected conn error (1020)
Feb 11 19:40:33 vm2 iscsid: Kernel reported iSCSI connection 1:0 error (1020) state (3)
Feb 11 19:40:33 vm2 kernel:  connection3:0: detected conn error (1020)
Feb 11 19:40:33 vm2 ovs-vswitchd: 106096534|rconn|INFO|xenbr0<->ssl:10.64.240.65:6633: connection timed out
Feb 11 19:40:33 vm2 ovs-vswitchd: 106096535|rconn|INFO|xenbr0<->ssl:10.64.240.65:6633: waiting 4 seconds before reconnect
Feb 11 19:40:33 vm2 ovs-vswitchd: 106096536|rconn|INFO|xenbr1<->ssl:10.64.240.65:6633: connection timed out
Feb 11 19:40:33 vm2 ovs-vswitchd: 106096537|rconn|INFO|xenbr1<->ssl:10.64.240.65:6633: waiting 4 seconds before reconnect
Feb 11 19:40:33 vm2 ovs-vswitchd: 106096538|rconn|INFO|xenbr2<->ssl:10.64.240.65:6633: connection timed out
Feb 11 19:40:33 vm2 ovs-vswitchd: 106096539|rconn|INFO|xenbr2<->ssl:10.64.240.65:6633: waiting 4 seconds before reconnect
Feb 11 19:40:33 vm2 ovs-vswitchd: 106096540|rconn|INFO|xenbr3<->ssl:10.64.240.65:6633: connection timed out
Feb 11 19:40:33 vm2 ovs-vswitchd: 106096541|rconn|INFO|xenbr3<->ssl:10.64.240.65:6633: waiting 4 seconds before reconnect
Feb 11 19:40:33 vm2 multipathd: sdd: readsector0 checker reports path is down
Feb 11 19:40:33 vm2 multipathd: checker failed path 8:48 in map 20000000000000000000b5600c5684e10
Feb 11 19:40:33 vm2 multipathd: Path event for 20000000000000000000b5600c5684e10, calling mpathcount
Feb 11 19:40:33 vm2 kernel: device-mapper: multipath: Failing path 8:48.
Feb 11 19:40:33 vm2 fe: 21073 (/opt/xensource/libexec/block_device_io -device /dev/VG_XenStorage-2f64491e-f3...) exitted with signal -7
Feb 11 19:40:33 vm2 multipathd: 20000000000000000000b5600c5684e10: remaining active paths: 2
Feb 11 19:40:34 vm2 iscsid: Kernel reported iSCSI connection 3:0 error (1020) state (3)
Feb 11 19:40:36 vm2 ovsdb-server: 2283751|reconnect|INFO|ssl:10.64.240.65:6632: connection attempt timed out
Feb 11 19:40:36 vm2 ovsdb-server: 2283752|reconnect|INFO|ssl:10.64.240.65:6632: waiting 8 seconds before reconnect
Feb 11 19:40:37 vm2 iscsid: connection1:0 is operational after recovery (1 attempts)
Feb 11 19:40:37 vm2 iscsid: connection3:0 is operational after recovery (1 attempts)
Feb 11 19:40:37 vm2 ovs-vswitchd: 106096550|rconn|INFO|xenbr0<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:37 vm2 ovs-vswitchd: 106096551|rconn|INFO|xenbr1<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:37 vm2 ovs-vswitchd: 106096552|rconn|INFO|xenbr2<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:37 vm2 ovs-vswitchd: 106096553|rconn|INFO|xenbr3<->ssl:10.64.240.65:6633: connecting...
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096558|rconn|INFO|xenbr0<->ssl:10.64.240.65:6633: connected
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096559|rconn|INFO|xenbr1<->ssl:10.64.240.65:6633: connected
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096560|rconn|INFO|xenbr2<->ssl:10.64.240.65:6633: connected
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096561|rconn|INFO|xenbr3<->ssl:10.64.240.65:6633: connected
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096562|fail_open|WARN|No longer in fail-open mode
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096566|fail_open|WARN|No longer in fail-open mode
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096567|fail_open|WARN|No longer in fail-open mode
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096568|ofp_util|INFO|normalization changed ofp_match, details:
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096569|ofp_util|INFO| pre: wildcards=0xffffffff  in_port=    0  dl_src=00:00:30:19:da:02  dl_dst=00:00:00:00:00:00  dl_vlan=    0  dl_vlan_pcp=  0  dl_type=     0  nw_tos=0x50  nw_proto=0xc6  nw_src=         0  nw_dst=         0  tp_src=    0  tp_dst=    0
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096570|ofp_util|INFO|post: wildcards= 0x23fffff  in_port=    0  dl_src=00:00:00:00:00:00  dl_dst=00:00:00:00:00:00  dl_vlan=    0  dl_vlan_pcp=  0  dl_type=     0  nw_tos=   0  nw_proto=   0  nw_src=         0  nw_dst=         0  tp_src=    0  tp_dst=    0
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096571|ofp_util|INFO|normalization changed ofp_match, details:
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096572|ofp_util|INFO| pre: wildcards=  0x3ffffe  in_port=65535  dl_src=5e:8e:38:74:52:7e  dl_dst=5e:8e:38:74:52:7e  dl_vlan=    0  dl_vlan_pcp=  0  dl_type= 0x800  nw_tos=   0  nw_proto=0x11  nw_src= 0xa40fc30  nw_dst= 0xa40fc30  tp_src=   53  tp_dst=   53
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096573|ofp_util|INFO|post: wildcards=  0x3ffffe  in_port=65535  dl_src=00:00:00:00:00:00  dl_dst=00:00:00:00:00:00  dl_vlan=    0  dl_vlan_pcp=  0  dl_type=     0  nw_tos=   0  nw_proto=   0  nw_src=         0  nw_dst=         0  tp_src=    0  tp_dst=    0
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096574|ofp_util|INFO|normalization changed ofp_match, details:
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096575|ofp_util|INFO| pre: wildcards=  0x3ffffe  in_port=65534  dl_src=5e:8e:38:74:52:7e  dl_dst=5e:8e:38:74:52:7e  dl_vlan=    0  dl_vlan_pcp=  0  dl_type= 0x800  nw_tos=   0  nw_proto=0x11  nw_src= 0xa40fc30  nw_dst= 0xa40fc30  tp_src=   53  tp_dst=   53
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096576|ofp_util|INFO|post: wildcards=  0x3ffffe  in_port=65534  dl_src=00:00:00:00:00:00  dl_dst=00:00:00:00:00:00  dl_vlan=    0  dl_vlan_pcp=  0  dl_type=     0  nw_tos=   0  nw_proto=   0  nw_src=         0  nw_dst=         0  tp_src=    0  tp_dst=    0
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096577|pktbuf|INFO|Dropped 12 log messages in last 52095 seconds due to excessive rate
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096578|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096579|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096580|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096581|fail_open|WARN|No longer in fail-open mode
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096585|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096586|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096587|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096588|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096589|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096590|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096591|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096592|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096593|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096594|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096595|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096596|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096597|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096598|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096599|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096600|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:38 vm2 ovs-vswitchd: 106096601|pktbuf|INFO|Received null cookie ffffff00 (this is normal if the switch was recently in fail-open mode)
Feb 11 19:40:44 vm2 multipathd: sdd: readsector0 checker reports path is up
Feb 11 19:40:44 vm2 multipathd: 8:48: reinstated
Feb 11 19:40:44 vm2 multipathd: 20000000000000000000b5600c5684e10: remaining active paths: 3
Feb 11 19:40:44 vm2 multipathd: Path event for 20000000000000000000b5600c5684e10, calling mpathcount
Feb 11 19:40:44 vm2 ovsdb-server: 2283753|reconnect|INFO|ssl:10.64.240.65:6632: connecting...
Feb 11 19:40:44 vm2 ovsdb-server: 2283756|reconnect|INFO|ssl:10.64.240.65:6632: connected
Feb 11 19:40:44 vm2 block_device_io: [ info|vm2|0||block_device_io] Opened block device.
Feb 11 19:40:44 vm2 block_device_io: [ info|vm2|0||block_device_io] Accepted connection on data socket
Feb 11 19:40:44 vm2 block_device_io: [ info|vm2|0||block_device_io] Closing connection on data socket
Feb 11 19:43:50 vm2 interface-reconfigure: Called as /opt/xensource/libexec/interface-reconfigure rewrite
Feb 11 19:43:50 vm2 interface-reconfigure: No session ref given on command line, logging in.
Feb 11 19:43:50 vm2 interface-reconfigure: host uuid is 7ab461b5-a01d-42ab-bfb2-b55ff44abbcb
Feb 11 19:43:50 vm2 interface-reconfigure: Called as /opt/xensource/libexec/interface-reconfigure rewrite
Feb 11 19:43:50 vm2 interface-reconfigure: No session ref given on command line, logging in.
Feb 11 19:43:50 vm2 interface-reconfigure: host uuid is 7ab461b5-a01d-42ab-bfb2-b55ff44abbcb
Feb 11 19:43:51 vm2 interface-reconfigure: Running command: /usr/bin/ovs-vsctl --timeout=20 -- br-set-external-id xenbr0 xs-network-uuids 41f2006a-6dc4-6fa3-b8a5-0a4fbd2bb783 -- br-set-external-id xenbr2 xs-network-uuids 0391a257-ae66-404d-e5fd-6941ed5909c8;d1a1d986-356d-f917-4215-312f8921eaff;d4d5aceb-2116-5cbe-e4a9-1c94c1cefa56;1121391a-21b5-abe5-184c-0f38b6b2de55;d5f4a3a3-c017-9e23-403b-8524f0685caf;a2fe2583-7a7e-2339-cedf-ac5718259b90 -- br-set-external-id xenbr1 xs-network-uuids 9b2465e7-86bd-f39d-ae06-08d10e2b01c2 -- br-set-external-id xenbr3 xs-network-uuids 3ff864b1-30b0-2203-3ac8-e4550968df29
Feb 11 19:43:51 vm2 ovs-vsctl: 00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl --timeout=20 -- br-set-external-id xenbr0 xs-network-uuids 41f2006a-6dc4-6fa3-b8a5-0a4fbd2bb783 -- br-set-external-id xenbr2 xs-network-uuids 0391a257-ae66-404d-e5fd-6941ed5909c8;d1a1d986-356d-f917-4215-312f8921eaff;d4d5aceb-2116-5cbe-e4a9-1c94c1cefa56;1121391a-21b5-abe5-184c-0f38b6b2de55;d5f4a3a3-c017-9e23-403b-8524f0685caf;a2fe2583-7a7e-2339-cedf-ac5718259b90 -- br-set-external-id xenbr1 xs-network-uuids 9b2465e7-86bd-f39d-ae06-08d10e2b01c2 -- br-set-external-id xenbr3 xs-network-uuids 3ff864b1-30b0-2203-3ac8-e4550968df29
Feb 11 19:43:51 vm2 interface-reconfigure: Unknown other-config attribute: cpuid_feature_mask
Feb 11 19:43:51 vm2 interface-reconfigure: Unknown other-config attribute: memory-ratio-hvm
Feb 11 19:43:51 vm2 interface-reconfigure: Unknown other-config attribute: memory-ratio-pv
Feb 11 19:43:51 vm2 interface-reconfigure: Unknown other-config attribute: mail-destination
Feb 11 19:43:51 vm2 interface-reconfigure: Unknown other-config attribute: ssmtp-mailhub

At a minimum, I'm looking for pointers on what else I can check to find the cause of these outages ... and ideally, a way to stop them!
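
In case it helps anyone comparing symptoms: the events that seem to line up with my alerts are the controller reconnects and the fail-open transitions, which can be pulled out of the log around an alert with something like the following (the patterns and the controller IP are simply what appears in my logs above and may differ elsewhere):

# Controller drops, fail-open transitions and storage-path errors around an outage.
grep -E 'fail_open|inactivity probe|iscsid|multipathd' /var/log/messages | tail -n 200
# Check whether dom0 can still reach the DVS controller while an outage is in progress.
ping -c 5 10.64.240.65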

xenserver
asked on Server Fault Jan 18, 2014 by ETL • edited Feb 13, 2014 by ETL

1 Answer


According to some people on xenserver.org, this is a problem with version 6.0 that is resolved by installing updates or upgrading to a newer version.
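
To see which build and hotfixes a host is currently on before deciding between patching and upgrading, something along these lines works from dom0 (XenServer 6.x wording; the exact parameters vary between releases):

# Product version / build number of each host in the pool.
xe host-list params=name-label,software-version
# Hotfixes already applied.
xe patch-list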

answered on Server Fault Mar 5, 2014 by ETL

User contributions licensed under CC BY-SA 3.0