Server unreachable, best way to find out the cause?

I'm running Debian squeeze on a rented dedicated server, and recently the server has been becoming unreachable for every external service more and more often, from one moment to the next.

During these outages, cron jobs and other local tasks keep running normally, and I couldn't find any clue of a crash or anything related in any log file.

To regain control, I simply restart the server through my provider's web interface.

Following the advice in "Linux networking crash: best steps to find out the cause?", I raised the problem with my provider, but they couldn't find any problem with the network card (NIC). In addition, they replaced my server hardware completely (except the HDD).

How can I get closer to the cause of these outages?

Sadly, I have no access to the server while it is unreachable from outside, so I can't run any tests during an outage.

While the server is unreachable, "arp -na" returns "at <incomplete> on eth0". (I set up a simple cron job that checks this state.) I can't find any information related to this problem in the syslog.
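A check like the one mentioned above could look roughly like this. It is only a sketch: the function name arp_incomplete, the address 192.0.2.1 and the log path are placeholders, and it assumes the English arp output (hence LC_ALL=C in the usage line).

```shell
#!/bin/sh
# Reads ARP table text on stdin and succeeds (exit 0) if the entry
# for the given IP address is marked "<incomplete>".
# arp_incomplete is a hypothetical helper name, not part of net-tools.
arp_incomplete() {
    grep -F "($1)" | grep -q '<incomplete>'
}

# Usage from a cron job (placeholder gateway address and log path):
#   LC_ALL=C arp -na | arp_incomplete 192.0.2.1 \
#       && echo "$(date) gateway ARP entry incomplete" >> /var/log/arp-watch.log
```

Matching on the parenthesised address keeps the check limited to one neighbor (normally the gateway) instead of firing on any stale entry.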

puck:/home# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
xx.xx.xxx.xxx   0.0.0.0         255.255.255.192 U     0      0        0 eth0
0.0.0.0         xx.xx.xxx.xxx   0.0.0.0         UG    0      0        0 eth0

puck:/home# arp -na
? (xx.xx.xxx.xxx) at 00:00:5e:00:01:01 [ether] on eth0

puck:/home# ethtool eth0
Settings for eth0:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 1
        Transceiver: external
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: g
        Current message level: 0x000000ff (255)
        Link detected: yes

My interfaces:

auto lo
iface lo inet loopback

# ethernet interface

auto eth0
iface eth0 inet static
  address xxx.xxx.xxx.xxx
  network xxx.xxx.xxx.yyy
  netmask 255.255.255.yyy
  broadcast xxx.xxx.xxx.255
  gateway xxx.xxx.zzz.zzz

# virtual interfaces

auto eth0:1
iface eth0:1 inet static
address xxx.xxx.xxx.xxx
netmask 255.255.255.255

auto eth0:2
iface eth0:2 inet static
address xxx.xxx.xxx.xxx
netmask 255.255.255.255


auto eth0:3
iface eth0:3 inet static
address xxx.xxx.xxx.xxx
netmask 255.255.255.255

Tags: networking, ip
asked on Server Fault Jul 28, 2012 by heuri • edited Apr 13, 2017 by Community

1 Answer

Try adding more cron jobs that run every minute and log:

  • the fact that the job ran [date >> log]
  • the content of the ARP table and the IP configuration [arp -n >> log; ip a >> log]
  • the state of the network interface [ethtool eth0 >> log]
  • kernel log messages, which won't hurt either [dmesg -c >> log]
  • the result of a ping to the router and to a few 'neighbor' hosts in the same subnet
  • a forced sync for good measure, so the log reaches the disk before the next reset

This should help you establish whether the whole machine freezes or only networking fails and, if it is networking, where the problem starts.
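The list above could be rolled into a single script, something like the sketch below. The log path, the --run switch, and the names log_cmd and snapshot are my own placeholders; TARGETS is meant to hold the router plus a few neighbor addresses, which you have to fill in yourself.

```shell
#!/bin/sh
# Placeholder settings: adjust to your system.
LOG="${LOG:-/var/log/netwatch.log}"
IFACE="${IFACE:-eth0}"
TARGETS="${TARGETS:-}"   # space-separated: router plus a few neighbor hosts

# Run one command and append its output and exit status to the log.
log_cmd() {
    printf '## %s\n' "$*" >> "$LOG"
    "$@" >> "$LOG" 2>&1
    printf '## exit=%s\n' "$?" >> "$LOG"
}

# Take one snapshot of the network state.
snapshot() {
    printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')" >> "$LOG"
    log_cmd arp -n
    log_cmd ip a
    log_cmd ethtool "$IFACE"
    log_cmd dmesg -c          # prints and clears the kernel ring buffer (needs root)
    for host in $TARGETS; do
        log_cmd ping -c 3 -W 2 "$host"
    done
    sync                      # flush the log to disk before a possible hard reset
}

# Only take a snapshot when explicitly asked to, e.g. from cron.
if [ "${1:-}" = "--run" ]; then
    snapshot
fi
```

Saved as, say, /usr/local/sbin/netwatch.sh, it could be run every minute from root's crontab with "* * * * * /usr/local/sbin/netwatch.sh --run". Logging the exit status of each command makes it easy to see afterwards at which minute the pings started failing and whether the link or the ARP table changed at the same time.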

Could it be an IP address conflict or, even better, a case of a duplicate MAC address in the same segment?
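For the IP conflict part, one possible check is duplicate address detection with arping. This sketch assumes the iputils version of arping (Debian package iputils-arping), where -D exits 0 if nobody answered and 1 if some other host replied; the function name ip_conflict and the address in the usage line are placeholders. It does not detect a duplicate MAC address.

```shell
#!/bin/sh
# ip_conflict IFACE IP
# Returns 0 if another host answered for IP (possible conflict),
# 1 if nobody answered, 2 if arping itself failed or is missing.
# Assumes iputils arping semantics for -D (duplicate address detection).
ip_conflict() {
    arping -D -q -c 3 -I "$1" "$2"
    case "$?" in
        0) return 1 ;;   # no reply: address looks unique
        1) return 0 ;;   # a reply arrived: someone else claims the address
        *) return 2 ;;   # error, e.g. arping not installed or no permission
    esac
}

# Usage (placeholder address), run as root on the affected server:
#   ip_conflict eth0 192.0.2.10 && echo "another host answers for 192.0.2.10"
```

Running this for each address configured on eth0 and its aliases, ideally from the same cron job, would show whether a second host starts claiming one of them around the time the outages begin.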

answered on Server Fault Jul 28, 2012 by pQd

User contributions licensed under CC BY-SA 3.0