How do I figure out which drive is failing?

-2

How do I map this information to the physical drive that is failing? This is a Debian kernel.

Nov 21 18:06:00 IHPAC kernel: [594026.608042] ata5.00: status: { DRDY }
Nov 21 18:06:00 IHPAC kernel: [594026.787427] ata5.00: failed command: WRITE FPDMA QUEUED
Nov 21 18:06:00 IHPAC kernel: [594026.966505] ata5.00: cmd 61/00:e8:fb:b6:59/04:00:a2:00:00/40 tag 29 ncq 524288 out
Nov 21 18:06:00 IHPAC kernel: [594026.966508]          res 40/00:48:03:ef:59/00:00:a2:00:00/40 Emask 0x50 (ATA bus error)


IHPAC:~$ dmesg | grep ata5
[    6.291403] ata5: SATA max UDMA/133 abar m1024@0xfaffe400 port 0xfaffe600 irq 22
[    6.840145] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    6.840829] ata5.00: ATA-8: ST3000DM001-9YN166, CC9C, max UDMA/133
[    6.840832] ata5.00: 5860533168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[    6.841483] ata5.00: configured for UDMA/133
[59669.062886] ata5: exception Emask 0x10 SAct 0x0 SErr 0x90202 action 0xe frozen
[59669.066958] ata5: irq_stat 0x00400000, PHY RDY changed
[59669.069852] ata5: SError: { RecovComm Persist PHYRdyChg 10B8B }
[59669.073247] ata5: hard resetting link
[59675.560102] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[59675.561576] ata5.00: configured for UDMA/133
[59675.561589] ata5: EH complete
...
[421238.151794] ata5: exception Emask 0x10 SAct 0x0 SErr 0x90202 action 0xe frozen   \
[421238.155912] ata5: irq_stat 0x00400000, PHY RDY changed                            \
[421238.158854] ata5: SError: { RecovComm Persist PHYRdyChg 10B8B }                    |
[421238.162302] ata5: hard resetting link                                              | Repeats 5 times
[421244.650101] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)                 |
[421244.651513] ata5.00: configured for UDMA/133                                      /
[421244.651525] ata5: EH complete                                                    /
...
[593676.000793] ata5.00: exception Emask 0x50 SAct 0x7fffffff SErr 0x90a02 action 0xe frozen
[593676.130479] ata5.00: irq_stat 0x00400000, PHY RDY changed
[593676.259877] ata5: SError: { RecovComm Persist HostInt PHYRdyChg 10B8B }
[593676.388864] ata5.00: failed command: WRITE FPDMA QUEUED                             \
[593676.513825] ata5.00: cmd 61/e0:00:ab:ac:30/01:00:9d:00:00/40 tag 0 ncq 245760 out    | Repeats MANY times
[593676.750610] ata5.00: status: { DRDY }                                               /
...
[593697.436610] ata5: hard resetting link
[593698.380128] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[593698.382682] ata5.00: configured for UDMA/133
[593698.382883] ata5: EH complete
[594005.248408] ata5.00: exception Emask 0x50 SAct 0x7fffffff SErr 0x90a02 action 0xe frozen
[594005.429802] ata5.00: irq_stat 0x00400000, PHY RDY changed
[594005.610614] ata5: SError: { RecovComm Persist HostInt PHYRdyChg 10B8B }
[594005.791306] ata5.00: failed command: WRITE FPDMA QUEUED                            \
[594005.972202] ata5.00: cmd 61/00:00:fb:8a:59/04:00:a2:00:00/40 tag 0 ncq 524288 out   | Repeats MANY times
[594006.337349] ata5.00: status: { DRDY }                                              /
...
[594028.228309] ata5: hard resetting link
[594029.170095] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[594029.173073] ata5.00: configured for UDMA/133
[594029.173367] ata5: EH complete
hard-drive
asked on Server Fault Nov 22, 2013 by boatcoder • edited Nov 22, 2013 by boatcoder

2 Answers

6

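I searched the internet for you and found this script, posted as the second answer in the thread I found. It prints each sd device alongside its ata name and serial number (run it as root, since hdparm needs direct access to the disk):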

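for x in /sys/block/sd*
do
    dev=$(basename "$x")
    # The sysfs symlink for the device contains the SCSI host and target,
    # e.g. .../host4/target4:0:0/...
    host=$(ls -l "$x" | grep -Eo "host[0-9]+")
    target=$(ls -l "$x" | grep -Eo "target[0-9:]*")
    # unique_id of the SCSI host is the N in ataN
    a=$(cat "/sys/class/scsi_host/$host/unique_id")
    # Last two fields of the target (e.g. 0:0) become the .NN suffix (00)
    a2=$(echo "$target" | grep -Eo "[0-9]:[0-9]$" | sed 's/://')
    serial=$(hdparm -I "/dev/$dev" | grep "Serial Number" | sed 's/^[ \t]*//')
    echo -e "$dev \t ata$a.$a2 \t $serial"
done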
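
As a cross-check on the script above, here is a minimal sketch that reads the port name straight out of the resolved sysfs path. It assumes a kernel recent enough that the path of a SATA disk contains an ataN component (for example .../ata5/host4/target4:0:0/...); that layout is an assumption about your system, and the function names ata_port_from_path and list_ata_ports are made up for this illustration.

```shell
#!/bin/sh
# Print the first ataN component found in a resolved sysfs path.
# Prints nothing if the path has no such component (e.g. a USB disk).
ata_port_from_path() {
    printf '%s\n' "$1" | grep -Eo '/ata[0-9]+/' | head -n 1 | tr -d '/'
}

# Print "sdX<TAB>ataN" for every sd* block device on this machine.
list_ata_ports() {
    for blk in /sys/block/sd*; do
        # Skip the unexpanded glob when no sd* devices exist.
        [ -e "$blk" ] || continue
        printf '%s\t%s\n' "$(basename "$blk")" \
            "$(ata_port_from_path "$(readlink -f "$blk")")"
    done
}
```

Call list_ata_ports after sourcing the file. If the two methods agree that, say, sdc is ata5, then the serial number reported for sdc is the one to look for on the drive labels.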
answered on Server Fault Nov 22, 2013 by user9517
0

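smartctl will show you the serial number of each drive present. dmesg also lists the model and firmware revision of every disk (for ata5.00 above: ST3000DM001-9YN166, firmware CC9C), which helps if the drives are not all the same model.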

If you are using only a single SATA controller (not an add-in SATA card plus the on-board ports), then the ataN number will generally map to the port number.

The whole dmesg would be best, but I'd assume it is connector 5 on the board.
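
To match serial numbers against the labels on the drives, something like the following sketch could pull the serial out of smartctl's identity output. It assumes the "Serial Number:" line that smartctl -i prints for ATA drives; the helper name serial_from_smartctl is hypothetical.

```shell
#!/bin/sh
# Read `smartctl -i` output on stdin and print the value of the first
# "Serial Number:" line. Prints nothing if no such line is present.
serial_from_smartctl() {
    sed -n 's/^Serial Number:[[:space:]]*//p' | head -n 1
}
```

Typical use, as root: smartctl -i /dev/sda | serial_from_smartctl. Repeat for each /dev/sdX and compare the results with the sticker on each drive.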

answered on Server Fault Nov 22, 2013 by Jon Brauer

User contributions licensed under CC BY-SA 3.0