Reading SMART results for failing disk


I recently started having trouble with my Dell laptop and would appreciate any recommendations on the next steps for my issue. I have portions of the dmesg log below to show the errors I'm getting.

My laptop has 6 GB RAM and a 1 TB conventional hard drive (no SSD) running Ubuntu 12.10 Quantal. Every so often (maybe 1-2x each week) I noticed my computer wouldn't behave as expected, so as we do in the Windows world, I rebooted after a couple of initial troubleshooting steps.

What was weird about this was that upon boot-up, my boot screen would report fsck errors and prompt me to continue to fix them; I would answer yes, it would repair, and I would boot up, log in, and keep going.

Now it seems to have gotten a bit worse. My root partition is continuously being mounted read-only, so I can't really do much once I appear to successfully boot up and log in.

I have now booted into a live CD environment via PXE boot in order to troubleshoot the hard drive. Here are my observations.

First, the root partition is encrypted; when I boot-up normally, I'm prompted for the passphrase to decrypt; after that, /dev/sda5 is accessible as something like /dev/mapper/ubuntu-root and can be mounted.

# file -sL /dev/sda5
/dev/sda5: LUKS encrypted file, ver 1 ...

My /boot partition is /dev/sda1 formatted as ext2:

# fsck -N /dev/sda1
fsck from util-linux 2.20.1
[/sbin/fsck.ext2 (1) -- /dev/sda1] fsck.ext2 /dev/sda1 

I can decrypt my root partition manually from the live CD environment:

# cryptsetup luksOpen /dev/sda5 ubuntu-root
Enter passphrase for /dev/sda5:

After that, I thought I could run e2fsck, but apparently not this time:

# e2fsck /dev/mapper/ubuntu-root

I found this page.

Following some of these steps, I ran lvmdiskscan, lvdisplay, and vgdisplay for /dev/ubuntu/root. I couldn't mount /dev/ubuntu/root as stated because there was no root in /dev/ubuntu.

I ran lvscan and got:

# lvscan
inactive          '/dev/ubuntu/root' [925.32 GiB] inherit
ACTIVE            '/dev/ubuntu/swap_1' [5.91 GiB] inherit

# modprobe dm-mod
# vgchange -ay
device-mapper: create ioctl on ubuntu-root failed: Device or resource busy
1 logical volume(s) in volume group "ubuntu" now active

I wasn't aware of LVM being part of the mix; previously, I was decrypting and then trying to mount /dev/mapper/ubuntu-root directly to some place like /mnt/mountpoint.

I think I also tried mount -o remount,rw to try to get it to mount read-write. When I did this, I observed that e2fsck would report errors on the partition.

This time, my launcher doesn't show the drive listed, so I can't click on it and "eject" it; it does have /dev/sda1, but I'm not worried about that.

EDIT: I rebooted and came back in the live CD environment. This time, I used the launcher icon to decrypt the drive, then unmounted. When I ran e2fsck, it now shows up clean.

EDIT: Running lvscan now gives me:

# lvscan
ACTIVE            '/dev/ubuntu/root' [925.32 GiB] inherit
ACTIVE            '/dev/ubuntu/swap_1' [5.91 GiB] inherit

I realized something. I'm not 100% clear on how the root partition is encrypted, but it feels like LUKS on top of LVM on the /dev/sda5 device. If so, then is it safe for me to run e2fsck directly on /dev/mapper/ubuntu-root?

I feel like I should run some sort of LVM checker on ubuntu-root, and I should run e2fsck on whatever the LVM volume group or logical volume ends up being, which shouldn't be /dev/ubuntu/root. Both /dev/mapper/ubuntu-root and /dev/ubuntu/root are symlinks to /dev/dm-1. However, according to this, I should have no problem running e2fsck on /dev/ubuntu/root.

I have now rebooted into the live CD environment at least twice and consistently gotten a clean report from e2fsck when running on my root partition. Next, I'm going to try to go back into the regular environment to see if it still mounts read-only.

I've had this laptop since 2012 when it was new, so it's still a bit young, but it's possible the hard drive itself might be close to the end of its life.

EDIT: Here is the dmesg output I currently see when I boot up with the hard drive. Does this mean my hard drive is physically failing? Is there something I can run to confirm or to repair, or is it time to go out and buy a replacement?

[    2.311573] sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[    2.311686] sd 0:0:0:0: [sda] Write Protect is off
[    2.311691] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    2.311724] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    2.356501]  sda: sda1 sda2 < sda5 >
[    2.357377] sd 0:0:0:0: [sda] Attached SCSI disk
...
[   51.070290] sd 0:0:0:0: [sda] Unhandled sense code
[   51.070292] sd 0:0:0:0: [sda]  
[   51.070294] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[   51.070296] sd 0:0:0:0: [sda]  
[   51.070297] Sense Key : Medium Error [current] [descriptor]
[   51.070300] Descriptor sense data with sense descriptors (in hex):
[   51.070302]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 
[   51.070310]         44 69 fa 00 
[   51.070314] sd 0:0:0:0: [sda]  
[   51.070317] Add. Sense: Unrecovered read error - auto reallocate failed
[   51.070319] sd 0:0:0:0: [sda] CDB: 
[   51.070320] Read(10): 28 00 44 69 fa 00 00 00 08 00
[   51.070327] end_request: I/O error, dev sda, sector 1147795968
[   51.070352] ata1: EH complete
[   81.970261] ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x6 frozen
[   81.970273] ata1.00: failed command: READ FPDMA QUEUED
[   81.970286] ata1.00: cmd 60/08:00:f0:c1:c7/00:00:2b:00:00/40 tag 0 ncq 4096 in
[   81.970286]          res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[   81.970292] ata1.00: status: { DRDY }
[   81.970297] ata1.00: failed command: READ FPDMA QUEUED
[   81.970316] ata1.00: cmd 60/08:08:08:c1:c7/00:00:2b:00:00/40 tag 1 ncq 4096 in
[   81.970316]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[   81.970318] ata1.00: status: { DRDY }
[   81.970319] ata1.00: failed command: READ FPDMA QUEUED
[   81.970323] ata1.00: cmd 60/08:10:00:fa:69/00:00:44:00:00/40 tag 2 ncq 4096 in
[   81.970323]          res 40/00:00:00:fa:69/00:00:44:00:00/40 Emask 0x4 (timeout)
[   81.970325] ata1.00: status: { DRDY }
[   81.970326] ata1.00: failed command: READ FPDMA QUEUED
[   81.970330] ata1.00: cmd 60/08:18:10:c1:c7/00:00:2b:00:00/40 tag 3 ncq 4096 in
[   81.970330]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[   81.970332] ata1.00: status: { DRDY }
[   81.970333] ata1.00: failed command: READ FPDMA QUEUED
[   81.970337] ata1.00: cmd 60/08:20:18:c1:c7/00:00:2b:00:00/40 tag 4 ncq 4096 in
[   81.970337]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[   81.970339] ata1.00: status: { DRDY }
[   81.970340] ata1.00: failed command: READ FPDMA QUEUED
[   81.970344] ata1.00: cmd 60/08:28:20:c1:c7/00:00:2b:00:00/40 tag 5 ncq 4096 in
[   81.970344]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[   81.970346] ata1.00: status: { DRDY }
...
[   81.970512] ata1.00: failed command: READ FPDMA QUEUED
[   81.970516] ata1.00: cmd 60/08:f0:e8:c1:c7/00:00:2b:00:00/40 tag 30 ncq 4096 in
[   81.970516]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[   81.970518] ata1.00: status: { DRDY }
[   81.970521] ata1: hard resetting link
[   82.298073] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   82.304897] ata1.00: ACPI cmd 00/00:00:00:00:00:a0 (NOP) rejected by device (Stat=0x51 Err=0x04)
[   82.311832] ata1.00: ACPI cmd 00/00:00:00:00:00:a0 (NOP) rejected by device (Stat=0x51 Err=0x04)
[   82.312023] ata1.00: configured for UDMA/133
[   82.326050] ata1.00: device reported invalid CHS sector 0
[   82.326060] ata1.00: device reported invalid CHS sector 0
[   82.326066] ata1.00: device reported invalid CHS sector 0
[   82.326070] ata1.00: device reported invalid CHS sector 0
[   82.326074] ata1.00: device reported invalid CHS sector 0
...
[   82.326149] ata1.00: device reported invalid CHS sector 0
[   82.326151] ata1.00: device reported invalid CHS sector 0
[   82.334045] sd 0:0:0:0: [sda]  
[   82.334055] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[   82.334060] sd 0:0:0:0: [sda]  
[   82.334064] Sense Key : Aborted Command [current] [descriptor]
[   82.334072] Descriptor sense data with sense descriptors (in hex):
[   82.334076]         72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 
[   82.334100]         00 00 00 00 
[   82.334110] sd 0:0:0:0: [sda]  
[   82.334122] Add. Sense: No additional sense information
[   82.334124] sd 0:0:0:0: [sda] CDB: 
[   82.334125] Read(10): 28 00 2b c7 c1 f0 00 00 08 00
[   82.334133] end_request: I/O error, dev sda, sector 734511600
[   82.334151] sd 0:0:0:0: [sda]  
[   82.334153] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[   82.334154] sd 0:0:0:0: [sda]  
[   82.334155] Sense Key : Aborted Command [current] [descriptor]
[   82.334157] Descriptor sense data with sense descriptors (in hex):
[   82.334158]         72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 
[   82.334166]         00 00 00 00 
[   82.334169] sd 0:0:0:0: [sda]  
[   82.334171] Add. Sense: No additional sense information
[   82.334173] sd 0:0:0:0: [sda] CDB: 
[   82.334174] Read(10): 28 00 2b c7 c1 08 00 00 08 00
[   82.334180] end_request: I/O error, dev sda, sector 734511368
[   82.334187] sd 0:0:0:0: [sda]  
...
[   82.335602] Sense Key : Aborted Command [current] [descriptor]
[   82.335605] Descriptor sense data with sense descriptors (in hex):
[   82.335606]         72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 
[   82.335619]         00 00 00 00 
[   82.335624] sd 0:0:0:0: [sda]  
[   82.335627] Add. Sense: No additional sense information
[   82.335630] sd 0:0:0:0: [sda] CDB: 
[   82.335631] Read(10): 28 00 2b c7 c1 e8 00 00 08 00
[   82.335641] end_request: I/O error, dev sda, sector 734511592
[   82.335649] ata1: EH complete
[  142.882970] ata1.00: exception Emask 0x0 SAct 0x1ffff SErr 0x0 action 0x6 frozen
[  142.882983] ata1.00: failed command: READ FPDMA QUEUED
[  142.882996] ata1.00: cmd 60/08:00:40:41:4d/00:00:37:00:00/40 tag 0 ncq 4096 in
[  142.882996]          res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[  142.883002] ata1.00: status: { DRDY }
[  142.883007] ata1.00: failed command: READ FPDMA QUEUED
[  142.883018] ata1.00: cmd 60/08:08:00:fa:69/00:00:44:00:00/40 tag 1 ncq 4096 in
[  142.883018]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  142.883023] ata1.00: status: { DRDY }
[  142.883027] ata1.00: failed command: READ FPDMA QUEUED
[  142.883038] ata1.00: cmd 60/08:10:90:c8:48/00:00:26:00:00/40 tag 2 ncq 4096 in
[  142.883038]          res 40/00:00:00:fa:69/00:00:44:00:00/40 Emask 0x4 (timeout)
[  142.883043] ata1.00: status: { DRDY }
...
[  142.883283] ata1.00: status: { DRDY }
[  142.883288] ata1.00: failed command: WRITE FPDMA QUEUED
[  142.883298] ata1.00: cmd 61/58:78:e0:f5:08/00:00:00:00:00/40 tag 15 ncq 45056 out
[  142.883298]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  142.883303] ata1.00: status: { DRDY }
[  142.883308] ata1.00: failed command: WRITE FPDMA QUEUED
[  142.883318] ata1.00: cmd 61/40:80:a8:30:4f/00:00:28:00:00/40 tag 16 ncq 32768 out
[  142.883318]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  142.883323] ata1.00: status: { DRDY }
[  142.883333] ata1: hard resetting link
[  143.202862] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  143.209830] ata1.00: ACPI cmd 00/00:00:00:00:00:a0 (NOP) rejected by device (Stat=0x51 Err=0x04)
[  143.217063] ata1.00: ACPI cmd 00/00:00:00:00:00:a0 (NOP) rejected by device (Stat=0x51 Err=0x04)
[  143.217251] ata1.00: configured for UDMA/133
[  143.230840] ata1.00: device reported invalid CHS sector 0
[  143.230848] ata1.00: device reported invalid CHS sector 0
[  143.230852] ata1.00: device reported invalid CHS sector 0
[  143.230866] ata1.00: device reported invalid CHS sector 0
...
[  143.232581] sd 0:0:0:0: [sda]  
[  143.232583] Add. Sense: No additional sense information
[  143.232584] sd 0:0:0:0: [sda] CDB: 
[  143.232585] Write(10): 2a 00 28 4f 30 a8 00 00 40 00
[  143.232591] end_request: I/O error, dev sda, sector 676278440
[  143.232597] Buffer I/O error on device dm-1, logical block 84471317
[  143.232599] Buffer I/O error on device dm-1, logical block 84471318
[  143.232601] Buffer I/O error on device dm-1, logical block 84471319
[  143.232602] Buffer I/O error on device dm-1, logical block 84471320
[  143.232604] Buffer I/O error on device dm-1, logical block 84471321
[  143.232606] Buffer I/O error on device dm-1, logical block 84471322
[  143.232607] Buffer I/O error on device dm-1, logical block 84471323
[  143.232609] Buffer I/O error on device dm-1, logical block 84471324
[  143.232611] EXT4-fs warning (device dm-1): ext4_end_bio:250: I/O error writing to inode 19529846 (offset 0 size 32768 starting block 84471317)
[  143.232615] ata1: EH complete
[  143.256074] JBD2: Spotted dirty metadata buffer (dev = dm-1, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
[  143.256155] journal commit I/O error
[  143.256165] journal commit I/O error
[  143.256612] EXT4-fs error (device dm-1): ext4_find_entry:1209: inode #20054519: comm zeitgeist-datah: reading directory lblock 0
[  143.256618] EXT4-fs (dm-1): Remounting filesystem read-only
[  143.268860] EXT4-fs error (device dm-1): ext4_find_entry:1209: inode #26214788: comm nautilus: reading directory lblock 0
[  143.268862] EXT4-fs error (device dm-1): ext4_find_entry:1209: inode #19529818: comm compiz: reading directory lblock 0
[  143.270777] EXT4-fs error (device dm-1): ext4_journal_start_sb:370: Detected aborted journal
[  143.270821] EXT4-fs error (device dm-1): ext4_journal_start_sb:370: Detected aborted journal
[  143.270826] EXT4-fs error (device dm-1): ext4_journal_start_sb:370: Detected aborted journal
[  195.831825] ata1.00: exception Emask 0x0 SAct 0x3fff SErr 0x0 action 0x6 frozen
[  195.831835] ata1.00: failed command: READ FPDMA QUEUED
[  195.831844] ata1.00: cmd 60/60:00:20:5e:ce/00:00:36:00:00/40 tag 0 ncq 49152 in
[  195.831844]          res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[  195.831848] ata1.00: status: { DRDY }
[  195.831852] ata1.00: failed command: READ FPDMA QUEUED
[  195.831859] ata1.00: cmd 60/90:08:e8:f6:de/00:00:36:00:00/40 tag 1 ncq 73728 in
[  195.831859]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  195.831862] ata1.00: status: { DRDY }
...
[  196.196650] Add. Sense: No additional sense information
[  196.196653] sd 0:0:0:0: [sda] CDB: 
[  196.196655] Read(10): 28 00 36 d7 9c 98 00 00 20 00
[  196.196666] end_request: I/O error, dev sda, sector 920099992
[  196.196694] ata1: EH complete
[  196.219147] EXT4-fs error (device dm-1): ext4_find_entry:1209: inode #28970197: comm ubuntuone-launc: reading directory lblock 0
[  196.226652] EXT4-fs error (device dm-1): ext4_find_entry:1209: inode #29229096: comm kworker/u:5: reading directory lblock 0
[  196.226698] EXT4-fs error (device dm-1): ext4_find_entry:1209: inode #19529749: comm zeitgeist-datah: reading directory lblock 0
[  196.227074] EXT4-fs error (device dm-1): ext4_find_entry:1209: inode #30411827: comm compiz: reading directory lblock 0
[  196.227132] EXT4-fs error (device dm-1): ext4_find_entry:1209: inode #20185725: comm gnome-settings-: reading directory lblock 0
[  196.227199] EXT4-fs error (device dm-1): ext4_find_entry:1209: inode #19529825: comm nautilus: reading directory lblock 0
[  196.250118] JBD2: Detected IO errors while flushing file data on dm-1-8
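
As a rough aid for reading the log above, the `Read(10)` and `Write(10)` CDB lines can be decoded by hand: counting from byte 0, bytes 2 to 5 hold the big-endian LBA and bytes 7 to 8 hold the transfer length in sectors. The sketch below does that arithmetic in shell. The function name `cdb10_decode` is made up for illustration, and it assumes a shell whose arithmetic accepts hex constants.

```shell
#!/bin/sh
# Decode a 10-byte SCSI Read(10)/Write(10) CDB given as ten hex bytes.
# $1 = opcode, $2 = flags, $3..$6 = LBA (big-endian),
# $7 = group number, $8..$9 = transfer length in sectors.
# (cdb10_decode is a hypothetical helper name, not part of any tool.)
cdb10_decode() {
    lba=$(( (0x$3 << 24) | (0x$4 << 16) | (0x$5 << 8) | 0x$6 ))
    count=$(( (0x$8 << 8) | 0x$9 ))
    echo "lba=$lba sectors=$count"
}

# CDBs copied from the dmesg output above:
cdb10_decode 28 00 44 69 fa 00 00 00 08 00   # lba=1147795968 sectors=8
cdb10_decode 2a 00 28 4f 30 a8 00 00 40 00   # lba=676278440 sectors=64
```

The decoded LBAs agree with the `end_request: I/O error, dev sda, sector ...` lines that follow each CDB, which is a quick way to confirm which requests failed.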

EDIT: Now that all the primary data have been transferred over, I thought I'd try taking a VM as well for convenience so I don't have to rebuild it. I was pleasantly surprised when rsync reported an error and then tried again automatically to transfer the damaged file a second time:

rsync: read errors mapping "/media/ubuntu/423e7378-a121-c057-63ab-224c92293d6b/opt/vmware/vm2/vm2-s016.vmdk": Input/output error (5)
...
WARNING: vmware/vm2/vm2-s016.vmdk failed verification -- update discarded (will try again).
vmware/vm2/vm2-s016.vmdk
  1218576384  57%    7.66MB/s    0:01:56  
...
ERROR: vmware/Win7VPN/Win7-s016.vmdk failed verification -- update discarded.
...
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1070) [sender=3.0.9]

Not a big loss.

I have now installed Smartmontools via the Ubuntu 13.10 live CD environment in order to run the tests on the hard drive using this page as a guide:

# apt-get install smartmontools
# smartctl --info /dev/sda
# smartctl --capabilities /dev/sda

The overall health check currently reports PASSED, before running any tests:

# smartctl --health /dev/sda
smartctl 6.2 2013-04-20 r3812 [i686-linux-3.11.0-12-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Since I've salvaged all the data I want and the hard drive isn't otherwise in use, I decided to go for foreground (captive) mode.

# smartctl --captive --test=short /dev/sda
smartctl 6.2 2013-04-20 r3812 [i686-linux-3.11.0-12-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in captive mode".
Drive command "Execute SMART Short self-test routine immediately in captive mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Tue Dec 30 14:10:54 2014

After two minutes, I viewed the results:

# smartctl --log=selftest /dev/sda
smartctl 6.2 2013-04-20 r3812 [i686-linux-3.11.0-12-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short captive       Completed: read failure       90%      4535         890139008
# 2  Short offline       Completed without error       00%         0         -
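
To relate an LBA like the one in `LBA_of_first_error` to a filesystem block, one possible approach is to infer the offset from an error where the kernel printed both numbers. In the dmesg output earlier, the failed write at sda sector 676278440 lines up with dm-1 logical block 84471317. The sketch below rests on several assumptions that are not confirmed anywhere in this post: 4096-byte filesystem blocks over 512-byte sectors, a root logical volume that is one contiguous segment, and a failing LBA that actually falls inside that volume. The function names are made up for illustration.

```shell
#!/bin/sh
SECTORS_PER_BLOCK=8   # assumption: 4096-byte ext4 blocks, 512-byte sectors

# Offset in sectors between the start of the disk and filesystem block 0,
# inferred from one sector/block pair reported for the same failed request.
fs_offset() {          # $1 = disk sector, $2 = filesystem block
    echo $(( $1 - $2 * $SECTORS_PER_BLOCK ))
}

# Filesystem block containing a given disk sector, for a known offset.
sector_to_block() {    # $1 = disk sector, $2 = offset in sectors
    echo $(( ($1 - $2) / $SECTORS_PER_BLOCK ))
}

offset=$(fs_offset 676278440 84471317)
echo "offset: $offset sectors"                         # offset: 507904 sectors
echo "block: $(sector_to_block 890139008 "$offset")"   # block: 111203888
```

If the assumptions hold, a block number like this is the kind of input that `debugfs` accepts for its `icheck` request, which reports the inode that owns a block.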

Next test:

# smartctl --captive --test=long /dev/sda
smartctl 6.2 2013-04-20 r3812 [i686-linux-3.11.0-12-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in captive mode".
Drive command "Execute SMART Extended self-test routine immediately in captive mode" successful.
Testing has begun.
Please wait 221 minutes for test to complete.
Test will complete after Tue Dec 30 17:55:17 2014

I can post the results after three hours.

This command was also mentioned on the page I referenced.

# smartctl --all /dev/sda

It looks like it prints everything of interest at once. It looks like it might've interrupted my long test, so I sent the command to re-run. I'm unclear whether it has no effect when the test is already running, or if it restarts the test from the beginning.

EDIT: I ran the extended test. After 221 minutes, I looked at the log. It looks like it didn't run to completion for some reason. sigh

# smartctl --log=selftest /dev/sda
smartctl 6.2 2013-04-20 r3812 [i686-linux-3.11.0-12-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Offline             Interrupted (host reset)      90%      4535         -
# 2  Extended captive    Self-test routine in progress 90%      4535         -
# 3  Offline             Interrupted (host reset)      90%      4535         -
# 4  Extended captive    Self-test routine in progress 90%      4535         -
# 5  Offline             Interrupted (host reset)      90%      4535         -
# 6  Extended captive    Self-test routine in progress 90%      4535         -
# 7  Short captive       Completed: read failure       90%      4535         890139008
# 8  Short offline       Completed without error       00%         0         -
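
With several entries in the log, it can help to filter for the rows that actually recorded a failing LBA. The following is a minimal sketch that assumes the column layout shown above, where the last column is either an LBA or `-`. The function name `selftest_failures` is hypothetical.

```shell
#!/bin/sh
# Read `smartctl --log=selftest` output on stdin and print the entry number
# and first failing LBA for every row whose last column is not "-".
# Handles both "# 7 ..." and "#10 ..." styles of entry numbering.
selftest_failures() {
    awk '/^# *[0-9]+ / && $NF != "-" {
             n = ($1 == "#") ? $2 : substr($1, 2)
             print "entry " n " first_error_lba " $NF
         }'
}

# Two rows copied from the log above:
selftest_failures <<'EOF'
# 7  Short captive       Completed: read failure       90%      4535         890139008
# 8  Short offline       Completed without error       00%         0         -
EOF
# prints: entry 7 first_error_lba 890139008
```

On a live system the same function could be fed directly, for example `smartctl --log=selftest /dev/sda | selftest_failures`.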

It would be nice to be able to monitor the test running somehow without interrupting it. I don't know if I did something to cause it to interrupt; it's been sitting here for the past four hours.
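
One way this might be approached is to start the test without `--captive`, so that it runs in the background, and then poll the self-test log. The sketch below assumes the drive lists an in-progress test as a log row, as it does in the output above; not every drive does, and the "Self-test execution status" field printed by `smartctl --capabilities` is another place to look. The function names are hypothetical, and the device and poll interval are supplied by the caller.

```shell
#!/bin/sh
# Print the "Remaining" percentage of the first in-progress row found in
# `smartctl --log=selftest` output on stdin; print nothing if there is none.
selftest_remaining() {
    awk '/^# *[0-9]+ / && /in progress/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^[0-9]+%$/) { print $i; exit }
         }'
}

# Poll the log until no test is reported as in progress.
# $1 = device (e.g. /dev/sda), $2 = seconds between polls.
watch_selftest() {
    while left=$(smartctl --log=selftest "$1" | selftest_remaining) && [ -n "$left" ]; do
        echo "$(date '+%H:%M:%S') remaining: $left"
        sleep "$2"
    done
    echo "no self-test in progress"
}
```

A session might then look like `smartctl --test=long /dev/sda` followed by `watch_selftest /dev/sda 600`. Whether polling is harmless in captive mode is a separate question, and the "Interrupted (host reset)" rows above suggest it may not be.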

I tried the conveyance test now, but this doesn't give me any idea of how long it'll take to run:

# smartctl --captive --test=conveyance /dev/sda
smartctl 6.2 2013-04-20 r3812 [i686-linux-3.11.0-12-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Conveyance Self-test functions not supported

Sending command: "Execute SMART Conveyance self-test routine immediately in captive mode".
Drive command "Execute SMART Conveyance self-test routine immediately in captive mode" successful.
Testing has begun.

What does all this tell me about my disk? Can it be salvaged? Can I reformat or check for bad sectors, maybe mark somehow to avoid using the damaged area(s)?

In the past, I've simply put the disk aside and replaced with a new one, but I'm more curious this time to understand the output better to know what else can be done.

Also does anyone have any recommendations on how frequently to run the short and the long tests? What about the conveyance test or other tests? When/how often should those be run?

Thanks!

linux
hard-drive
rsync
smart
asked on Super User Dec 29, 2014 by jia103 • edited Dec 30, 2014 by jia103

0 Answers

Nobody has answered this question yet.


User contributions licensed under CC BY-SA 3.0