Verify that a disk is bad and that it needs to be replaced.
zpool status
returns:
  pool: testpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 2h4m with 0 errors on Sun Jun 9 00:28:24 2013
config:

        NAME                         STATE     READ WRITE CKSUM
        testpool                     DEGRADED     0     0     0
          raidz1-0                   DEGRADED     0     0     0
            ata-ST3300620A_5QF0MJFP  ONLINE       0     0     0
            ata-ST3300831A_5NF0552X  UNAVAIL      0     0     0
            ata-ST3200822A_5LJ1CHMS  ONLINE       0     0     0
            ata-ST3200822A_3LJ0189C  ONLINE       0     0     0

errors: No known data errors
NOTE: This shows that one disk is unavailable.
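On a pool with many disks, the failed device can be picked out of the status output with a small filter. The following is a minimal sketch, not part of the original procedure: unhealthy_devices is a hypothetical helper name, and it assumes the usual multi-line layout of 'zpool status' with a NAME/STATE header row and a trailing "errors:" line.

```shell
# unhealthy_devices: read `zpool status` output on stdin and print the name
# and state of every config entry (pool, vdev or disk) that is not healthy.
# Hypothetical helper; assumes a NAME/STATE header and a trailing errors: line.
unhealthy_devices() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The errors: line marks the end of the config table.
        $1 == "errors:" { in_config = 0 }
        # Report any entry whose state is one of the unhealthy states.
        in_config && $2 ~ /^(DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ { print $1, $2 }
    '
}

# Usage (requires ZFS; testpool is the pool from this example):
#   zpool status testpool | unhealthy_devices
```

For the output above this would list testpool and raidz1-0 as DEGRADED and ata-ST3300831A_5NF0552X as UNAVAIL; the last entry is the disk to replace.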
NOTE: The new disk is ata-ST3500320AS_9QM03ATQ.
zpool replace testpool /dev/disk/by-id/ata-ST3300831A_5NF0552X /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ
zpool offline testpool /dev/disk/by-id/ata-ST3300831A_5NF0552X
zpool detach testpool /dev/disk/by-id/ata-ST3300831A_5NF0552X
NOTE: Here the old device is specified first followed by the new device.
If the bad device has already been removed from the system, this might fail with the following error:
cannot offline /dev/disk/by-id/ata-ST3300831A_5NF0552X: no such device in pool
In that case, refer to the old device by its GUID instead of its path. There are various ways to determine a GUID:
zdb                 # Find GUID.
zdb -l /dev/sda1    # In case the 'zdb' command does not work.
zpool status -g     # Find GUID.
zpool status -L     # Find device name, resolving links.
Try to get the GUID using zdb:
zdb
testpool:
    version: 28
    name: 'testpool'
    state: 0
    txg: 162804
    pool_guid: 14829240649900366534
    hostname: 'BigMamba'
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 14829240649900366534
        children[0]:
            type: 'raidz'
            id: 0
            guid: 5355850150368902284
            nparity: 1
            metaslab_array: 31
            metaslab_shift: 32
            ashift: 9
            asize: 791588896768
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 11426107064765252810
                path: '/dev/disk/by-id/ata-ST3300620A_5QF0MJFP-part2'
                phys_path: '/dev/gptid/73b31683-537f-11e2-bad7-50465d4eb8b0'
                whole_disk: 1
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 15935140517898495532
                path: '/dev/disk/by-id/ata-ST3300831A_5NF0552X-part2'
                phys_path: '/dev/gptid/746c949a-537f-11e2-bad7-50465d4eb8b0'
                whole_disk: 1
                create_txg: 4
            children[2]:
                type: 'disk'
                id: 2
                guid: 7183706725091321492
                path: '/dev/disk/by-id/ata-ST3200822A_5LJ1CHMS-part2'
                phys_path: '/dev/gptid/7541115a-537f-11e2-bad7-50465d4eb8b0'
                whole_disk: 1
                create_txg: 4
            children[3]:
                type: 'disk'
                id: 3
                guid: 17196042497722925662
                path: '/dev/disk/by-id/ata-ST3200822A_3LJ0189C-part2'
                phys_path: '/dev/gptid/760a94ee-537f-11e2-bad7-50465d4eb8b0'
                whole_disk: 1
                create_txg: 4
    features_for_read:
NOTE: The GUID of the failed disk is 15935140517898495532. It is the guid value in the children[1] entry, whose path names ata-ST3300831A_5NF0552X.
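Reading the GUID out of a long zdb listing by eye is error-prone, so the lookup can be scripted. This is only a sketch under assumptions: guid_for_disk is a hypothetical helper name, and it relies on zdb printing each vdev's guid line before its path line, one field per line, as in the zdb output shown here.

```shell
# guid_for_disk DISK: read `zdb` output on stdin and print the guid of the
# first vdev whose path contains DISK.
# Hypothetical helper; relies on the guid: line preceding the path: line.
guid_for_disk() {
    awk -v disk="$1" '
        # Remember the most recent guid seen (pool, raidz or disk level).
        $1 == "guid:" { guid = $2 }
        # A matching path belongs to the vdev whose guid was just recorded.
        $1 == "path:" && index($2, disk) > 0 { print guid; exit }
    '
}

# Usage (requires ZFS; disk name taken from this example):
#   zdb | guid_for_disk ata-ST3300831A_5NF0552X
```

The printed value can then be passed to 'zpool offline' and 'zpool replace' in place of the device path.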
Use the GUID to offline the old device:
zpool offline testpool 15935140517898495532
And check this has worked:
zpool status
  pool: testpool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0 in 2h4m with 0 errors on Sun Jun 9 00:28:24 2013
config:

        NAME                         STATE     READ WRITE CKSUM
        testpool                     DEGRADED     0     0     0
          raidz1-0                   DEGRADED     0     0     0
            ata-ST3300620A_5QF0MJFP  ONLINE       0     0     0
            ata-ST3300831A_5NF0552X  OFFLINE      0     0     0
            ata-ST3200822A_5LJ1CHMS  ONLINE       0     0     0
            ata-ST3200822A_3LJ0189C  ONLINE       0     0     0

errors: No known data errors
Then replace the old device with the new one:
zpool replace testpool 15935140517898495532 /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ
And check again this has worked:
zpool status
  pool: testpool
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Jun 9 01:44:36 2013
        408M scanned out of 419G at 20,4M/s, 5h50m to go
        101M resilvered, 0,10% done
config:

        NAME                             STATE     READ WRITE CKSUM
        testpool                         DEGRADED     0     0     0
          raidz1-0                       DEGRADED     0     0     0
            ata-ST3300620A_5QF0MJFP      ONLINE       0     0     0
            replacing-1                  OFFLINE      0     0     0
              ata-ST3300831A_5NF0552X    OFFLINE      0     0     0
              ata-ST3500320AS_9QM03ATQ   ONLINE       0     0     0  (resilvering)
            ata-ST3200822A_5LJ1CHMS      ONLINE       0     0     0
            ata-ST3200822A_3LJ0189C      ONLINE       0     0     0

errors: No known data errors
NOTE: If the old disk has already been removed from the system and the new disk has taken over its device name, the following commands can be used instead:
zpool offline testpool sdd
zpool remove testpool sdd
zpool attach -f testpool sdc sdd
Before the pool returns to normal, it has to resilver, that is, sync the data over to the new disk.
The status of the resilvering can be checked:
zpool status testpool
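Because a resilver can take hours, it may be convenient to poll the status instead of re-running the command by hand. The sketch below is an assumption-laden example rather than part of the original procedure: resilver_in_progress is a hypothetical helper name, and it relies on the "resilver in progress" wording of the scan line in the status output above.

```shell
# resilver_in_progress: read `zpool status` output on stdin and succeed
# (exit status 0) while the scan line reports a running resilver.
# Hypothetical helper; matches the "resilver in progress" text of the scan line.
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# Usage (requires ZFS; polls the pool from this example once a minute):
#   while zpool status testpool | resilver_in_progress; do
#       sleep 60
#   done
#   echo "resilver finished"
```

The one-minute interval is arbitrary; checking the final 'zpool status testpool' output by hand afterwards is still worthwhile to confirm that the pool is ONLINE and reports no errors.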
Physically remove the old drive.