ZFS - Troubleshooting - Replace a Disk

Check the Pool

Verify that a disk is bad and that it needs to be replaced.

zpool status

returns:

  pool: testpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 2h4m with 0 errors on Sun Jun  9 00:28:24 2013
config:
 
        NAME                         STATE     READ WRITE CKSUM
        testpool                     DEGRADED     0     0     0
          raidz1-0                   DEGRADED     0     0     0
            ata-ST3300620A_5QF0MJFP  ONLINE       0     0     0
            ata-ST3300831A_5NF0552X  UNAVAIL      0     0     0
            ata-ST3200822A_5LJ1CHMS  ONLINE       0     0     0
            ata-ST3200822A_3LJ0189C  ONLINE       0     0     0
 
errors: No known data errors

NOTE: This shows that one disk, ata-ST3300831A_5NF0552X, is unavailable (UNAVAIL), which leaves the pool in a DEGRADED state.
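On pools with many disks it can help to filter the status output down to the entries that are not healthy. The following is a minimal sketch, not part of the original procedure: `unhealthy_devices` is a hypothetical helper name, and it assumes the five-column config table layout shown above (NAME, STATE, READ, WRITE, CKSUM).

```shell
# Print every entry in the config table whose STATE is not ONLINE.
# Reads 'zpool status' text on stdin so it can be tried on saved output.
unhealthy_devices() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The "errors:" line marks the end of the table.
        /^errors:/ { in_config = 0 }
        # Table rows have five columns; report the ones that are not ONLINE.
        in_config && NF >= 5 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Sample rows copied from the status output above.
sample='        NAME                         STATE     READ WRITE CKSUM
        testpool                     DEGRADED     0     0     0
          raidz1-0                   DEGRADED     0     0     0
            ata-ST3300620A_5QF0MJFP  ONLINE       0     0     0
            ata-ST3300831A_5NF0552X  UNAVAIL      0     0     0
            ata-ST3200822A_5LJ1CHMS  ONLINE       0     0     0
            ata-ST3200822A_3LJ0189C  ONLINE       0     0     0
errors: No known data errors'

result=$(printf '%s\n' "$sample" | unhealthy_devices)
printf '%s\n' "$result"
```

On a live system the same function would be fed directly: zpool status testpool | unhealthy_devices. The pool and the raidz vdev are listed as well, because they inherit the DEGRADED state from the failed disk.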


Add a New Disk

  • Add a new disk.
  • Optionally remove the old disk.

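NOTE: Only remove the old drive at this point if the pool is redundant (mirror or raidz). In a non-redundant pool the old drive must stay connected so that its data can be copied to the new one.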


Replace the Old Device

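zpool offline testpool c1t1d0          # Optional: take the failing disk offline first.
zpool replace testpool c1t1d0 c2t0d0   # Replace old device c1t1d0 with new device c2t0d0.
# 'zpool remove' is not needed: the old device is detached automatically when the replace completes.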

NOTE: Here the old device is specified first followed by the new device.

  • If the pool is a redundant configuration, data will be copied from other good disks to the new disk.
  • If the pool is not redundant, data will be copied from the old device to the new device, so the old device must still be connected and readable.
  • When the copy finishes, ZFS detaches the old device from the pool automatically.
  • Once that is complete, the old device can be physically removed.

NOTE: If the old disk has already been removed from the system and the new disk has taken over the same device name, name the device only once:

zpool replace testpool sdd

NOTE: 'zpool attach' does not replace a failed device. It attaches sdd as a mirror of the existing device sdc, so use it only when that is the intent:

zpool attach -f testpool sdc sdd

Wait For Resilvering to Complete

Before the pool returns to normal, ZFS has to copy data onto the new disk.

  • It will remain in a degraded status while the data syncs.
  • This data syncing process is called resilvering.
  • It may take a very long time depending on the size of the disks and on how much data is on them.

The status of the resilvering can be checked:

zpool status testpool
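Checking by hand gets tedious on a long resilver, so the check can be put in a loop. This is a sketch under assumptions: `resilver_in_progress` and `wait_for_resilver` are hypothetical helper names, the 60 second default interval is arbitrary, and the loop relies on the scan line of 'zpool status' containing the text "resilver in progress" while a resilver is running.

```shell
# Succeed if the status text on stdin reports a running resilver.
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# Poll 'zpool status' for the given pool until no resilver is running,
# then print the final status.
#   $1 = pool name
#   $2 = seconds between checks (optional, defaults to 60)
wait_for_resilver() {
    pool=$1
    interval=${2:-60}
    while zpool status "$pool" | resilver_in_progress; do
        sleep "$interval"
    done
    zpool status "$pool"
}
```

It would be called as: wait_for_resilver testpool 300. Newer OpenZFS releases also provide 'zpool wait -t resilver testpool', which blocks until the resilver ends; check whether your version has it before relying on it.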

Physically Remove the Old Drive

Physically remove the old drive.

  • If it is hot-swappable then just pull it out.
  • Otherwise, shut down the system before removing the device.

Potential Issues

If the bad disk has already been removed from the system you might not be able to specify it by ID.

  • If this is the case try specifying it by device name or by GUID:
zdb               # Find GUID.
zdb -l /dev/sda1  # In case the 'zdb' command does not work.
zpool status -g   # Find GUID.
zpool status -L   # Find device name, resolving links.

NOTE: If zdb does not output anything, try specifying the device.
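Once the GUID of the missing disk is known, it can be given to 'zpool replace' in place of the old device name. The sketch below only builds and prints the command so it can be reviewed before it is run. The GUID and the new disk path are placeholders, not values from this system.

```shell
# Placeholders: substitute the GUID reported by 'zpool status -g'
# and the path of the new disk.
pool="testpool"
old_guid="1234567890123456789"                  # placeholder, not a real GUID
new_disk="/dev/disk/by-id/ata-NEW_DISK_SERIAL"  # placeholder path

# Old device (by GUID) first, new device second.
replace_cmd="zpool replace $pool $old_guid $new_disk"

# Print the command for review instead of executing it.
echo "$replace_cmd"
```

After checking the printed line, run it as root to start the replacement.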


zfs/troubleshooting/replace_a_disk.1634166894.txt.gz · Last modified: 2021/10/13 23:14 by peter
