ZFS - Troubleshooting - Replace a Disk
Check the Pool
Verify that a disk is bad and that it needs to be replaced.
zpool status
returns:
  pool: testpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 2h4m with 0 errors on Sun Jun 9 00:28:24 2013
config:

        NAME                         STATE     READ WRITE CKSUM
        testpool                     DEGRADED     0     0     0
          raidz1-0                   DEGRADED     0     0     0
            ata-ST3300620A_5QF0MJFP  ONLINE       0     0     0
            ata-ST3300831A_5NF0552X  UNAVAIL      0     0     0
            ata-ST3200822A_5LJ1CHMS  ONLINE       0     0     0
            ata-ST3200822A_3LJ0189C  ONLINE       0     0     0

errors: No known data errors
NOTE: This shows that one disk is unavailable.
- This is ata-ST3300831A_5NF0552X.
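On a pool with many disks, the affected device can be picked out by filtering the config table for rows whose STATE is not ONLINE. The following is a minimal sketch, not a definitive tool: the function name list_unhealthy is hypothetical, and it assumes the usual NAME STATE READ WRITE CKSUM column layout shown above.

```shell
# list_unhealthy: read `zpool status` output on stdin and print the
# name and state of every config row whose STATE is not ONLINE.
# (Hypothetical helper; assumes the NAME STATE READ WRITE CKSUM layout.)
list_unhealthy() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }  # header row starts the table
        in_config && NF == 0          { in_config = 0 }        # blank line ends it
        in_config && $1 == "errors:"  { in_config = 0 }
        in_config && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Usage: zpool status testpool | list_unhealthy
```

Because the pool and the raidz vdev are themselves DEGRADED, they are listed along with the failed disk.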
Add a New Disk
- Add a new disk.
- Optionally remove the old disk.
NOTE: The new disk is ata-ST3500320AS_9QM03ATQ.
- This can be seen at /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ.
- Only remove the old drive at this point if it is a redundant setup.
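Before running any zpool command against the new disk, it helps to confirm that its by-id link exists and to see which kernel device it points to. A small sketch, assuming a Linux system with udev-managed /dev/disk/by-id links and a readlink that supports -f; the function name and its optional second argument (only there so the check can be pointed at another directory) are hypothetical.

```shell
# disk_present ID [DIR]: succeed if DIR/ID exists (DIR defaults to
# /dev/disk/by-id) and print the device the link resolves to.
disk_present() {
    id=$1
    dir=${2:-/dev/disk/by-id}
    if [ -e "$dir/$id" ]; then
        # readlink -f follows the symlink to the real device node, e.g. /dev/sdd
        readlink -f "$dir/$id"
    else
        echo "not found: $dir/$id" >&2
        return 1
    fi
}

# Usage: disk_present ata-ST3500320AS_9QM03ATQ
```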
Replace the Old Device
zpool replace testpool c1t1d0 c2t0d0
zpool offline testpool c1t1d0
zpool remove testpool c1t1d0
NOTE: Here the old device is specified first followed by the new device.
- If the pool is a redundant configuration, data will be copied from other good disks to the new disk.
- If the pool is not redundant, data will be copied from the old device to the new device.
- Once that is complete, the old device can be physically removed.
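Applied to the disks identified earlier on this page, the replace step might look like the sketch below. It only prints the command, so the argument order can be reviewed before anything is run. The helper name is hypothetical, and giving the new disk as a full /dev/disk/by-id path is an assumption about the setup rather than a requirement.

```shell
# print_replace_cmd POOL OLD NEW: print (do not run) the replace command,
# old device first, new device second.
print_replace_cmd() {
    if [ "$#" -ne 3 ]; then
        echo "usage: print_replace_cmd POOL OLD_DEVICE NEW_DEVICE" >&2
        return 2
    fi
    printf 'zpool replace %s %s %s\n' "$1" "$2" "$3"
}

print_replace_cmd testpool \
    ata-ST3300831A_5NF0552X \
    /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ
```

Once the printed line looks right, it can be copied to a root shell and run as-is.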
Potential Issues
If the bad disk has already been removed from the system, you might not be able to specify it by ID.
- If this is the case, try specifying it by device name or by GUID:
zdb                 # Find GUID.
zdb -l /dev/sda1    # In case the 'zdb' command does not work.
zpool status -g     # Find GUID.
zpool status -L     # Find device name, resolving links.
- If zdb does not output anything, try specifying the device.
NOTE: If the old disk has already been removed from the system and a new device has taken its place under the same device name, the following commands can be used instead:
zpool offline testpool sdd
zpool remove testpool sdd
zpool attach -f testpool sdc sdd
Wait For Resilvering to Complete
Before the pool returns to normal, it needs to sync data over to the new disk.
- It will remain in a degraded status while the data syncs.
- This data syncing process is called resilvering.
- It may take a very long time depending on the size of the disks and on how much data is on them.
The status of the resilvering can be checked:
zpool status testpool
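For long resilvers it can be convenient to poll rather than re-run the command by hand. A minimal sketch, assuming the scan line of zpool status contains the phrase "resilver in progress" while copying is under way; the function names and the 60-second interval are arbitrary choices.

```shell
# resilver_in_progress: read `zpool status` output on stdin; succeed
# while the scan line reports an active resilver.
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# wait_for_resilver POOL: poll once a minute until the resilver is done,
# then print the final status.
wait_for_resilver() {
    while zpool status "$1" | resilver_in_progress; do
        sleep 60
    done
    zpool status "$1"
}

# Usage: wait_for_resilver testpool
```

The loop only reports that the resilver is no longer running, so the final status output should still be checked for errors.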
Physically Remove the Old Drive
Physically remove the old drive.
- If it is hot-swappable then just pull it out.
- Otherwise, shut down the system before removing the device.
zfs/troubleshooting/replace_a_disk.1634168086.txt.gz · Last modified: 2021/10/13 23:34 by peter