
zfs:troubleshooting:replace_a_disk

Differences

This shows you the differences between two versions of the page.

zfs:troubleshooting:replace_a_disk [2021/10/13 23:24] – [Add a New Disk] peter
zfs:troubleshooting:replace_a_disk [2021/10/13 23:58] (current) – [Replace the Old Device] peter
Line 59: Line 59:
  
 <code bash> <code bash>
-zpool replace testpool c1t1d0 c2t0d0 +zpool replace testpool /dev/disk/by-id/ata-ST3300831A_5NF0552X /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ 
-zpool offline testpool c1t1d0 +zpool offline testpool /dev/disk/by-id/ata-ST3300831A_5NF0552X 
-zpool remove testpool c1t1d0 +zpool detach testpool /dev/disk/by-id/ata-ST3300831A_5NF0552X
 </code> </code>
  
Line 69: Line 69:
   * If the pool is a redundant configuration, data will be copied from other good disks to the new disk.   * If the pool is a redundant configuration, data will be copied from other good disks to the new disk.
   * If the pool is not redundant, data will be copied from the old device to the new device.   * If the pool is not redundant, data will be copied from the old device to the new device.
- 
-  * The old drive should also become detached. 
  
   * Once that is complete, the old device can be physically removed.   * Once that is complete, the old device can be physically removed.
Line 76: Line 74:
 </WRAP> </WRAP>
  
 +<WRAP important>
 +==== Potential Issues ====
 +
 +If the bad device has already been removed from the system, this might fail with the following error.
 +
 +<code bash>
 +cannot offline /dev/disk/by-id/ata-ST3300831A_5NF0552X: no such device in pool
 +</code>
 +
 +  * This is because the label of the drive that died does not exist in the system any more.
 +  * Therefore the bad device cannot be specified by ID.
 +  * If this is the case, try specifying it by device name or by GUID.
 +
 +----
 +
 +There are various ways to determine a GUID:
 +
 +<code bash>
 +zdb               # Find GUID.
 +zdb -l /dev/sda1  # In case the 'zdb' command does not work.
 +zpool status -g   # Find GUID.
 +zpool status -L   # Find device name, resolving links.
 +</code>
 +
 +----
 +
 +Try to get the GUID using zdb:
 +
 +<code bash>
 +zdb
 +testpool:
 +    version: 28
 +    name: 'testpool'
 +    state: 0
 +    txg: 162804
 +    pool_guid: 14829240649900366534
 +    hostname: 'BigMamba'
 +    vdev_children: 1
 +    vdev_tree:
 +        type: 'root'
 +        id: 0
 +        guid: 14829240649900366534
 +        children[0]:
 +            type: 'raidz'
 +            id: 0
 +            guid: 5355850150368902284
 +            nparity: 1
 +            metaslab_array: 31
 +            metaslab_shift: 32
 +            ashift: 9
 +            asize: 791588896768
 +            is_log: 0
 +            create_txg: 4
 +            children[0]:
 +                type: 'disk'
 +                id: 0
 +                guid: 11426107064765252810
 +                path: '/dev/disk/by-id/ata-ST3300620A_5QF0MJFP-part2'
 +                phys_path: '/dev/gptid/73b31683-537f-11e2-bad7-50465d4eb8b0'
 +                whole_disk: 1
 +                create_txg: 4
 +            children[1]:
 +                type: 'disk'
 +                id: 1
 +                guid: 15935140517898495532
 +                path: '/dev/disk/by-id/ata-ST3300831A_5NF0552X-part2'
 +                phys_path: '/dev/gptid/746c949a-537f-11e2-bad7-50465d4eb8b0'
 +                whole_disk: 1
 +                create_txg: 4
 +            children[2]:
 +                type: 'disk'
 +                id: 2
 +                guid: 7183706725091321492
 +                path: '/dev/disk/by-id/ata-ST3200822A_5LJ1CHMS-part2'
 +                phys_path: '/dev/gptid/7541115a-537f-11e2-bad7-50465d4eb8b0'
 +                whole_disk: 1
 +                create_txg: 4
 +            children[3]:
 +                type: 'disk'
 +                id: 3
 +                guid: 17196042497722925662
 +                path: '/dev/disk/by-id/ata-ST3200822A_3LJ0189C-part2'
 +                phys_path: '/dev/gptid/760a94ee-537f-11e2-bad7-50465d4eb8b0'
 +                whole_disk: 1
 +                create_txg: 4
 +    features_for_read:
 +</code>
 +
 +<WRAP info>
 +**NOTE:**  The failed device, ''ata-ST3300831A_5NF0552X'', is ''children[1]'' in the output above, so its GUID is 15935140517898495532.
 +</WRAP>
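
Reading the GUID out of a long ''zdb'' listing by eye is error-prone. The following is a minimal sketch, not part of the original procedure, that picks it out automatically. The function name ''guid_for_disk'' is hypothetical, and the sketch assumes the layout shown above, where each ''guid:'' line precedes the ''path:'' line of the same child.

```bash
#!/usr/bin/env bash
# guid_for_disk ID  (hypothetical helper, not a ZFS command)
# Reads zdb-style output on stdin and prints the guid of the first vdev
# whose path contains ID.
# Assumption: within each child, the "guid:" line comes before "path:".
guid_for_disk() {
    awk -v id="$1" '
        $1 == "guid:" { guid = $2 }                        # remember the latest guid seen
        $1 == "path:" && index($2, id) { print guid; exit } # path matches: report that guid
    '
}

# Intended use (needs a real pool, so it is left commented out here):
#   zdb | guid_for_disk ata-ST3300831A_5NF0552X
```

If nothing is printed, no ''path:'' line matched the given ID, and the GUID has to be found by one of the other methods listed above.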
 +
 +Use the GUID to offline the old device:
 +
 +<code bash>
 +zpool offline testpool 15935140517898495532
 +</code>
 +
 +Check that this has worked:
 +
 +<code bash>
 +zpool status
 +  pool: testpool
 + state: DEGRADED
 +status: One or more devices has been taken offline by the administrator.
 +        Sufficient replicas exist for the pool to continue functioning in a
 +        degraded state.
 +action: Online the device using 'zpool online' or replace the device with
 +        'zpool replace'.
 +  scan: scrub repaired 0 in 2h4m with 0 errors on Sun Jun  9 00:28:24 2013
 +config:
 +
 +        NAME                         STATE     READ WRITE CKSUM
 +        testpool                     DEGRADED             0
 +          raidz1-0                   DEGRADED             0
 +            ata-ST3300620A_5QF0MJFP  ONLINE               0
 +            ata-ST3300831A_5NF0552X  OFFLINE      0         0
 +            ata-ST3200822A_5LJ1CHMS  ONLINE               0
 +            ata-ST3200822A_3LJ0189C  ONLINE               0
 +
 +errors: No known data errors
 +</code>
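
When this check is scripted, the state of a single device can be read from the ''zpool status'' output rather than inspected by eye. This is a rough sketch only: ''device_state'' is a hypothetical helper, and it assumes the config table layout shown above, with the device name in the first column and its state in the second.

```bash
#!/usr/bin/env bash
# device_state NAME  (hypothetical helper, not a ZFS command)
# Reads 'zpool status' output on stdin and prints the STATE column of the
# first config line whose first column is exactly NAME.
device_state() {
    awk -v dev="$1" '$1 == dev { print $2; exit }'
}

# Intended use (needs a real pool, so it is left commented out here):
#   zpool status testpool | device_state ata-ST3300831A_5NF0552X
```

For the output above, the old device would be expected to report ''OFFLINE'' before the replace is attempted.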
 +
 +Then replace the old device with the new one:
 +
 +<code bash>
 +zpool replace testpool 15935140517898495532 /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ
 +</code>
 +
 +Check again that this has worked:
 +
 +<code bash>
 +zpool status
 +  pool: testpool
 + state: DEGRADED
 +status: One or more devices is currently being resilvered.  The pool will
 +        continue to function, possibly in a degraded state.
 +action: Wait for the resilver to complete.
 +  scan: resilver in progress since Sun Jun  9 01:44:36 2013
 +    408M scanned out of 419G at 20,4M/s, 5h50m to go
 +    101M resilvered, 0,10% done
 +config:
 +
 +        NAME                            STATE     READ WRITE CKSUM
 +        testpool                        DEGRADED             0
 +          raidz1-0                      DEGRADED             0
 +            ata-ST3300620A_5QF0MJFP     ONLINE               0
 +            replacing-1                 OFFLINE      0         0
 +              ata-ST3300831A_5NF0552X   OFFLINE      0         0
 +              ata-ST3500320AS_9QM03ATQ  ONLINE                (resilvering)
 +            ata-ST3200822A_5LJ1CHMS     ONLINE               0
 +            ata-ST3200822A_3LJ0189C     ONLINE               0
 +
 +errors: No known data errors
 +</code>
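
Because the old device should only be physically removed once the resilver is complete, it can be handy to pull the progress figure out of the status output. The sketch below is an illustration under assumptions: ''resilver_percent'' is a hypothetical helper, and it relies on the ''resilvered, N% done'' line format shown above, which may differ between ZFS versions and locales.

```bash
#!/usr/bin/env bash
# resilver_percent  (hypothetical helper, not a ZFS command)
# Reads 'zpool status' output on stdin and prints the percentage field from
# the "... resilvered, N% done" line, exactly as it appears in the output.
# Assumption: the progress line contains "resilvered," followed by a field
# ending in "%".
resilver_percent() {
    awk '/resilvered,/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /%$/) { print $i; exit }
    }'
}

# Intended use (needs a real pool, so it is left commented out here):
#   zpool status testpool | resilver_percent
```

The value is printed verbatim, so a locale that uses a decimal comma, as in the output above, keeps the comma.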
 +
 +</WRAP>
  
 <WRAP info> <WRAP info>
Line 113: Line 268:
   * If it is hot-swappable then just pull it out.   * If it is hot-swappable then just pull it out.
   * Otherwise, shut down the system before removing the device.   * Otherwise, shut down the system before removing the device.
- 
----- 
- 
-===== Potential Issues ===== 
- 
-If the bad disk has already been removed from the system you might not be able to specify it by ID. 
-  * If this is the case try specifying it by device name or by GUID: 
- 
-<code bash> 
-zdb               # Find GUID. 
-zdb -l /dev/sda1  # In case the 'zdb' command does not work. 
-zpool status -g   # Find GUID. 
-zpool status -L   # Find device name, resolving links. 
-</code> 
- 
-<WRAP info> 
-**NOTE:**  If zdb does not output anything, try specifying the device. 
-</WRAP> 
  
 ---- ----
zfs/troubleshooting/replace_a_disk.1634167440.txt.gz · Last modified: 2021/10/13 23:24 by peter