Checking ashift on existing pools

Today I found: Checking ashift on existing pools. In summary:

# zpool get all | grep ashift
# zpool get all | less
# zdb -C | grep ashift
# zdb -C | less
# zdb -U /etc/zfs/zpool.cache | less

Per ZFS 101—Understanding ZFS storage and performance, you *really* want to make sure your ashift value matches your disk’s sector size. ashift is the base-2 logarithm of the sector size in bytes: ashift=9 for 512 and ashift=12 for 4096. I’ve heard some SSDs use 8K sectors (which would be ashift=13), but I haven’t been able to confirm that for my own disks.
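
Since ashift is just the power of two that gives the sector size, the mapping is easy to sketch in shell. The function name `ashift_for` below is my own invention for illustration, not a ZFS tool, and it assumes the input is a power of two.

```shell
# Hypothetical helper: print the ashift that corresponds to a sector size.
# Assumes the size is a power of two (512, 4096, 8192, ...).
ashift_for() (
  size=$1
  bits=0
  # Halve the size until it reaches 1, counting the halvings (integer log2).
  while [ "$size" -gt 1 ]; do
    size=$((size / 2))
    bits=$((bits + 1))
  done
  echo "$bits"
)

ashift_for 512    # prints 9
ashift_for 4096   # prints 12
ashift_for 8192   # prints 13
```

To see what the disks themselves report, `lsblk -d -o NAME,PHY-SEC,LOG-SEC` lists physical and logical sector sizes, though some drives are known to report 512 even when they use larger sectors internally.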

Replace a disk in a ZFS pool

So smartd is suddenly emailing me about problems with one of the disks in my ZFS zpool. I have ordered a replacement disk and am waiting for it to arrive. While the smartd email says there is a problem, `zpool status` says everything is fine, so I’m running a `zpool scrub` to see if ZFS can pick up on the disk errors.

Preparing for the disk replacement, I searched the web and found Replace a disk in a ZFS pool.

I found the serial number of the faulty disk with `lsblk -I 8 -d -o NAME,SIZE,SERIAL`. The process is then:

  1. Shut down the server
  2. Replace the faulty disk
  3. Boot the server
  4. Run zpool replace: `sudo zpool replace data sdc`
  5. Check zpool status: `sudo zpool status data`
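
Before pulling a disk it helps to double-check which device name belongs to the serial number printed on the drive label. Here is a minimal sketch that looks it up from the `lsblk` output; the function name `device_for_serial` and the sample serial numbers are made up for illustration.

```shell
# Hypothetical helper: given a serial number as $1 and the output of
# `lsblk -I 8 -d -o NAME,SIZE,SERIAL` on stdin, print the matching device name.
device_for_serial() {
  # NR > 1 skips the header row; appending "" forces a string comparison
  # so that all-digit serials are not compared as numbers.
  awk -v serial="$1" 'NR > 1 && ($3 "") == serial { print $1 }'
}

# Demo on made-up sample output (these serials are NOT from a real system).
printf '%s\n' \
  'NAME SIZE SERIAL' \
  'sda  2.7T EXAMPLE0001' \
  'sdc  2.7T EXAMPLE0003' |
  device_for_serial EXAMPLE0003   # prints sdc

# On a real machine this would be:
#   lsblk -I 8 -d -o NAME,SIZE,SERIAL | device_for_serial "<serial from the label>"
```

Device names like sdc can change between boots, so it is worth running the lookup again after step 3 rather than trusting the name from before the shutdown.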

I hope it turns out to be that easy! Now I just wait for my scrub to complete and my disk to arrive.

Update

Click through on the `see:` link in the output below for some excellent documentation about how to handle this error:

Every 2.0s: zpool status                                              love: Fri Apr 30 07:52:40 2021

  pool: data
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub in progress since Thu Apr 29 02:30:54 2021
        4.32T scanned out of 5.03T at 42.9M/s, 4h48m to go
        466K repaired, 85.93% done
config:

        NAME         STATE     READ WRITE CKSUM
        data         ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            sda      ONLINE       0     0     0
            sdb      ONLINE       0     0     0
          mirror-1   ONLINE       0     0     0
            sdc      ONLINE       0     0     4  (repairing)
            sdd      ONLINE       0     0     0
        cache
          nvme0n1p4  ONLINE       0     0     0

errors: No known data errors
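
For keeping an eye on this without reading the table by eye, here is a rough sketch that pulls out every row of the config table whose READ, WRITE or CKSUM counter is not 0. The function name `devices_with_errors` is my own, and the parsing assumes the column layout shown above, which may differ between ZFS versions.

```shell
# Hypothetical helper: read `zpool status` output on stdin and print the NAME
# of every config row with a non-zero READ, WRITE or CKSUM counter.
devices_with_errors() {
  awk '
    # The column header marks the start of the config table.
    $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
    # The "errors:" line marks the end of it.
    /^errors:/ { in_config = 0 }
    # Rows with counters have at least five fields; compare as strings so
    # abbreviated counts such as 1.2K also count as non-zero.
    in_config && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") { print $1 }
  '
}

# Demo on an excerpt of the output above.
printf '%s\n' \
  '        NAME         STATE     READ WRITE CKSUM' \
  '          mirror-1   ONLINE       0     0     0' \
  '            sdc      ONLINE       0     0     4  (repairing)' \
  '            sdd      ONLINE       0     0     0' \
  'errors: No known data errors' |
  devices_with_errors   # prints sdc

# On a real machine this would be: zpool status data | devices_with_errors
```

Pool and mirror rows are printed too if their own counters are non-zero, so the output can contain more than just leaf disks.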