The ZFS verification process is called a ‘scrub’, and it is run periodically (usually every few weeks). To report on the status of a scrub in progress, use the `zpool status` command.
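As a minimal sketch of checking on a scrub: `zpool scrub data` starts one on a pool named `data`, and `zpool status data` reports its progress. The helper below wraps that check. The function name `scrub_state` is hypothetical, and it assumes the status output contains the phrase "scrub in progress" while a scrub is running, as in the output shown later on this page.

```shell
# Print "in progress" if the named pool is currently being scrubbed,
# otherwise print "idle". Assumes `zpool status` reports a running scrub
# with the phrase "scrub in progress" on its scan line.
scrub_state() {
    if zpool status "$1" | grep -q 'scrub in progress'; then
        echo "in progress"
    else
        echo "idle"
    fi
}

# Example (requires a real pool, so left commented out):
#   sudo zpool scrub data
#   scrub_state data
```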
Tag Archives: zfs
ZFS deduplication
Checking ashift on existing pools
Today I found: Checking ashift on existing pools. In summary:
    # zpool get all | grep ashift
    # zpool get all | less
    # zdb -C | grep ashift
    # zdb -C | less
    # zdb -U /etc/zfs/zpool.cache | less
Per ZFS 101—Understanding ZFS storage and performance, you *really* want to make sure your ashift value is aligned with your disk’s sector size. ashift is the base-2 logarithm of the sector size in bytes: ashift=9 for 512-byte sectors; ashift=12 for 4096-byte sectors. I’ve heard some SSDs can be 8K (which would be ashift=13), but I haven’t been able to confirm for my own disks.
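The ashift values above follow from ashift being the base-2 logarithm of the sector size. Here is a small sketch of that arithmetic; the function name `ashift_for_sector_size` is my own, not part of ZFS. To see what sector sizes a disk reports, `lsblk -d -o NAME,PHY-SEC,LOG-SEC` should help, though some disks are known to misreport their physical sector size.

```shell
# Print the ashift (log2 of the sector size) for a sector size in bytes.
# Fails with a message on stderr if the size is not a power of two.
ashift_for_sector_size() {
    local size="$1"
    local n ashift=0

    # Reject anything that is not a plain positive integer.
    case "$size" in
        ''|*[!0-9]*) echo "not a positive integer: $size" >&2; return 1 ;;
    esac
    [ "$size" -gt 0 ] || { echo "not a positive integer: $size" >&2; return 1; }

    # Halve repeatedly, counting the steps; any odd value above 1
    # means the size was not a power of two.
    n="$size"
    while [ "$n" -gt 1 ]; do
        if [ $((n % 2)) -ne 0 ]; then
            echo "not a power of two: $size" >&2
            return 1
        fi
        n=$((n / 2))
        ashift=$((ashift + 1))
    done
    echo "$ashift"
}

ashift_for_sector_size 512    # prints 9
ashift_for_sector_size 4096   # prints 12
ashift_for_sector_size 8192   # prints 13
```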
drop_caches
If the ZFS ARC goes nuts with e.g. lots of arc_prune processes, try this:
# echo 3 > /proc/sys/vm/drop_caches
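To see whether dropping caches actually shrank the ARC, one option is to read its size before and after. This is only a sketch: it assumes ZFS on Linux, where the ARC statistics live in `/proc/spl/kstat/zfs/arcstats` with the current size on the `size` line, and the function name `arc_size_bytes` is hypothetical.

```shell
# Print the current ARC size in bytes.
# Reads /proc/spl/kstat/zfs/arcstats by default; an alternative file
# can be passed as the first argument. Each line of that file has the
# form "name type data", and the "size" row holds the ARC size.
arc_size_bytes() {
    awk '$1 == "size" { print $3 }' "${1:-/proc/spl/kstat/zfs/arcstats}"
}

# Example (as root, on a machine with ZFS loaded):
#   arc_size_bytes
#   echo 3 > /proc/sys/vm/drop_caches
#   arc_size_bytes
```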
Thanks to Xe on #lobsters.
Replace a disk in a ZFS pool
So smartd is suddenly emailing me about problems with one of the disks in my ZFS zpool. I have ordered a replacement disk and am waiting for it to arrive. While the smartd email says there is a problem, `zpool status` says everything is fine. So I’m running a `zpool scrub` to see if ZFS can pick up on the disk errors.
Preparing for the disk replacement I searched the web and found Replace a disk in a ZFS pool.
I found the serial number of the faulty disk with `lsblk -I 8 -d -o NAME,SIZE,SERIAL`. The process is then:
- Shut down the server
- Replace the faulty disk
- Boot the server
- Run zpool replace: `sudo zpool replace data sdc`
- Check zpool status: `sudo zpool status data`
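Device names like `sdc` can change across reboots, so before running the replace it may be worth confirming which name currently belongs to the serial number you noted. A rough sketch, reusing the `lsblk` options from above plus `-n` to drop the header; the function name `disk_name_for_serial` is hypothetical:

```shell
# Print the kernel device name (e.g. sdc) of the SCSI/SATA disk whose
# serial number matches the first argument. Prints nothing if no disk
# matches. Assumes serial numbers contain no whitespace.
disk_name_for_serial() {
    lsblk -I 8 -d -n -o NAME,SERIAL | awk -v s="$1" '$2 == s { print $1 }'
}

# Example (the serial here is a placeholder, not a real one):
#   disk_name_for_serial "YOUR-DISK-SERIAL"
#   sudo zpool replace data sdc
#   sudo zpool status data
```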
I hope it turns out to be that easy! Now I just wait for my scrub to complete and my disk to arrive.
Update
Click through on the link below for some excellent documentation about how to handle this error:
    Every 2.0s: zpool status                love: Fri Apr 30 07:52:40 2021

      pool: data
     state: ONLINE
    status: One or more devices has experienced an unrecoverable error.  An
            attempt was made to correct the error.  Applications are unaffected.
    action: Determine if the device needs to be replaced, and clear the errors
            using 'zpool clear' or replace the device with 'zpool replace'.
       see: http://zfsonlinux.org/msg/ZFS-8000-9P
      scan: scrub in progress since Thu Apr 29 02:30:54 2021
            4.32T scanned out of 5.03T at 42.9M/s, 4h48m to go
            466K repaired, 85.93% done
    config:

            NAME         STATE     READ WRITE CKSUM
            data         ONLINE       0     0     0
              mirror-0   ONLINE       0     0     0
                sda      ONLINE       0     0     0
                sdb      ONLINE       0     0     0
              mirror-1   ONLINE       0     0     0
                sdc      ONLINE       0     0     4  (repairing)
                sdd      ONLINE       0     0     0
            cache
              nvme0n1p4  ONLINE       0     0     0

    errors: No known data errors
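In the output above, the interesting part is the non-zero CKSUM count against `sdc`. As a sketch of picking such rows out automatically, the filter below reads `zpool status` output on stdin and prints every device whose READ, WRITE or CKSUM column is not `0`. The function name `devices_with_errors` is hypothetical, and the list of state names it matches on is my assumption about what the STATE column can contain.

```shell
# Read `zpool status` output on stdin and print the name of each row in
# the config table whose READ, WRITE or CKSUM counter is not "0".
# Rows are recognised by a known state name in the second column, which
# skips the table header and the status/scan text.
devices_with_errors() {
    awk '
        NF >= 5 &&
        $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
        ($3 != "0" || $4 != "0" || $5 != "0") { print $1 }
    '
}

# Example (requires a real pool, so left commented out):
#   zpool status data | devices_with_errors
```

The counters are compared as strings rather than numbers because `zpool status` may abbreviate large counts with a suffix.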
ZFS performance tuning
I read The Next Gen Database Servers Powering Let’s Encrypt which mentioned how they tuned ZFS.