ZFS: Setting up ZFS storage on Ubuntu
If you are new to ZFS, I would advise doing a little research first to understand the fundamentals. Jim Salter’s articles on storage and ZFS are highly recommended.
The examples below create a pool from a single disk, with separate datasets used for network backups.
In some examples, I might use device names for simplicity, but you are advised to use disk IDs or serials.
Ubuntu makes it very easy.
# apt install zfsutils-linux
ZFS Cockpit module
If Cockpit is installed, it is possible to install a module for ZFS. This module is sadly no longer in development. If you know of alternatives, please share!
$ git clone https://github.com/optimans/cockpit-zfs-manager.git
[...]
# cp -r cockpit-zfs-manager/zfs /usr/share/cockpit
Configuring automatic snapshots
This service takes automatic snapshots on a schedule (hourly, daily, weekly and monthly), and the number of snapshots it keeps for each period is configurable.
# apt install zfs-auto-snapshot
The snapshot retention is set in the following files:
/etc/cron.hourly/zfs-auto-snapshot
/etc/cron.daily/zfs-auto-snapshot
/etc/cron.weekly/zfs-auto-snapshot
/etc/cron.monthly/zfs-auto-snapshot
By default, the configuration runs the following snapshots and retention policies:
I configured the following snapshot retention policy:
# vim /etc/cron.hourly/zfs-auto-snapshot
#!/bin/sh
# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0
exec zfs-auto-snapshot --quiet --syslog --label=hourly --keep=48 //
# vim /etc/cron.daily/zfs-auto-snapshot
#!/bin/sh
# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0
exec zfs-auto-snapshot --quiet --syslog --label=daily --keep=14 //
# vim /etc/cron.weekly/zfs-auto-snapshot
#!/bin/sh
# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0
exec zfs-auto-snapshot --quiet --syslog --label=weekly --keep=4 //
# vim /etc/cron.monthly/zfs-auto-snapshot
#!/bin/sh
# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0
exec zfs-auto-snapshot --quiet --syslog --label=monthly --keep=3 //
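With these values, the hourly snapshots cover 2 days, the daily ones 2 weeks, the weekly ones about a month, and the monthly ones 3 months. To double-check what is actually configured, a small helper along these lines could read the --keep value back from each cron script. This is only a sketch: show_retention and its optional ROOT argument are hypothetical names introduced here, not part of zfs-auto-snapshot.

```shell
#!/bin/sh
# show_retention [ROOT]: print the --keep value found in each
# zfs-auto-snapshot cron script under ROOT/etc/cron.*/ .
# ROOT is a hypothetical parameter so the function can be pointed at a
# copy of the files; when omitted, the real /etc is read.
show_retention() {
    root=${1:-}
    for period in hourly daily weekly monthly; do
        file="$root/etc/cron.$period/zfs-auto-snapshot"
        # Skip periods whose script is missing or unreadable
        [ -r "$file" ] || continue
        # Extract the number that follows --keep= (first match only)
        keep=$(sed -n 's/.*--keep=\([0-9][0-9]*\).*/\1/p' "$file" | head -n 1)
        echo "$period keep=${keep:-unset}"
    done
}

show_retention
```

The snapshots themselves can be listed with zfs list -t snapshot once the first cron runs have completed.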
Setting up the ZFS pool
This post has several use cases and examples, and I recommend it highly if you want further details on different commands and ways to configure your pools.
In my example there is no resilience, as there is only one attached disk. For me, this is acceptable because I have an additional local backup besides this filesystem.
A second backup (ideally off-site) is preferable to a single one, regardless of any resilience you add to the pool.
I create a single pool with an external drive. Read below for an explanation of the different command flags.
# zpool create -f \
    -o ashift=12 \
    -O compression=lz4 \
    -O acltype=posixacl \
    -O xattr=sa \
    -O relatime=on \
    -O atime=off \
    -O normalization=formD \
    -O canmount=off \
    -O dnodesize=auto \
    -O sync=standard \
    backup_pool scsi-SSeagate_Desktop_NA7HP4VK
Block size / ashift
Of the above values, the most important one by far is ashift.
The ashift property sets the block size of the vdev. It can’t be changed once set, and if it isn’t correct, it will cause massive performance issues with the filesystem.
Find out your drive’s optimal block size and match it to ashift.
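It is expressed as a power-of-two exponent rather than in bytes: ashift=12 means 2^12 = 4096 bytes (4K sectors), and ashift=9 means 512 bytes.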
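As a rough illustration of that relationship, the sketch below converts a sector size in bytes into the matching ashift value. The function name ashift_for is made up for this example; the sector size itself has to come from the drive's specifications or from a tool such as lsblk -o NAME,PHY-SEC,LOG-SEC.

```shell
#!/bin/sh
# ashift_for SECTOR_BYTES: print n such that 2^n = SECTOR_BYTES.
# Fails if the argument is not a positive power of two.
ashift_for() {
    size=$1
    exponent=0
    # Reject empty or non-numeric input
    case $size in
        ''|*[!0-9]*) echo "not a positive integer: $1" >&2; return 1 ;;
    esac
    [ "$size" -gt 0 ] || { echo "not a positive integer: $1" >&2; return 1; }
    # Halve the value until it reaches 1, counting the steps
    while [ "$size" -gt 1 ]; do
        # An odd value greater than 1 cannot be a power of two
        if [ $((size % 2)) -ne 0 ]; then
            echo "not a power of two: $1" >&2
            return 1
        fi
        size=$((size / 2))
        exponent=$((exponent + 1))
    done
    echo "$exponent"
}

ashift_for 512    # prints 9
ashift_for 4096   # prints 12
ashift_for 8192   # prints 13
```

Some drives report 512-byte sectors while using 4K sectors internally, so the reported value is worth checking against the manufacturer's data sheet before relying on it.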
recordsize is another property that affects performance, especially on the Raspberry Pi. Smaller values can improve performance for random access, while larger values give better performance and compression when reading sequential data. On the Raspberry Pi, a value of 1M made the system load climb until filesystem activity stalled and the system had to be restarted.
The default value (128k) has performed without any noticeable issue.
lz4 compression offers the best balance between performance and compression ratio. In most cases it makes the storage perform faster than with no compression at all, because less data has to be read from and written to disk.
ZFS 0.8 doesn’t give many choices regarding compression but bear in mind that you can change the algorithm on a live system.
gzip will impact performance but yields a higher compression rate. It might be worth checking the performance with different compression formats on the Pi 4. With older Raspberry Pi models, the limitation will be the USB / network in most cases.
For reference, on the same amount of data these were the compression ratios I obtained:
All in all, the performance impact and memory consumption didn’t make switching from lz4 worthwhile.
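acltype=posixacl enables POSIX ACLs, and xattr=sa stores Linux extended attributes in the inodes rather than in separate hidden files.
Disabling atime (off) is recommended to reduce the number of IOPS.
relatime offers a good compromise between the atime and noatime behaviours. Note that in ZFS relatime only takes effect when atime=on; with atime=off, as in the command above, access times are not updated at all.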
The normalization property indicates whether a file system should perform a Unicode normalisation of file names whenever two file names are compared and which normalisation algorithm should be used.
formD is the default set by Canonical when setting up a pool. It seems to be a good choice if sharing the volume via NFS with macOS systems and avoiding files not being displayed due to names using non-ASCII characters.
The pool is configured with the canmount property off so that it can’t be mounted.
This is because I will be creating separate datasets, one for Time Capsule backups, and another two for Backintime, and I don’t want them to mix.
All datasets will share the same pool, but I don’t want the pool root to be mounted. Only datasets will mount.
dnodesize is set to auto, as per several recommendations when datasets are using the xattr=sa property.
sync is set as standard. There is a performance hit for writes, but disabling it comes at the expense of data consistency if there is a power cut or similar.
A brief test showed a lower system load with sync=standard than with sync=disabled, and fewer load spikes. Write performance is probably lower, but the system is noticeably less stressed.
I am not too keen to encrypt physically secure volumes because when doing data recovery, you are adding an additional layer that might hamper and slow things down.
For reference, I am writing down an example of encryption options using an external key for a volume. This might not be appropriate for your particular scenario. Research alternatives if needed.
-O encryption=aes-256-gcm -O keylocation=file:///etc/pool_encryption_key -O keyformat=raw
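With keyformat=raw, the key file must contain exactly 32 bytes. A minimal sketch of generating such a file from /dev/urandom follows; make_raw_key is a hypothetical helper name, and the path in the comment is simply the one used in the options above.

```shell
#!/bin/sh
# make_raw_key PATH: write 32 random bytes to PATH for use with
# keyformat=raw. The subshell keeps the restrictive umask local, so the
# file is created readable and writable by its owner only.
make_raw_key() {
    keyfile=$1
    (
        umask 077
        dd if=/dev/urandom of="$keyfile" bs=32 count=1 2>/dev/null
    )
}

# Example (as root), matching the keylocation shown above:
#   make_raw_key /etc/pool_encryption_key
```

Keep a copy of the key somewhere safe and separate from the disk: without it, the encrypted data cannot be recovered.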
Automatic trimming of the pool is essential for SSDs:
# zpool set autotrim=on backup_pool
Disabling automatic mount for the pool. (This applies only to the root of the pool, the datasets can still be set to be mountable regardless of this setting.)
# zfs set canmount=off backup_pool
Setting up the ZFS datasets
I will create three separate datasets with assigned quotas for each.
[Create datasets]
# zfs create backup_pool/backintime_tuxedo
# zfs create backup_pool/backintime_ab350
# zfs create backup_pool/timecapsule
[Set mountpoints]
# zfs set mountpoint=/backups/backintime_tuxedo backup_pool/backintime_tuxedo
# zfs set mountpoint=/backups/backintime_ab350 backup_pool/backintime_ab350
# zfs set mountpoint=/backups/timecapsule backup_pool/timecapsule
[Set quotas]
# zfs set quota=2T backup_pool/backintime_tuxedo
# zfs set quota=2T backup_pool/backintime_ab350
# zfs set quota=2T backup_pool/timecapsule
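The same result can be reached in one step per dataset, because zfs create accepts properties with -o at creation time. The sketch below only prints the commands so they can be reviewed first; dataset_commands is a hypothetical helper, and it assumes every dataset mounts under /backups and shares the same quota, as in the example above.

```shell
#!/bin/sh
# dataset_commands POOL QUOTA NAME...: print one "zfs create" command per
# dataset, setting the mountpoint and quota at creation time.
# Nothing is executed; the output is meant to be reviewed first.
dataset_commands() {
    pool=$1
    quota=$2
    shift 2
    for name in "$@"; do
        echo "zfs create -o mountpoint=/backups/$name -o quota=$quota $pool/$name"
    done
}

dataset_commands backup_pool 2T backintime_tuxedo backintime_ab350 timecapsule
```

Once the printed commands look right, they can be run as root, for example by piping the output to sh.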
Changing compression on a dataset
The default lz4 compression is recommended. gzip consumes a lot of CPU and makes data transfers slower, which also slows down restoring backups.
If you still want to change the compression for a given dataset:
# zfs set compression=gzip-7 backup_pool/timecapsule
A comparison of compression and decompression using different algorithms with OpenZFS:
Querying pool properties, current compression algorithm and compression ratio
# zfs get all backup_pool
# zfs get compression backup_pool
# zfs get compressratio backup_pool
# zfs get all | grep compressratio
Changing ZFS settings
For reference, below are some examples of properties and settings that can be changed after a pool has already been created.
Renaming pools and datasets
If, for any reason, a dataset needs to be renamed, this can be done with a command like this:
# zfs rename backup_pool/Test1 backup_pool/backintime_tuxedo
A zpool can be renamed by exporting and importing it.
# zpool export test_pool
# zpool import test_pool backup_pool
Attaching mirror disks
You can attach an additional disk or partition to make the pool redundant as a mirror. Unfortunately, attaching a disk this way cannot convert the pool to RAID-Z, RAID-Z2 or RAID-Z3.
# zpool attach backup_pool /dev/sda7 /dev/sdb7
Renaming disks in pools
By default, Ubuntu uses device names (such as /dev/sda) for the disks. This should not be an issue, but in some cases, adding or reconnecting drives might change the device name order and degrade one or more pools.
This is why creating a pool with disk IDs or serials is recommended. You can still fix this if you created your pool using device names.
With the pool unmounted, export it and reimport it, pointing to the right path:
# zpool export backup_pool
# zpool import -d /dev/disk/by-id/ backup_pool
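To see which persistent ID belongs to which device name before reimporting, the symlinks under /dev/disk/by-id can be resolved. The following is only a sketch: disk_ids and its optional directory argument are hypothetical, and it relies on readlink -f as shipped with Ubuntu.

```shell
#!/bin/sh
# disk_ids [DIR]: print "id -> device" for every symlink in DIR.
# DIR defaults to /dev/disk/by-id; the argument exists only so the
# function can be pointed at another directory.
disk_ids() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob)
        [ -L "$link" ] || continue
        echo "$(basename "$link") -> $(readlink -f "$link")"
    done
}

disk_ids
```

After the reimport, zpool status should list the pool members by their IDs instead of their device names.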
There are additional examples in this handy blog post:
ZFS should be running on a system with at least 4GiB of RAM. If you plan to use it on a Raspberry Pi (or any other system with limited resources), reduce the ARC size.
In this case, I am limiting it to 3GiB. It is a change that can be done live:
# echo 3221225472 > /sys/module/zfs/parameters/zfs_arc_max
To make it persistent between boots:
# vim /etc/modprobe.d/zfs.conf
[add this line]
options zfs zfs_arc_max=3221225472
# update-initramfs -u
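The value of zfs_arc_max is given in bytes, so 3 GiB is 3 × 1024³ = 3221225472. For a different limit, the number can be worked out with shell arithmetic, as in this small sketch (gib_to_bytes and arc_max_line are names invented for the example):

```shell
#!/bin/sh
# gib_to_bytes N: convert N GiB to bytes (1 GiB = 1024^3 bytes).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# arc_max_line N: print the modprobe option line for an N GiB ARC limit.
arc_max_line() {
    echo "options zfs zfs_arc_max=$(gib_to_bytes "$1")"
}

gib_to_bytes 3    # prints 3221225472
arc_max_line 3    # prints: options zfs zfs_arc_max=3221225472
```

On a Raspberry Pi with less memory, a smaller limit such as 1 GiB (1073741824 bytes) may be more appropriate; the right value depends on what else runs on the system.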
You can check the ARC statistics:
$ less /proc/spl/kstat/zfs/arcstats
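The raw file is long, so it can help to pull out just a few counters. The sketch below reads the hits, misses, size and c_max rows (each line has the form "name type data") and prints a hit ratio and the current ARC size against its maximum. arc_summary is a hypothetical helper, and the optional file argument is there only so it can be run against a saved copy.

```shell
#!/bin/sh
# arc_summary [FILE]: summarise ARC hit ratio and size from arcstats.
# FILE defaults to /proc/spl/kstat/zfs/arcstats.
arc_summary() {
    stats=${1:-/proc/spl/kstat/zfs/arcstats}
    [ -r "$stats" ] || { echo "cannot read $stats" >&2; return 1; }
    # The third column holds the value for each named counter
    awk '
        $1 == "hits"   { hits = $3 }
        $1 == "misses" { misses = $3 }
        $1 == "size"   { size = $3 }
        $1 == "c_max"  { cmax = $3 }
        END {
            total = hits + misses
            if (total > 0)
                printf "hit ratio: %.1f%%\n", 100 * hits / total
            printf "ARC size: %d MiB of %d MiB max\n", size / 1048576, cmax / 1048576
        }
    ' "$stats"
}

# Example: arc_summary
```

If the reported maximum matches the limit set earlier, the zfs_arc_max change has been picked up.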
More on ZFS performance
Some other links with interesting points on performance: