Ubuntu 22.04 – Migrating from Firefox snap to Firefox apt

Snaps might have their advantages, but the amount of RAM and CPU the Firefox snap consumes made me want to switch. The deb version certainly feels more responsive.

Remove the Firefox snap.

# snap remove firefox

If you don't have an APT keyrings directory yet, create one, then import the Mozilla APT repository signing key and add the repository to your sources.

# install -d -m 0755 /etc/apt/keyrings

$ wget -q https://packages.mozilla.org/apt/repo-signing-key.gpg -O- | sudo tee /etc/apt/keyrings/packages.mozilla.org.asc > /dev/null

$ echo "deb [signed-by=/etc/apt/keyrings/packages.mozilla.org.asc] https://packages.mozilla.org/apt mozilla main" | sudo tee -a /etc/apt/sources.list.d/mozilla.list > /dev/null

You need to raise the priority of the Mozilla repository, otherwise Ubuntu's own firefox package takes precedence and the snap version gets re-installed.

$ echo '
Package: *
Pin: origin packages.mozilla.org
Pin-Priority: 1000
' | sudo tee /etc/apt/preferences.d/mozilla

Then install Firefox from apt.

# apt update && apt install firefox
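
Before launching it, you can confirm that the pin is taking effect: the candidate version reported should come from packages.mozilla.org rather than the Ubuntu archive.

$ apt policy firefox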

[Install a localised version if you need it or want it]

# apt install firefox-l10n-en-gb

When you launch Firefox it will now be the apt version. Remember to move/copy your profiles from the snap to the new version.

$ cp -a ~/snap/firefox/common/.mozilla/firefox/* ~/.mozilla/firefox/

Finally, launch Firefox and point to the right profile (and delete any you don’t want to keep).

This is best done using the Profile Manager (about:profiles):

https://support.mozilla.org/en-US/kb/profile-manager-create-remove-switch-firefox-profiles




Ubuntu: apt upgrades failing with “Cannot initiate the connection to ports.ubuntu.com”

While doing a distro upgrade with

# do-release-upgrade

I kept getting failures halfway through, stating:

Cannot initiate the connection to ports.ubuntu.com:80

The errors showed several IPv6 addresses that couldn't be reached. My router supports IPv6, but my ISP doesn't. I had somehow expected the router to handle the translation or DNS resolution between the two, but that wasn't the case.

Disabling IPv6 on the router didn't do much. I recall that some of the services running on my Ubuntu server require IPv6 to be enabled or things break, so it can't be disabled for the whole OS.

Luckily you can configure apt to only use IPv4:

# apt-get -o Acquire::ForceIPv4=true update

This refreshes the sources over IPv4, and the next time you run apt it should complete the upgrade. If not, your problem lies elsewhere.
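
To make apt prefer IPv4 permanently instead of passing the option every time, the setting can go into a configuration snippet (the file name below is arbitrary; anything under /etc/apt/apt.conf.d/ works):

# echo 'Acquire::ForceIPv4 "true";' > /etc/apt/apt.conf.d/99force-ipv4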




Ubuntu: apt error message “Key is stored in legacy trusted.gpg keyring”

After upgrading to Ubuntu 22.04, running apt shows a warning saying “Key is stored in legacy trusted.gpg keyring”:

# apt update

[..]
All packages are up-to-date.
W: https://apt.syncthing.net/dists/syncthing/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.

The key needs to be exported from the legacy keyring and re-imported into the current layout.

List the keys and find the key ID of the repository that is showing the error. In this case it is Syncthing.

# apt-key list

/etc/apt/trusted.gpg
--------------------
pub   rsa2048 2014-12-29 [SC]
      37C8 4554 E7E0 A261 54E7  6E1E D26E 6ED0 0065 5A3E
uid           [ unknown] Syncthing Release Management <release@syncthing.net>
sub   rsa2048 2014-12-29 [E]

/etc/apt/trusted.gpg.d/ubuntu-keyring-2012-cdimage.gpg
[...]

Copy the last 8 hex characters of the fingerprint (00655A3E) and export that key.

# apt-key export 00655A3E | gpg --dearmour -o /usr/share/keyrings/syncthing.gpg

Update the repository's source file, pointing it at the exported key.

# vim /etc/apt/sources.list.d/syncthing.list

deb [arch=amd64 signed-by=/usr/share/keyrings/syncthing.gpg] https://apt.syncthing.net/ syncthing stable #Syncthing

Confirm that the error message is no longer showing.

# apt update

[...]                                                          
Hit:5 https://apt.syncthing.net syncthing InRelease                                                     
[...]
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
All packages are up-to-date.

Finally, remove the old key from the legacy keyring.

# apt-key del 00655A3E
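
If you want to double-check what apt is now using, gpg can list the contents of the exported keyring file:

$ gpg --show-keys /usr/share/keyrings/syncthing.gpg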



ZFS ‘Failed to start Mark current ZSYS boot as successful’ fix

On Ubuntu 20.04 after installing the NVIDIA driver 510 metapackage the system stopped booting.

The system either hangs with a black screen and a blinking cursor in the top left, or shows the following error message:

[FAILED] Failed to start Mark current ZSYS boot as successful.
See 'systemctl status zsys-commit.service' for details.
[  OK  ] Stopped User Manager for UID 1000.

Attempting to revert from a snapshot ends up with the same error message. This wasn’t the case on another separate system that had the same upgrade.

The “20.04 zsys-commit.service fails” thread is quite interesting, and it seems that the overall cause is a version mismatch between the userland and kernel ZFS components.

These are the steps I followed to fix it. Many thanks to Lockszmith for his research in identifying the issue and finding a fix. He raised it in the posts linked below.

https://askubuntu.com/users/720005/lockszmith

https://askubuntu.com/questions/1388997/zsys-commit-service-fails-with-couldnt-promote-dataset-not-a-cloned-filesy

Fix

Restart Ubuntu and boot in recovery mode

[In GRUB]

*Advanced options for Ubuntu 20.04.3 LTS

[Select the first recovery option in the menu]
*Ubuntu 20.04.3 LTS, with Linux 5.xx.x-xx-generic (recovery mode)

[Wait for the system to load the menu and select:]
root

[Press Enter for Maintenance to get the CLI]

Check the reason for the error.

# systemctl status zsys-commit.service
[...]
 Feb 17 11:11:24 ab350 systemd[1] zsysctl[4068]: level=error msg="couldn't commit: couldn't promote dataset "rpool/ROOT/ubuntu_733qyk": couldn't promote "rpool/ROOT/ubuntu_733qyk": not a cloned filesystem"
 [...]

Attempting to promote it manually fails:

# zfs promote rpool/ROOT/ubuntu_733qyk

cannot promote 'rpool/ROOT/ubuntu_733qyk': not a cloned filesystem
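
For context, a dataset can only be promoted if it is a clone of a snapshot. You can confirm that this one isn't by checking its origin property; a value of - means it has no origin, which is exactly what the error is complaining about:

# zfs get origin rpool/ROOT/ubuntu_733qyk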

Uninstall the NVIDIA drivers.

# dkms uninstall nvidia/510.47.03
# dkms remove nvidia/510.47.03 --all

Make sure you can connect to the internet. You can temporarily assign a DHCP address to one of the network interfaces.

# dhclient -v eno1
# ip address

Update the system and install a third-party set of ZFS tools.

# apt update
# apt upgrade
# apt autoremove

[Add 3rd party PPA for zfstools]
# add-apt-repository ppa:jonathonf/zfs
# apt update 

[Upgrade ZFS]
# apt upgrade

[If ZFS isn't upgraded, do it manually]
# apt install zfs-initramfs zfs-zed zfsutils-linux

It might take a bit to update. Reboot normally.

# reboot

It should boot normally.

If this doesn’t work for you, reboot in recovery mode again and promote the filesystem manually.

# zfs promote rpool/ROOT/ubuntu_733qyk

Sort graphical drivers

Revert to NVIDIA metapackage 470 (if this is what broke your system). Reboot, and fix resolution settings.

Upgrading back to 510 will bring the error back and make it even more difficult to fix. Don’t!

Things will only work if the zfs and zfs-kmod versions match. In my case they didn't:

$ zfs --version
zfs-0.8.3-1ubuntu12.13
zfs-kmod-2.0.6-1ubuntu2
[boot in recovery mode]
# apt reinstall zfs-initramfs zfs-zed zfsutils-linux
# zfs promote rpool/ROOT/ubuntu_733qyk

[reboot in normal mode]
[Configure the 470 drivers]

Reverting to previous ZFS version

The system should now be back to normal, but you might want to revert to the mainline ZFS version despite the bug. After all, this was a hack to promote the filesystem and get things working again.

# add-apt-repository --remove ppa:jonathonf/zfs

[Check that it has been removed]
$ apt policy

# apt update

[Pray]
# apt remove zfs-initramfs zfs-zed zfsutils-linux
# apt install zfs-initramfs zfs-zed zfsutils-linux

[Check the right version is installed]
# apt list --installed | grep zfs

# apt autoremove

[Pray harder]
# reboot

With that I managed to bring my system back to a working condition, but updating the drivers a second time made it fail again and I couldn't fix it. A clean install of 20.04.3 doesn't seem to exhibit this problem. I am not sure what the reason behind it is, but there are a few open bugs with Canonical regarding it.

I hope that 22.04 will bring a better ZFS version.




Raspberry Pi: Configuring a Time Capsule/Backintime server

In this post, I am setting up a Time Capsule and Backintime server. I am using a Raspberry Pi that has Ubuntu installed, with a USB disk that has been configured into a ZFS pool.

Setting up backup users

You are going to have to create accounts for each of the services/users that will be connecting to the server. You want to keep files and access as isolated as possible: a given user shouldn't have any visibility of other users' backups. For Time Machine we are also creating accounts that can't log into the system, only authenticate.

Check if there is an entry for nologin in:

$ cat /etc/shells

If there is no entry add it:

# vim /etc/shells
# /etc/shells: valid login shells
/bin/sh
/bin/bash
/bin/rbash
/bin/dash
/usr/bin/tmux
/usr/sbin/nologin

Create a generic user for the backups, or dedicated accounts for each user to increase security:

Generic user example:

# useradd -s /usr/sbin/nologin timemachine
# passwd timemachine

Dedicated user example:

# useradd -s /usr/sbin/nologin timemachine_john
# passwd timemachine_john

Note that useradd doesn't create a home directory by default (add -m if you need one).

If required, the default shell can be changed with:

# usermod -s /usr/sbin/nologin timemachine_john

Setting up backup user groups

If more than one system is going to be backed up it is advisable to use different accounts for each.

It is possible to isolate users by assigning them individual datasets, but that might create storage silos.

An alternative is to create individual users that belong to the same backup group. The backup group can access the backintime dataset, but not each other’s data.

Create the group.

# addgroup backupusers

Assign the primary group and a secondary group (the secondary group is the shared one).

# usermod -g timemachine_john -G backupusers timemachine_john

Although not required, you could force the UID and GID to specific values.

# usermod -u 1012 timemachine
# groupmod -g 1012 timemachine
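
You can verify the resulting IDs and group membership with id:

$ id timemachine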

Time Capsule

Install netatalk

Install netatalk from the repositories.

# apt install netatalk

Give all the appropriate accounts access to the directory where the backups are going to be written:

# chown :timemachine_john /backups/timecapsule/
# chmod 775 /backups/timecapsule/

Edit the netatalk service settings so that the share is advertised with the name of your choice and works as a Time Capsule server.

# vim /etc/netatalk/AppleVolumes.default

Enter the following:

/backups/timecapsule "pi-capsule" options:tm

Note that the quoted name above can contain spaces.

Restart the service:

# systemctl restart netatalk

Check that netatalk has been installed correctly:

# afpd -V

afpd 3.1.12 - Apple Filing Protocol (AFP) daemon of Netatalk
[...]
afpd has been compiled with support for these features:

          AFP versions: 2.2 3.0 3.1 3.2 3.3 3.4 
         CNID backends: dbd last tdb 
      Zeroconf support: Avahi
  TCP wrappers support: Yes
         Quota support: Yes
   Admin group support: Yes
    Valid shell checks: Yes
      cracklib support: No
            EA support: ad | sys
           ACL support: Yes
          LDAP support: Yes
         D-Bus support: Yes
     Spotlight support: Yes
         DTrace probes: Yes

              afp.conf: /etc/netatalk/afp.conf
           extmap.conf: /etc/netatalk/extmap.conf
       state directory: /var/lib/netatalk/
    afp_signature.conf: /var/lib/netatalk/afp_signature.conf
      afp_voluuid.conf: /var/lib/netatalk/afp_voluuid.conf
       UAM search path: /usr/lib/netatalk//
  Server messages path: /var/lib/netatalk/msg/

Configure netatalk

# vim /etc/nsswitch.conf

Change this line:

hosts:          files mdns4_minimal [NOTFOUND=return] dns

to this:

hosts:          files mdns4_minimal [NOTFOUND=return] dns mdns4 mdns

Note that if you are running Netatalk 3.1.11 or above it is no longer necessary to create /etc/avahi/services/afpd.service. Using this file will cause an error.

If you are running an older version go ahead, otherwise jump to the next section.

Create /etc/avahi/services/afpd.service as root

# vim /etc/avahi/services/afpd.service

and fill it up with:

<?xml version="1.0" standalone='no'?><!--*-nxml-*-->
<!DOCTYPE service-group SYSTEM "avahi-service.dtd">
<service-group>
        <name replace-wildcards="yes">%h</name>
        <service>
                <type>_afpovertcp._tcp</type>
                <port>548</port>
        </service>
        <service>
                <type>_device-info._tcp</type>
                <port>0</port>
                <txt-record>model=TimeCapsule</txt-record>
        </service>
</service-group>

Configure the AFP service

Edit the configuration file.

# vim /etc/netatalk/afp.conf
[Global]
; Global server settings
mimic model = TimeCapsule6,106

[pi-capsule]
path = /backups/timecapsule
time machine = yes

Check configuration and reload if needed:

# systemctl status avahi-daemon	

[restart if necessary]
# systemctl restart netatalk

[Make the service automatically start]
# systemctl enable netatalk.service

If you go to your Mac’s Time Machine preferences the new volume will be available and you can start using it.

netatalk troubleshooting

Some notes of things to check from the server side (Time Capsule server):

https://wiki.archlinux.org/index.php/Netatalk

Backintime setup

Configuring Backintime

Prepare users

If you have disabled passwords and are only using keys, you will need to temporarily change the security settings to allow Backintime to exchange keys.

On the remote system/Pi/server:

# vim /etc/ssh/sshd_config
PasswordAuthentication yes
# systemctl restart ssh

Backintime uses SSH, so these user accounts need to be allowed to log in, and their default login shell needs to reflect this.

If not created already, assign the user a home directory. Finally, allow the user to read and write the folder containing the backups.

# usermod -s /usr/bin/bash backintime_john

# mkdir /home/backintime_john

# chown backintime_john:backintime_john /home/backintime_john/

# usermod -d /home/backintime_john/ backintime_john

# chown :backupusers /backups/backintime/

# chmod 775 /backups/backintime/

In a multi-user configuration, you might need to adjust permissions on some of the subfolders after the first backup:

# chown :backupusers /backups/backintime*/backintime

# chmod 770 /backups/backintime*/backintime/system1/
# chmod 770 /backups/backintime*/backintime/laptop2/

Prepare keys

To simplify things these are the roles:

[Local system]
The client machine that is running Backintime and that you want to backup your data from.

[Remote system]
The SSH server that has the storage where your backup is going to be stored.

From the local account you will run Backintime as (either your user or root, depending on how you run it), SSH into the remote system. In my case, a Raspberry Pi.

# ssh backintime_john@pi-capsule.local

After logging in check the host key.

$ ssh-keygen -l -f /etc/ssh/ssh_host_ecdsa_key.pub
256 SHA256:KjzU6aGqH6tXri/K87xz3H+cP35PMT7n+Ob6MIaBZb0 root@pi-capsule (ECDSA)

You can then log out from the remote machine.

From the local account you will run Backintime from, generate a new SSH key pair.

# ssh-keygen

And then copy the public key to the Pi.

# ssh-copy-id -i ~/.ssh/id_rsa.pub backintime_david@pi-capsule
[...]
ECDSA key fingerprint is SHA256:KjzU6aGqH6tXri/K87xz3H+cP35PMT7n+Ob6MIaBZb0.
[...]
Number of key(s) added: 1
[...]

Note that the fingerprint is the same as the one displayed in the previous step.

Configure Backintime profile

You can now configure the SSH profile from Backintime and make the first run.

In the General tab:

Mode:        SSH

SSH Settings
Host:        pi-capsule
Port:        22
User:        backintime_david
Path:        /backups/backintime_david
Cipher:      [Leave as default]
Private Key: /root/.ssh/id_rsa

Password
SSH private key: [empty in most cases]
Enable Cache Password

Advanced
Host:    tuxedo
User:    root
Profile: 2

Schedule
[Select appropriate setting after testing]

Include

/home
/etc/
/boot
/root
/steam
/opt
/usr
/var

Exclude (example)

/steam/steamapps/downloading
/var/cache
/vm/kvm_images/__security/Security_TryHackMe*.qcow2

Auto-remove
Older than: 10 years
If free space is less than: 50 GiB
If free inodes is less than: 2%

Smart remove
Run in background on remote host
Keep last:
14 days (7 days)
21 days (14 days)
8 weeks (6 weeks)
36 months (14 months)
Don't remove named snapshots

Options
Enable notifications
Backup replaced files on restore
Continue on errors (keep incomplete snapshots)
Log level: Changes & Errors

After the first run has completed you can check which is the best performing cipher from the CLI.

# backintime benchmark-cipher --profile-id 2

After a few rounds, aes192-ctr came out as the best performing cipher for me.

Secure SSH

If you changed the SSH configuration at the beginning, after setting everything up, remember to secure SSH again on the server/remote system.

# vim /etc/ssh/sshd_config
PasswordAuthentication no
# systemctl restart ssh

Restoring restrictions to backup users

The login account is required for Backintime to be able to run rsync. It is worth doing a bit more research on how to harden/limit these accounts.

Troubleshooting

Some examples of issues and troubleshooting steps you can apply.

Time Capsule can’t be reached / firewall settings

Make sure the server is allowing AFP connections from the Mac client.

# ufw allow proto tcp from CLIENT_IP to PI_CAPSULE_IP port 548

Time Capsule – Configuring Time Machine backups via the network on a macOS VM

The destination needs to be configured manually.

Mount the AFP/Time Capsule share via the Finder.

In the CLI configure the destination:

# tmutil setdestination -a /Volumes/pi-capsule

The backups can then be started from the GUI.

You can get information about the current configured destinations via the CLI.

# tmutil destinationinfo
====================================
Name            : pi-capsule
Kind            : Network
Mount Point     : /Volumes/pi-capsule
ID              : 7B648734-9BFC-417F-B5A1-F31B8AD52F4B

Time Capsule – Checking backup status

# tmutil currentphase
# tmutil status

ZFS stalling on a Raspberry Pi

Check the recordsize property. Reduce it to the default 128 KiB if it has been increased.

Reduce ARC size to reduce the amount of memory consumed/reserved for ZFS.
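
For reference, both can be checked and changed on a live system (backup_pool is an example pool name; the ARC value is in bytes):

# zfs get recordsize backup_pool
# zfs set recordsize=128K backup_pool
# echo 3221225472 > /sys/module/zfs/parameters/zfs_arc_max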

Understanding rsync logs

The logs indicate the type of change rsync is seeing, encoded as follows:

XYcstpoguax  path/to/file
|||||||||||
||||||||||╰- x: The extended attribute information changed
|||||||||╰-- a: The ACL information changed
||||||||╰--- u: The u slot is reserved for future use
|||||||╰---- g: Group is different
||||||╰----- o: Owner is different
|||||╰------ p: Permissions are different
||||╰------- t: Modification time is different
|||╰-------- s: Size is different
||╰--------- c: Different checksum (for regular files), or
||              changed value (for symlinks, devices, and special files)
|╰---------- the file type:
|            f: for a file,
|            d: for a directory,
|            L: for a symlink,
|            D: for a device,
|            S: for a special file (e.g. named sockets and fifos)
╰----------- the type of update being done:
             <: file is being transferred to the remote host (sent)
             >: file is being transferred to the local host (received)
             c: local change/creation for the item, such as:
                - the creation of a directory
                - the changing of a symlink,
                - etc.
             h: the item is a hard link to another item (requires 
                --hard-links).
             .: the item is not being updated (though it might have
                attributes that are being modified)
             *: means that the rest of the itemized-output area contains
                a message (e.g. "deleting")

Some example output:

>f+++++++++ some/dir/new-file.txt
.f....og..x some/dir/existing-file-with-changed-owner-and-group.txt
.f........x some/dir/existing-file-with-changed-unnamed-attribute.txt
>f...p....x some/dir/existing-file-with-changed-permissions.txt
>f..t..g..x some/dir/existing-file-with-changed-time-and-group.txt
>f.s......x some/dir/existing-file-with-changed-size.txt
>f.st.....x some/dir/existing-file-with-changed-size-and-time-stamp.txt 
cd+++++++++ some/dir/new-directory/
.d....og... some/dir/existing-directory-with-changed-owner-and-group/
.d..t...... some/dir/existing-directory-with-different-time-stamp/ 



ZFS: Setting up ZFS storage on Ubuntu

If you are new to ZFS, I would advise doing a little bit of research first to understand the fundamentals. Jim Salter's articles on storage and ZFS are highly recommended.

https://arstechnica.com/information-technology/2020/05/zfs-101-understanding-zfs-storage-and-performance/

The examples below create a pool from a single disk, with separate datasets used for network backups.

In some examples, I might use device names for simplicity, but you are advised to use disk IDs or serials.

Installing ZFS

Ubuntu makes it very easy.

# apt install zfsutils-linux

ZFS Cockpit module

If Cockpit is installed, it is possible to install a module for ZFS. This module is sadly no longer in development. If you know of alternatives, please share!

$ git clone https://github.com/optimans/cockpit-zfs-manager.git
[...]
# cp -r cockpit-zfs-manager/zfs /usr/share/cockpit

Configuring automatic snapshots

This service generates automatic snapshots every hour, and it can be configured with your preferred retention periods.

# apt install zfs-auto-snapshot

The snapshot retention is set in the following files:

/etc/cron.hourly/zfs-auto-snapshot
/etc/cron.daily/zfs-auto-snapshot
/etc/cron.weekly/zfs-auto-snapshot
/etc/cron.monthly/zfs-auto-snapshot

By default, the configuration runs the following snapshots and retention policies:

Period    Retention
Hourly    24 hours
Daily     31 days
Weekly    8 weeks
Monthly   12 months

I configured the following snapshot retention policy:

Period    Retention
Hourly    48 hours
Daily     14 days
Weekly    4 weeks
Monthly   3 months

Hourly

# vim /etc/cron.hourly/zfs-auto-snapshot
#!/bin/sh

# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0

exec zfs-auto-snapshot --quiet --syslog --label=hourly --keep=48 //

Daily

# vim /etc/cron.daily/zfs-auto-snapshot
#!/bin/sh

# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0

exec zfs-auto-snapshot --quiet --syslog --label=daily --keep=14 //

Weekly

# vim /etc/cron.weekly/zfs-auto-snapshot
#!/bin/sh

# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0

exec zfs-auto-snapshot --quiet --syslog --label=weekly --keep=4 //

Monthly

# vim /etc/cron.monthly/zfs-auto-snapshot
#!/bin/sh

# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0

exec zfs-auto-snapshot --quiet --syslog --label=monthly --keep=3 //
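
Once the cron jobs have run, you can check that snapshots are being created and pruned as expected; zfs-auto-snapshot includes the label in the snapshot names:

$ zfs list -t snapshot -o name | grep zfs-auto-snap_daily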

Setting up the ZFS pool

This post has several use cases and examples, and I recommend it highly if you want further details on different commands and ways to configure your pools.

https://www.thegeekdiary.com/zfs-tutorials-creating-zfs-pools-and-file-systems/

In my example there is no resilience, as there is only one attached disk. For me, this is acceptable because I have an additional local backup besides this filesystem.

It is preferable to have a second backup (ideally off-site) than a single one, regardless of any added resilience you might configure.

I create a single pool with an external drive. Read below for an explanation of the different command flags.

# zpool create -f \
    -o ashift=12 \
    -O compression=lz4 \
    -O acltype=posixacl \
    -O xattr=sa \
    -O relatime=on \
    -O atime=off \
    -O normalization=formD \
    -O canmount=off \
    -O dnodesize=auto \
    -O sync=standard \
    backup_pool scsi-SSeagate_Desktop_NA7HP4VK
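
Once the pool is created, it is worth confirming that it is healthy and that ashift was applied as intended:

# zpool status backup_pool
# zpool get ashift backup_pool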

Block size / ashift

Of the above values, the most important one by far is ashift.

The ashift property sets the sector size the vdev uses, as a power of two. It can't be changed once the vdev is created, and if it doesn't match the drive it will cause massive performance issues with the filesystem.

Find out your drive’s optimal block size and match it to ashift.

It is set in bits: the sector size is 2^ashift bytes.

bits  sector size
9     512 bytes
10    1 KiB
11    2 KiB
12    4 KiB
13    8 KiB
14    16 KiB
15    32 KiB
16    64 KiB

recordsize is another property with a performance impact, especially on the Raspberry Pi. Smaller values can improve random I/O, while higher values give better throughput and compression for sequential data. The problem on the Raspberry Pi was that with a value of 1M the system load increased until filesystem activity eventually stalled and the system had to be restarted.

The default value (128k) has performed without any noticeable issue.

Compression

lz4 compression yields a good performance/compression balance. It will usually make the storage perform faster than running without compression, since less data has to hit the disk.

ZFS 0.8 doesn’t give many choices regarding compression but bear in mind that you can change the algorithm on a live system.

gzip will impact performance but yields a higher compression rate. It might be worth checking the performance with different compression formats on the Pi 4. With older Raspberry Pi models, the limitation will be the USB / network in most cases.

For reference, on the same amount of data these were the compression ratios I obtained:

gzip-7
backup_pool 1.34x
backup_pool/backintime 1.35x
backup_pool/timecapsule 1.33x

lz4
backup_pool 1.27x
backup_pool/backintime 1.30x
backup_pool/timecapsule 1.33x

All in all, the performance impact and memory consumption didn’t make switching from lz4 worthwhile.

Permissions

acltype=posixacl
xattr=sa

These enable POSIX ACLs and store the Linux extended attributes in the inodes rather than in separate hidden files.

Access times

atime is recommended to be disabled (off) to reduce the number of IOPS.

relatime offers a good compromise between the atime and noatime behaviours.

Normalisation

The normalization property indicates whether a file system should perform a Unicode normalisation of file names whenever two file names are compared and which normalisation algorithm should be used.

formD is the default set by Canonical when setting up a pool. It seems to be a good choice when sharing the volume with macOS systems (e.g. via NFS), as it avoids files not being displayed because their names use non-ASCII characters.

Additional properties

The pool is configured with the canmount property off so that it can’t be mounted.

This is because I will be creating separate datasets, one for Time Capsule backups, and another two for Backintime, and I don’t want them to mix.

All datasets will share the same pool, but I don’t want the pool root to be mounted. Only datasets will mount.

dnodesize is set to auto, as per several recommendations when datasets are using the xattr=sa property.

sync is set as standard. There is a performance hit for writes, but disabling it comes at the expense of data consistency if there is a power cut or similar.

A brief test showed a lower system load when sync=standard than with sync=disabled. Also, with standard there were fewer spikes. It is likely that the performance is lower, but it certainly causes the system to suffer less.

Encryption

I am not too keen on encrypting physically secure volumes, because when doing data recovery you are adding an additional layer that might hamper and slow things down.

For reference, I am writing down an example of encryption options using an external key for a volume. This might not be appropriate for your particular scenario. Research alternatives if needed.

-O encryption=aes-256-gcm \
-O keylocation=file:///etc/pool_encryption_key \
-O keyformat=raw

Pool options

Automatic trimming of the pool is essential for SSDs:

# zpool set autotrim=on backup_pool

Disable automatic mounting for the pool. (This applies only to the root of the pool; the datasets can still be set to be mountable regardless of this setting.)

# zfs set canmount=off backup_pool

Setting up the ZFS datasets

I will create three separate datasets with assigned quotas for each.

[Create datasets]
# zfs create backup_pool/backintime_tuxedo
# zfs create backup_pool/backintime_ab350
# zfs create backup_pool/timecapsule

[Set mountpoints]
# zfs set mountpoint=/backups/backintime_tuxedo  backup_pool/backintime_tuxedo
# zfs set mountpoint=/backups/backintime_ab350  backup_pool/backintime_ab350
# zfs set mountpoint=/backups/timecapsule  backup_pool/timecapsule

[Set quotas]
# zfs set quota=2T backup_pool/backintime_tuxedo
# zfs set quota=2T backup_pool/backintime_ab350
# zfs set quota=2T backup_pool/timecapsule
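
You can confirm the datasets, quotas and mountpoints in one view:

$ zfs list -o name,quota,mountpoint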

Changing compression on a dataset

The default lz4 compression is recommended. gzip consumes a lot of CPU and makes data transfers slower, impacting backup restoration.

If you still want to change the compression for a given dataset:

# zfs set compression=gzip-7 backup_pool/timecapsule

A comparison of compression and decompression using different algorithms with OpenZFS:

https://github.com/openzfs/zfs/pull/9735

Querying pool properties, current compression algorithm and compress ratio

# zfs get all backup_pool
# zfs get compression backup_pool
# zfs get compressratio backup_pool
# zfs get all | grep compressratio

Changing ZFS settings

For reference, below are some examples of properties and settings that can be changed after a pool has already been created.

Renaming pools and datasets

If, for any reason, a dataset was given a name that needs to be changed, it can be renamed like this:

# zfs rename backup_pool/Test1 backup_pool/backintime_tuxedo

A zpool can be renamed by exporting and importing it.

# zpool export test_pool
# zpool import test_pool backup_pool

Attaching mirror disks

You can attach an additional disk/partition to turn the single-disk vdev into a mirror, making the pool redundant. Note that attaching only creates mirrors; it can't convert the vdev into a RAID-Z, RAID-Z2 or RAID-Z3 configuration.

# zpool attach backup_pool /dev/sda7 /dev/sdb7

Renaming disks in pools

By default, Ubuntu uses device names for the disks. This should not be an issue, but in some cases adding or connecting drives might change the device name order and degrade one or more pools.

This is why creating a pool with disk IDs or serials is recommended. You can still fix this if you created your pool using device names.

With the pool unmounted, export it, and reimport pointing to the right path:

# zpool export backup_pool
# zpool import -d /dev/disk/by-id/ backup_pool

There are additional examples in this handy blog post:

https://plantroon.com/changing-disk-identifiers-in-zpool/

ZFS optimisation

ZFS should be running on a system with at least 4GiB of RAM. If you plan to use it on a Raspberry Pi (or any other system with limited resources), reduce the ARC size.

In this case, I am limiting it to 3GiB. It is a change that can be done live:

# echo 3221225472 > /sys/module/zfs/parameters/zfs_arc_max

To make it persistent between boots:

# vim /etc/modprobe.d/zfs.conf

[add this line]
options zfs zfs_arc_max=3221225472

# update-initramfs -u

You can check the ARC statistics:

$ less /proc/spl/kstat/zfs/arcstats
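
The file is long; the current ARC size and the configured ceiling can be filtered out directly (size and c_max are the relevant fields):

$ grep -E '^(size|c_max)' /proc/spl/kstat/zfs/arcstats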

More on ZFS performance

Some other links with interesting points on performance:

https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Workload%20Tuning.html

https://icesquare.com/wordpress/how-to-improve-zfs-performance/




Linux / Ubuntu / hdparm: Identifying drive features and setting sleep patterns

Preparing the storage

Install hdparm and smartmontools

Install hdparm and the SMART monitoring tools.

# apt install hdparm smartmontools

Identify the right hard drive

Make sure you identify the correct drive, as some of the commands will destroy data. If you don’t understand the commands, then check them first. You have been warned.

Identify the block size

Knowing the block size of the device is important. It helps optimise writes and, in the case of SSDs or flash drives, avoid write amplification and wear.

[List details of all drives]

# fdisk -l

[...]
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 512 bytes / 4096 bytes
[..]

[List details of a specific drive]

# fdisk -l /dev/sda
[...]
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
[...]
# smartctl --all /dev/sda
[...]
Sector Sizes:     512 bytes logical, 4096 bytes physical
[...]

Pay attention to the physical/optimal size. This is the one that matters.

SSDs will hide the true size of the pages and blocks. Even the same drive models might be built with different components, so getting it right is tricky.

Some suggest that 4kB is a generally good size for SSDs: https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ssd-server-storage-applications-paper.pdf

Use the drive's physical sector size to choose the matching ZFS ashift (block size).
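
For a quick overview, lsblk can also print the logical and physical sector sizes for all drives in one go:

$ lsblk -o NAME,LOG-SEC,PHY-SEC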

Retrieve drive IDs

When setting up ZFS pools or using disk tools, it is best to avoid device names, as their order can easily change. Using the drive ID or serial ensures the correct drive is selected no matter which port it is plugged into or in which order the drives come up.

This matters with any disk accessing utility if you have several drives, or will be inserting external drives regularly.

$ ls -l /dev/disk/by-id/
[...]

lrwxrwxrwx 1 root root  9 Mar  9 13:16 usb-TOSHIBA_External_USB_3.0_20150612015531-0:0 -> ../../sda

[...]

You can also extract model and serial numbers with hdparm.

# hdparm -i /dev/sda

/dev/sda:

 Model=WDC WD10EZEX-08WN4A0, FwRev=01.01A01, SerialNo=WD-WCC6Y5FXAPHV
 [...]

Even better, depending on the use of the drive and whether mirror drives might be added later, is to partition the drive yourself to ensure a slightly different drive model can still be matched later. Although I believe ZFS already does this, rounding partitions down to whole mebibytes.

Test for damaged sectors

An additional and optional step is to test the hard drive for damaged sectors. This kind of test tends to be destructive so it is best if it is done before configuring the pools.

badblocks is a useful tool to achieve this.

It is usually installed by default, but if not, install it manually.

# apt install e2fsprogs

A destructive test can be done with:

# badblocks -wsv -b 4096 /dev/sda

If you want to run the test while preserving the disk data you can run it in a non-destructive way. This will take longer.

# badblocks -nsv -b 4096 /dev/sda

ZFS has built-in checks and protection so in most cases you can skip this step.

Setting hard drive sleep patterns

Above I explained that using disk IDs is always a better idea. For simplicity, I will be using device names in several examples below, but I still advise using IDs or serials.

Check if the disk supports sleep

Check if the drive supports standby.

# hdparm -y /dev/sda

If supported the output will be:

/dev/sda:
 issuing standby command

Any other output might indicate that the drive doesn’t support sleep, or that a different tool/setting might be required.

Next, check if the drive supports write cache:

# hdparm -I /dev/sda | grep -i 'Write cache'

The expected output is:

           *    Write cache

The * indicates that the feature is supported.

An example of a complete hdparm output from a drive is shown below for reference. Different drives, with different features, will show different output, or even none at all.

# hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
        Model Number:       TOSHIBA MD04ACA500                      
        Serial Number:      55OBK0SPFPHC
        Firmware Revision:  FP2A    
        Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
        Supported: 8 7 6 5 
        Likely used: 8
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:    16514064
        LBA    user addressable sectors:   268435455
        LBA48  user addressable sectors:  9767541168
        Logical  Sector size:                   512 bytes
        Physical Sector size:                  4096 bytes
        Logical Sector-0 offset:                  0 bytes
        device size with M = 1024*1024:     4769307 MBytes
        device size with M = 1000*1000:     5000981 MBytes (5000 GB)
        cache/buffer size  = unknown
        Form Factor: 3.5 inch
        Nominal Media Rotation Rate: 7200
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        Advanced power management level: 128
        DMA: sdma0 sdma1 sdma2 mdma0 mdma1 *mdma2 udma0 udma1 udma2 udma3 udma4 udma5 
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4 
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    NOP cmd
           *    DOWNLOAD_MICROCODE
           *    Advanced Power Management feature set
                SET_MAX security extension
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    WRITE_{DMA|MULTIPLE}_FUA_EXT
           *    64-bit World wide name
           *    WRITE_UNCORRECTABLE_EXT command
           *    {READ,WRITE}_DMA_EXT_GPL commands
           *    Segmented DOWNLOAD_MICROCODE
                unknown 119[7]
           *    Gen1 signaling speed (1.5Gb/s)
           *    Gen2 signaling speed (3.0Gb/s)
           *    Gen3 signaling speed (6.0Gb/s)
           *    Native Command Queueing (NCQ)
           *    Host-initiated interface power management
           *    Phy event counters
           *    Host automatic Partial to Slumber transitions
           *    Device automatic Partial to Slumber transitions
           *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
                DMA Setup Auto-Activate optimization
                Device-initiated interface power management
           *    Software settings preservation
           *    SMART Command Transport (SCT) feature set
           *    SCT Write Same (AC2)
           *    SCT Error Recovery Control (AC3)
           *    SCT Features Control (AC4)
           *    SCT Data Tables (AC5)
           *    reserved 69[3]
Security: 
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
        not     frozen
        not     expired: security count
                supported: enhanced erase
        more than 508min for SECURITY ERASE UNIT. more than 508min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 500003964bc01970
        NAA             : 5
        IEEE OUI        : 000039
        Unique ID       : 64bc01970
Checksum: correct

An example of a complete smartctl output from a drive is shown below also for reference. As mentioned earlier, different systems will generate different outputs.

# smartctl --all /dev/sda
smartctl 7.1 2019-12-30 r5022 [aarch64-linux-5.4.0-1029-raspi] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 3.5" MD04ACA... Enterprise HDD
Device Model:     TOSHIBA MD04ACA500
Serial Number:    55OBK0SPFPHC
LU WWN Device Id: 5 000039 64bc01970
Firmware Version: FP2A
User Capacity:    5,000,981,078,016 bytes [5.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Mar  8 15:02:10 2021 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  120) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 533) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       9003
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       9222
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   084   084   000    Old_age   Always       -       6418
 10 Spin_Retry_Count        0x0033   253   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       9212
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       482
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       104
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       9225
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       37 (Min/Max 15/72)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       0
222 Loaded_Hours            0x0032   085   085   000    Old_age   Always       -       6393
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       214
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5617         -
# 2  Short offline       Completed without error       00%      4702         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

More information about hdparm and smartctl is available on the following sites.

hdparm

https://wiki.archlinux.org/index.php/Hdparm#Power_management_configuration

http://www.htpcguides.com/spin-down-and-manage-hard-drive-power-on-raspberry-pi/

http://www.linux-magazine.com/Online/Features/Tune-Your-Hard-Disk-with-hdparm

smartctl

https://codeyarns.com/2016/12/21/how-to-use-smartctl/

https://linuxhandbook.com/check-ssd-health/

Configure the drive standby

Check the current standby configuration.

# hdparm -B /dev/sd[a-e]

/dev/sda:
 APM_level  = not supported

/dev/sdb:
 APM_level  = 254

/dev/sdc:
 APM_level  = not supported

/dev/sdd:
 APM_level  = 254

/dev/sde:
 APM_level  = 254

Values         Description
1 to 127       Power management is enabled. The lower the value, the more aggressive the power management.
128 to 254     Power management is enabled but doesn't allow spindown.
255            The feature is disabled.
not supported  The drive doesn't support APM.

The status can be set manually:

# hdparm -B 127 /dev/sda

The IDE power mode status can be queried with:

# hdparm -C /dev/sd[ab]

/dev/sda:
 drive state is:  active/idle

/dev/sdb:
 drive state is:  standby

For reference, several drives can be queried at the same time using different wildcards.

# hdparm -B /dev/sd?
# hdparm -C /dev/sd*
# hdparm -I /dev/sd[a-e]

Depending on the drive manufacturer and model you might need to query the settings with different flags. Check the man page.

[Get/set  the  Western  Digital Green Drive's "idle3" timeout value.]
# hdparm -J /dev/sd[a-e]

/dev/sda:
 wdidle3      = 300 secs (or 13.8 secs for older drives)

/dev/sdb:
 wdidle3      = 8.0 secs

/dev/sdc:
 wdidle3      = 300 secs (or 13.8 secs for older drives)

/dev/sdd:
 wdidle3      = 300 secs (or 13.8 secs for older drives)

/dev/sde:
 wdidle3      = 300 secs (or 13.8 secs for older drives)

From the man page:

A setting of 30 seconds is recommended for Linux use. Permitted values are from 8 to 12 seconds, and from 30 to 300 seconds in 30-second increments. Specify a value of zero (0) to disable the WD idle3 timer completely (NOT RECOMMENDED!).

There are flags for temperature (-H for Hitachi drives), acoustic management (-M), measuring cache performance (-T), and others. Go on, read that man page. 🙂

The -S flag sets the standby/spindown timeout for the drive. Basically, how long the drive will wait with no disk activity before turning off the motor.

Value       Description
0           Disable the feature.
1 to 240    Multiples of five seconds (a value of 120 means 10 minutes).
241 to 251  Thirty-minute intervals (a value of 242 means 1 hour).

Note that hdparm might wake the drive up when it is queried. smartctl can query the drive without waking it.

# smartctl -i -d auto -n standby /dev/sda

Making the hdparm configuration persistent

Information about all the options is available at https://manpages.ubuntu.com/manpages/bionic/man5/hdparm.conf.5.html and also in the default configuration file generated by hdparm.

Example values from the data gathered above:

# APM setting (-B)
apm = 127

# APM setting while on battery (-B)
apm_battery = 127

# on/off drive's write caching feature (-W)
write_cache = on

# Standby (spindown) timeout for drive (-S)
spindown_time = 120

# Western  Digital  (WD)  Green Drive's "idle3" timeout value. (-J)
wdidle3 = 300

hdparm.conf method

Edit the configuration file:

# vim /etc/hdparm.conf

And insert an entry for each drive. Select only settings/features/values that are supported by that drive, otherwise the rest of the options won’t be applied. Test, test, test!

# Drive A
/dev/disk/by-id/ata-WDC_WD40NMZM-59Y94S1_WD-WX41D296P1XX {
apm = 127
apm_battery = 127
write_cache = on
spindown_time = 120
#wdidle3 = 300
}

udev method

In my case, the above method works. I couldn’t get this one to work on my system, but it could be because of the OS. I am leaving it for reference in case it might be of help.

# vim /etc/udev/rules.d/69-disk.rules

Create an entry for each drive, editing the serial number and hdparm parameters. Make sure that only supported flags are added or the rule will fail.

ACTION=="add", KERNEL=="sd[a-z]", ENV{ID_SERIAL_SHORT}=="S3R14LNUMB3R", RUN+="/usr/bin/hdparm -B 127 -S 120 /dev/%k"

You can also apply the same parameters to all rotational drives (all non-SSD ones) in one go.

ACTION=="add|change", KERNEL=="sd[a-z]", ATTRS{queue/rotational}=="1", RUN+="/usr/bin/hdparm -B 127 -S 120 /dev/%k"



NTP: Setting up an NTP server

chrony is the default service on newer OS releases (Red Hat 7.2 and later, any recent Ubuntu release).

chrony has several advantages over ntpd:

  • Quicker synchronisation.
  • Better response to changes in clock frequency (very useful for VMs).
  • Periodic polling of time servers isn’t required.

It lacks some features like broadcast, multicast, and Autokey packet authentication. When those are required, or for systems that are going to be switched on continuously, ntpd is a better choice.

A more comprehensive comparison list is available here:

https://chrony.tuxfamily.org/comparison.html

Locate a pool or set of public NTP servers as close as possible to you.

https://www.pool.ntp.org/en/

Setting a chrony NTP server

chrony is installed by default on many distros. If you don’t already have it, install it.

Edit the configuration file.

# vi /etc/chrony.conf

Make the following changes.

# Edit the time sources of your choice
# iburst helps make the initial sync faster

server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
server 2.pool.ntp.org iburst
server 3.pool.ntp.org iburst

# Helps stabilise the initial sync across restarts

driftfile /var/lib/chrony/drift

# Allows serving time even if above sources aren't available

local stratum 8

# Opens the NTP port to respond to clients' requests
# Edit it with your clients' subnet

allow 192.168.1.0/24

# Enables support for the settime command in chronyc

manual

Start and enable the service.

# systemctl start chronyd

# systemctl enable chronyd

Check the firewall configuration in the last section.

Chrony client configuration

server [IP/HOSTNAME OF ABOVE SERVER] iburst
driftfile /var/lib/chrony/drift
logdir /var/log/chrony
log measurements statistics tracking

Checking chrony

[Check if the service is running]
$ systemctl status chrony

[Display the system's clock performance]
$ chronyc tracking

[Display time sources]
$ chronyc sources
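
On the server itself you can also list the clients that have polled it (requires root):

[Display NTP clients]
# chronyc clients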

More information on chrony:

https://chrony.tuxfamily.org/faq.html

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/ch-configuring_ntp_using_the_chrony_suite

Setting an ntpd NTP server

Install ntpd for your distro if not already present.

# yum install ntp
# dnf install ntp
# apt install ntp

Syncing to the server’s own system clock

If the system is going to be isolated, with no internet connection, or any other time source available you can use its internal clock.

Edit /etc/ntp.conf.

# To point ntpd to sync with its own system clock
server 127.127.1.0 prefer 
fudge 127.127.1.0
driftfile /etc/ntp.drift
tracefile /etc/ntp.trace

This will work in a network “island”, but the time won't be correct. It is best to sync from other time sources (next section).

Syncing to other NTP servers

# Edit the time sources of your choice
# iburst helps make the initial sync faster
server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
server 2.pool.ntp.org iburst
server 3.pool.ntp.org iburst

# Insert your own subnet address
# nomodify - Disallows clients from configuring the server
# notrap - Clients can't be used as peers for time sync
restrict 192.168.1.0 netmask 255.255.255.0 nomodify notrap

# Indicates where to keep logs
logfile /var/log/ntp.log

Start, enable and check ntpd status:

# systemctl start ntpd
# systemctl enable ntpd
# systemctl status ntpd

Remember that you will need to open your firewall to allow NTP queries. There are some instructions further down.

ntpd client configuration

server [IP/HOSTNAME OF ABOVE SERVER] iburst
driftfile /var/lib/ntp/drift

Checking ntpd

$ ntpq -p
$ date -R

More information on ntpd:

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/ch-configuring_ntp_using_ntpd

Firewall

Remember that you might need to open your firewall for clients to connect to your server.

[Red Hat / CentOS]

# firewall-cmd --add-service=ntp --permanent
# firewall-cmd --reload

[Ubuntu]

# ufw allow ntp

or

# ufw allow 123/udp

You might want to also modify the rule to limit access only to certain subnets or clients.
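
For example, a rule limited to the client subnet used earlier in this post (adjust to your network):

# ufw allow from 192.168.1.0/24 to any port 123 proto udp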

You can add lines to the chrony and ntpd configurations to allow IPv6 traffic. You would also need additional firewall rules. IPv4 is shown here for simplicity (and also because I don't have the requirement). 🙂




Raspberry Pi: Installing, hardening and optimising Ubuntu 20.04 Server

I have been trying to document the process of configuring a Raspberry Pi as a Time Machine Capsule, but the article became far too long. It covered far too much information and was really hard to read.

I then decided to break the stages into more manageable steps. This has the advantage of allowing the common stages, like setting up the OS, to be shared between different projects.

Therefore, this is that first entry. Some others will follow about how to build different things from this first base image.

Selecting the OS

The 64-bit beta release of Raspberry Pi OS I tried didn't let ZFS install easily. Ubuntu has the advantage of being a like-for-like experience regardless of the platform, so it is my preferred choice. Any experience you gain with it will be easily transferable.

You can download Ubuntu Server images from https://ubuntu.com/download/raspberry-pi. The LTS version is also the preferred one.

The Raspberry Pi model will determine the supported versions of the OS.

Model           32-bit Ubuntu  64-bit Ubuntu
Raspberry Pi 2  Supported      Not supported
Raspberry Pi 3  Supported      Recommended
Raspberry Pi 4  Supported      Recommended

Supported Ubuntu versions.

The Raspberry Pi 3 has limited benefits when using the 64-bit image due to its limited RAM. In addition, it won’t support ZFS for the same reason. The Pi will restart/reset when ZFS volumes are accessed due to a lack of RAM.

If you are going to use a GUI, you should choose a Raspberry Pi 4 with at least 4GB of RAM.

The image can be directly installed on a micro SD card:

# ddrescue -y -c 4Ki ubuntu-20.04.3-preinstalled-server-arm64+raspi.img /dev/sdxx

Installing Ubuntu Server on a USB stick

It is possible to boot from a USB stick, which is preferable for several reasons. They are cheaper, easier to access from another system, and simple to replace.

First, enable USB boot on your Pi.

Model USB Boot Support Notes
Raspberry Pi 1 Not supported n/a
Raspberry Pi 2 and 3B Supported On Raspberry Pi OS echo program_usb_boot_mode=1 | sudo tee -a /boot/config.txt and reboot.
Raspberry Pi 3B+ Supported Supported out of the box
Raspberry Pi 4 Supported On Raspberry Pi OS rpi-eeprom-config --edit and set BOOT_ORDER=0xf41 and reboot.
Raspberry Pis with USB boot support.

You might have to boot from an SD card at least once to configure USB boot. Once enabled, it remains activated.

Additional information about the different boot modes for the Raspberry Pi

The following links are provided for reference.

Raspberry Pi booting from USB mass storage https://www.raspberrypi.org/documentation/computers/raspberry-pi.html#booting-from-usb-mass-storage

Raspberry Pi 4 bootloader configuration https://www.raspberrypi.org/documentation/computers/raspberry-pi.html#raspberry-pi-4-bootloader-configuration

Raspberry Pi 4 boot flow https://www.raspberrypi.org/documentation/computers/raspberry-pi.html#raspberry-pi-4-boot-flow

Configuration steps

Once the Pi has been configured to boot from a USB device, install the image on a USB stick like the SD card.

# ddrescue -f -y -c 4Ki ubuntu-20.04.3-preinstalled-server-arm64+raspi.img /dev/sdxx

For the image to be bootable from USB, you need to make some changes. I extracted the steps from this Raspberry Pi forum post.

There are two options to make the changes:

  • Mount the USB stick on another system, and then issue the commands on the USB device. This other system can be the Raspberry Pi itself booting from the SD card, and accessing the USB device.
  • Or make the changes on the SD card, and then copy the SD card image to the USB device.

Apply the following changes.

1) On the /boot of the USB device, uncompress vmlinuz.

$ cd /media/*/system-boot/
$ zcat vmlinuz > vmlinux
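
As an optional sanity check, file should now report an uncompressed kernel image rather than gzip data (the exact wording varies between versions):

$ file vmlinux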

2) Update the config.txt file. This example shows the [pi4] section, but it has also been tested on a Pi 3; put the settings under the section matching your Pi model.

$ vim config.txt

The dtoverlay line might be optional for headless systems, but if you have the time and inclination, there is some documentation regarding Raspberry Pi’s device tree parameters.

[pi4]
kernel=vmlinux
max_framebuffers=2
dtoverlay=vc4-fkms-v3d
boot_delay
initramfs initrd.img followkernel

3) Create a script in the boot partition called auto_decompress_kernel with the following content:

#!/bin/bash -e

## Set variables

BTPATH=/boot/firmware
CKPATH=$BTPATH/vmlinuz
DKPATH=$BTPATH/vmlinux

## Check if decompression needs to be done

if [ -e "$BTPATH/check.md5" ]; then
	if md5sum --status --ignore-missing -c "$BTPATH/check.md5"; then
		echo -e "\e[32mFiles have not changed, decompression not needed\e[0m"
		exit 0
	else
		echo -e "\e[31mHash failed, kernel will be decompressed\e[0m"
	fi
fi

# Back up the old decompressed kernel

if ! mv "$DKPATH" "$DKPATH.bak"; then
	echo -e "\e[31mDECOMPRESSED KERNEL BACKUP FAILED!\e[0m"
	exit 1
else
	echo -e "\e[32mDecompressed kernel backup was successful\e[0m"
fi

# Decompress the new kernel

echo "Decompressing kernel: $CKPATH.............."

if ! zcat "$CKPATH" > "$DKPATH"; then
	echo -e "\e[31mKERNEL FAILED TO DECOMPRESS!\e[0m"
	exit 1
else
	echo -e "\e[32mKernel decompressed successfully\e[0m"
fi

# Hash the new kernel for checking

if ! md5sum "$CKPATH" "$DKPATH" > "$BTPATH/check.md5"; then
	echo -e "\e[31mMD5 GENERATION FAILED!\e[0m"
else
	echo -e "\e[32mMD5 generated successfully\e[0m"
fi

exit 0

Normally you would need to mark the script as executable, but unless you modify the partition from its FAT32 default, there is no executable flag to set. So leave it as it is.

If you can mount the root filesystem in the system you are using to edit the files, you can go ahead with steps 4 and 5. Otherwise, you should be able to boot now and manually do these steps after your first boot.

4) Create a file in the /etc/apt/apt.conf.d/ directory and call it 999_decompress_rpi_kernel

# cd /media/*/writable/etc/apt/apt.conf.d/
# vi 999_decompress_rpi_kernel

Fill the file with the following content:

DPkg::Post-Invoke {"/bin/bash /boot/firmware/auto_decompress_kernel"; };

5) Make the script executable.

# chmod 744 999_decompress_rpi_kernel
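
Once the Pi has booted, you can run the hook manually to confirm it works; on the running system the boot partition is mounted at /boot/firmware, so:

# bash /boot/firmware/auto_decompress_kernel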

You can save yourself some time and configure the network at this stage.

In my case, I have a static DHCP lease associated with the Pi MAC address, but if you don’t, you can configure the network with a static IP address by editing the network-config file in /boot.

$ cd /media/*/system-boot/
$ vim network-config

An example of a static address entry would be:

version: 2
ethernets:
  eth0:
    dhcp4: no
    addresses: [192.168.1.201/24]
    gateway4: 192.168.1.254
    nameservers:
      addresses: [192.168.1.254]
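
After the first boot, you can confirm that the static address and gateway were applied (eth0 as in the example above):

$ ip addr show eth0
$ ip route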

You can now eject the USB drive, insert it into the Raspberry Pi, and boot.

Setting up Ubuntu

The default user name and password are ubuntu / ubuntu.

Upon login, you will be asked to change your password. We will delete this user in the following steps to increase security.

Run an update:

$ sudo su
# apt update
# apt upgrade

Setting up users

Create a new user (or change the name of the existing user).

# adduser <newuser>

List the groups of the user ubuntu and compare them with the new user's.

# id ubuntu ; echo ; id <newuser>

uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),115(netdev),118(lxd)

uid=1001(newuser) gid=1001(newuser) groups=1001(newuser)

Add the new user to the same groups.

# usermod -a -G adm,dialout,cdrom,floppy,sudo,audio,dip,video,plugdev,netdev,lxd <newuser>
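
Before the ubuntu user is deleted later on, it is worth confirming that the new account can elevate privileges:

# su - <newuser>
$ sudo -v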

Hostname

Set the hostname of your choice.

# hostnamectl set-hostname <system_name>

[Check the change]

# hostnamectl
   Static hostname: pi-capsule
         Icon name: computer
        Machine ID: db0a1818241a47e178f229294f6864ae
           Boot ID: 983818fbaa8246348066c36f2237636e
  Operating System: Ubuntu 20.04.2 LTS
            Kernel: Linux 5.4.0-1029-raspi
      Architecture: arm64

Date and time

Set the time zone.

# timedatectl set-timezone Europe/London

Configure the time sources by editing /etc/systemd/timesyncd.conf.

[Time]
NTP=uk.pool.ntp.org
FallbackNTP=ntp.ubuntu.com

Restart the service.

# systemctl restart systemd-timesyncd.service

Check the service status and confirm that the time source is correct.

# systemctl status systemd-timesyncd.service
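
On Ubuntu 20.04, you can also query which server timesyncd is actually using:

$ timedatectl timesync-status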

Finally, check that the time zone is correct.

# timedatectl status
               Local time: Sun 2021-08-29 23:24:49 BST
           Universal time: Sun 2021-08-29 22:24:49 UTC
                 RTC time: n/a                        
                Time zone: Europe/London (BST, +0100) 
System clock synchronized: yes                        
              NTP service: active                     
          RTC in local TZ: no

Customising the MOTD

You can get the MOTD from the login screen manually with the following command.

$ for i in /etc/update-motd.d/* ; do if [ "$i" != "/etc/update-motd.d/98-fsck-at-reboot" ]; then $i; fi; done

To get system information (including temperature):

$ /etc/update-motd.d/50-landscape-sysinfo

You can edit, add and reorder scripts in /etc/update-motd.d/.
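
For example, to stop one of the stock scripts from running at login, clear its executable bit (10-help-text is just an illustrative choice):

# chmod -x /etc/update-motd.d/10-help-text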

Configuring SSH

SSH will be enabled by default. Test access with the newly created account.

By default, only a password is required to access the server, but we will add the requirement of an SSH key in addition to the password, and also limit access to authorised IP addresses.

If you haven’t generated a public and private key pair on your system (the one used to log into the Pi), you will need to do it (explained below).

A brief note on encryption. Elliptic curve cryptography (ECC) generates smaller keys and performs faster operations than RSA. The smaller ECC keys also provide a level of security equivalent to much larger RSA keys:

ECC key size RSA equivalent
160 bits 1024 bits
224 bits 2048 bits
256 bits 3072 bits
384 bits 7680 bits
512 bits 15360 bits
ECC achieves equivalent security with much smaller keys.

You can use either ECDSA or ED25519 keys. ED25519 isn’t as universally implemented yet due to being quite new, so some clients might not support it, but it is the fastest and most secure one.

For both key types, it is recommended to use the biggest size: 521 bits for ECDSA (note that 521 isn’t a typo). ED25519 keys have a fixed size of 256 bits, so no -b option is needed.

When issuing ssh-keygen, use the -o option. This forces the use of the new OpenSSH format (instead of PEM) when saving your private key. It increases resistance to a known brute-force attack. It breaks compatibility with OpenSSH versions older than 6.5, but this version of Ubuntu runs version 8.2, so this isn’t an issue.

More information about SSH key generation is available here: https://www.ssh.com/ssh/keygen/

The steps are:

Create a suitable key pair with:

$ ssh-keygen -o -t ed25519

[or]

$ ssh-keygen -o -t ecdsa -b 521

Copy the public key to the Ubuntu server. It can be done manually, but it is best to use the appropriate tool:

$ ssh-copy-id -i ~/.ssh/<myprivatekey> <user>@<remotehost>

Note that you use the -i flag with your private key, and ssh-copy-id will send the public key for storage on the remote host.
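
For reference, the manual method amounts to appending your public key to ~/.ssh/authorized_keys on the server. A sketch, assuming an ed25519 key at the default path:

$ cat ~/.ssh/id_ed25519.pub | ssh <user>@<remotehost> 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'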

SSH can be configured on the server side to allow only password logins, only key logins, or to require both.

# vim /etc/ssh/sshd_config

“PasswordAuthentication no” allows key-only logins, while “PasswordAuthentication yes” on its own still permits password-only logins. To require both the key and the password, an AuthenticationMethods entry is also needed, as sketched below.
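
A minimal sketch for /etc/ssh/sshd_config that requires both the key and the password:

PasswordAuthentication yes
AuthenticationMethods publickey,password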

We also disable the option that allows root to log in via SSH. The root account is disabled on the image by default, but make sure SSH is configured correctly anyway.

PermitRootLogin no
PasswordAuthentication yes
# systemctl restart sshd

SSH from another terminal with the new user account, and ensure that the access is working.

If it works, delete the old ubuntu account.

# userdel -r ubuntu

Activate and configure the firewall

Set default rules (deny all incoming, allow all outgoing).

# ufw status

# ufw default allow outgoing

# ufw default deny incoming

UFW requires IPv6 to be enabled. It can be made to work with it disabled, but how to achieve that is out of the scope of this post.

# vim /etc/default/ufw

IPV6=yes

Allow SSH.

# ufw allow ssh

[but preferably allow only specific clients:]

# ufw allow proto tcp from <SOURCE> to <SERVER> port 22

And limit the allowed connection attempts to thwart brute force attacks:

# ufw limit ssh

Enable the firewall and check the rules:

# ufw enable

# ufw status

[List rules with numbers]

# ufw status numbered

Remember that if you are using IPv6, you might need to edit rules accordingly.
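
For example, an equivalent IPv6 rule for SSH might look like this (2001:db8::/64 is the IPv6 documentation prefix; substitute your own):

# ufw allow proto tcp from 2001:db8::/64 to any port 22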

Install log2ram

To reduce the number of writes on the USB drive/SD card, you can use the RAM disk utility log2ram.

https://github.com/azlux/log2ram

It will also speed up the Raspberry Pi in exchange for a small amount of RAM.

Install:

$ echo "deb http://packages.azlux.fr/debian/ buster main" | sudo tee /etc/apt/sources.list.d/azlux.list

$ wget -qO - https://azlux.fr/repo.gpg.key | sudo apt-key add -

# apt update

# apt install log2ram

Configure the service. The SIZE entry depends on your system; 256M is a lot for a Pi with only 1GB of RAM.

# vim /etc/log2ram.conf

SIZE=256M
USE_RSYNC=true
MAIL=true
PATH_DISK="/var/log"
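
Note that USE_RSYNC=true relies on the rsync package being installed:

# apt install rsync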

And restart.

# reboot

Check that the service is working:

$ systemctl status log2ram
$ df -h | grep log2ram
log2ram         256M  106M  151M  42% /var/log

Installing additional utilities

Install your choice of applications.

# apt install mosh tmux pydf vim-nox glances iotop

Mosh might require some ports to be opened in the firewall.

The range of ports goes from 60001 to 60999, but if you are expecting few connections, you can make the range smaller.

# ufw allow proto udp from <SOURCE> to <SERVER> port 60001:60010

# ufw limit 60001:60010/udp

Install Cockpit

# apt install -y cockpit
# ufw allow proto tcp from <SOURCE> to <SERVER> port 9090

# ufw limit 9090/tcp

The system can now be reached in a web browser on port 9090:

https://<hostname/IP>:9090

Other customisation

Argon Fan HAT configuration

If you have an Argon fan HAT, you can configure it as follows.

$ curl https://download.argon40.com/argonfanhat.sh -o argonfanhat.sh
$ bash argonfanhat.sh
[...]
Use argonone-config to configure fan
Use argonone-uninstall to uninstall

I have configured mine with the following triggers.

  • 30 ºC -> 0%
  • 60 ºC -> 10%
  • 65 ºC -> 25%
  • 70 ºC -> 55%
  • 75 ºC -> 100%

Aliases

On Ubuntu, and most distros, there will be an entry in ~/.bashrc that will look like this:

if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi

This entry can be added manually if not present. This allows all of the aliases to be grouped in ~/.bash_aliases.

$ vim ~/.bash_aliases
# Show free RAM
alias showfreeram="free -m | sed -n '2 p' | awk '{print \$4}'"

# Release and free up RAM
# alias freeram='free -m && sync && echo 3 | sudo tee /proc/sys/vm/drop_caches && free -m'

# Show temperature
alias temp='cat /sys/class/thermal/thermal_zone0/temp | head -c -4 && echo " C"'

# Show ZFS datasets compress ratios
alias ratio='sudo zfs get all | grep " compressratio "'
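
Reload the file so the aliases are available in the current shell:

$ source ~/.bash_aliases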

This creates a base image with a decent level of security. I will likely add a post on installing Fail2Ban to improve security even further.




Ubuntu: Installing/fixing TP-Link AC1200 (T4UH 1.0) drivers in Ubuntu 20.04 LTS

I wrote an entry about this adapter and Ubuntu 18.04.

This week my 20.04 LTS installation started to freeze randomly. I suspected several things, but through a process of elimination it ended up pointing to the Wi-Fi adapter.

I can’t rule out a hardware issue yet, but the new driver has been very stable and no freezes have happened so far. This started happening after the last Ubuntu upgrade I ran, and to be fair, the Wi-Fi adapter’s DKMS driver I was using was quite dated.

First check the hardware

Unplug and re-plug the adapter. Remember that it will only work on USB 3.0 ports; it won't be recognised on USB 3.1 ports. Check the output of:

$ dmesg

The following commands will also help in showing if the adapter is correctly detected.

$ lsusb
Bus 004 Device 002: ID 4791:205a G-Technology ArmorATD
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 002: ID 2357:0103 TP-Link Archer T4UH wireless Realtek 8812AU
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

The Bus 003 Device 002: ID 2357:0103 entry above is the USB Wi-Fi adapter on my system. You can remove the adapter, issue the command again, and compare the results to identify yours.

For non-USB adapters you can use:

$ lspci

More detailed information about the device can be obtained with the lshw command.

$ lshw -C network
WARNING: you should run this program as super-user.
  *-network                 
       description: Ethernet interface
       product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
       vendor: Realtek Semiconductor Co., Ltd.
       [output truncated]
    *-network:2
       description: Wireless interface
       physical id: 4
       bus info: usb@3:1
       logical name: enx18d6c70fbacc
       serial: 18:d6:c7:a1:22:ab
       capabilities: ethernet physical wireless
       configuration: broadcast=yes driver=rtl8812au ip=192.168.x.2 multicast=yes wireless=IEEE 802.11AC
WARNING: output may be incomplete or inaccurate, you should run this program as super-user.

This last command is particularly useful because it identifies the driver in use.

In this case, the chipset and driver are identified by the string driver=rtl8812au, which we already knew. If yours shows a different driver/adapter, this solution is unlikely to work for you.

Checking the drivers

Now check that the driver module is loaded; look for a name similar to the driver string above.

$ lsmod | grep 8812
8812au                999424  0

If the module isn’t loaded you can manually load it:

# modprobe 8812au

Installing updated drivers

If all of the above seems to work but the Wi-Fi adapter isn’t detected you can install the drivers manually.

The drivers below are newer than the ones provided via apt.

Uninstall the system provided drivers

From the GUI:

  • Go to Software & Updates
  • Select Additional Drivers
  • Find the entry for the Wi-Fi adapter (rtl8812-au) and select Do not use the device

Or from the CLI:

[find the installed driver]

# apt list rtl8812au*

[and uninstall it]

# apt purge rtl8812au-dkms

Install alternative driver

Get the updated drivers from github:

$ git clone https://github.com/gordboy/rtl8812au-5.9.3.2

Move the source code to /usr/src so that DKMS can automatically build the driver when the kernel is updated.

# mv rtl8812au-5.9.3.2/ /usr/src/

Build and install the drivers:

# dkms add -m rtl8812au -v 5.9.3.2
# dkms build -m rtl8812au -v 5.9.3.2
# dkms install -m rtl8812au -v 5.9.3.2

Check that the driver is installed correctly:

# dkms status

Additionally:

[Make sure that the following entry is present in /etc/NetworkManager/NetworkManager.conf]

# vim /etc/NetworkManager/NetworkManager.conf

[device]
wifi.scan-rand-mac-address=no
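
Restart NetworkManager (or reboot) for the change to take effect:

# systemctl restart NetworkManager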

If the driver is recognised you can configure the wireless network as normal. Restart to make sure everything works and remains persistent.

Uninstall

If you ever need to uninstall the driver you can do it with:

# dkms remove -m rtl8812au -v 5.9.3.2 --all

If you edited /etc/modules, you will need to revert the changes. In the previous tutorial for Ubuntu 18.04, the module had to be added manually; that isn't necessary for this version.