Ubuntu 22.04 – Migrating from Firefox snap to Firefox apt

Using snaps might have its advantages, but the amount of RAM and CPU cycles that the Firefox snap seems to take made me want to switch. The apt version certainly feels more responsive.

Remove the Firefox snap.

# snap remove firefox

If you don't have an APT keyring directory, create one. Then import the Mozilla APT repository signing key and add the repository to your sources.

# install -d -m 0755 /etc/apt/keyrings

$ wget -q https://packages.mozilla.org/apt/repo-signing-key.gpg -O- | sudo tee /etc/apt/keyrings/packages.mozilla.org.asc > /dev/null

$ echo "deb [signed-by=/etc/apt/keyrings/packages.mozilla.org.asc] https://packages.mozilla.org/apt mozilla main" | sudo tee -a /etc/apt/sources.list.d/mozilla.list > /dev/null

You need to raise the apt priority of the Mozilla repository to avoid the snap version being re-installed.

$ echo '
Package: *
Pin: origin packages.mozilla.org
Pin-Priority: 1000
' | sudo tee /etc/apt/preferences.d/mozilla

And install the Firefox apt.

# apt update && apt install firefox
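
To confirm that the pin took effect, the candidate version reported by apt can be inspected. The sketch below assumes that `apt-cache policy firefox` prints a `Candidate:` line and that Ubuntu's transitional snap package carries `snap` in its version string (for example `1:1snap1-0ubuntu2`). The helper name `candidate_is_snap_stub` is hypothetical.

```shell
# Hypothetical helper: reads `apt-cache policy firefox` output on stdin and
# succeeds (exit 0) if the candidate version looks like Ubuntu's snap stub.
# Assumption: the snap stub has "snap" in its version string.
candidate_is_snap_stub() {
    awk '$1 == "Candidate:" { v = $2 } END { exit (v ~ /snap/) ? 0 : 1 }'
}

# Usage (assumes apt-cache is available):
#   if apt-cache policy firefox | candidate_is_snap_stub; then
#       echo "apt would still pick the snap stub; check the pin file"
#   fi
```

If the helper succeeds, the file in /etc/apt/preferences.d is probably not being applied and is worth re-checking.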

[Install a localised version if you need it or want it]

# apt install firefox-l10n-en-gb

When you launch Firefox it will now be the apt version. Remember to move/copy your profiles from the snap to the new version.

$ mkdir -p ~/.mozilla/firefox
$ cp -a ~/snap/firefox/common/.mozilla/firefox/* ~/.mozilla/firefox/

Finally, launch Firefox and point to the right profile (and delete any you don’t want to keep).

This is best done using the Profile Manager (about:profiles):

https://support.mozilla.org/en-US/kb/profile-manager-create-remove-switch-firefox-profiles
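
To see which profile directories were copied before opening the Profile Manager, the `Path=` entries in profiles.ini can be listed. This is only a minimal sketch: `list_profile_paths` is a hypothetical helper, and it assumes the usual `Path=` lines in that file.

```shell
# Hypothetical helper: print the profile directories named in a profiles.ini file.
list_profile_paths() {
    sed -n 's/^Path=//p' "$1"
}

# Usage:
#   list_profile_paths ~/.mozilla/firefox/profiles.ini
```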

ZFS: Setting up ZFS storage on Ubuntu

If you are new to ZFS, I would advise doing a little bit of research first to understand the fundamentals. Jim Salter’s articles on storage and ZFS are highly recommended.

https://arstechnica.com/information-technology/2020/05/zfs-101-understanding-zfs-storage-and-performance/

The examples below are to create a pool from a single disk, with separate datasets used for network backups.

In some examples, I might use device names for simplicity, but you are advised to use disk IDs or serials.

Installing ZFS

Ubuntu makes it very easy.

# apt install zfsutils-linux

ZFS Cockpit module

If Cockpit is installed, it is possible to install a module for ZFS. This module is sadly no longer in development. If you know of alternatives, please share!

$ git clone https://github.com/optimans/cockpit-zfs-manager.git
[...]
# cp -r cockpit-zfs-manager/zfs /usr/share/cockpit

Configuring automatic snapshots

This package creates automatic snapshots at regular intervals (hourly, daily, weekly and monthly), and the retention for each interval can be configured.

# apt install zfs-auto-snapshot

The snapshot retention is set in the following files:

/etc/cron.hourly/zfs-auto-snapshot
/etc/cron.daily/zfs-auto-snapshot
/etc/cron.weekly/zfs-auto-snapshot
/etc/cron.monthly/zfs-auto-snapshot

By default, the configuration runs the following snapshots and retention policies:

Period	Retention
Hourly	24 hours
Daily	31 days
Weekly	Eight weeks
Monthly	12 months

I configured the following snapshot retention policy:

Period	Retention
Hourly	48 hours
Daily	14 days
Weekly	Four weeks
Monthly	Three months

Hourly

# vim /etc/cron.hourly/zfs-auto-snapshot

#!/bin/sh

# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0

exec zfs-auto-snapshot --quiet --syslog --label=hourly --keep=48 //

Daily

# vim /etc/cron.daily/zfs-auto-snapshot

#!/bin/sh

# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0

exec zfs-auto-snapshot --quiet --syslog --label=daily --keep=14 //

Weekly

# vim /etc/cron.weekly/zfs-auto-snapshot

#!/bin/sh

# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0

exec zfs-auto-snapshot --quiet --syslog --label=weekly --keep=4 //

Monthly

# vim /etc/cron.monthly/zfs-auto-snapshot

#!/bin/sh

# Only call zfs-auto-snapshot if it's available
which zfs-auto-snapshot > /dev/null || exit 0

exec zfs-auto-snapshot --quiet --syslog --label=monthly --keep=3 //
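
Once the cron jobs have run for a while, it is useful to check that the number of snapshots per label matches the --keep values. The sketch below is a hypothetical helper that counts snapshots per label. It assumes the snapshot names follow the pattern dataset@zfs-auto-snap_LABEL-DATE, which is what I would expect from zfs-auto-snapshot but is worth verifying on your system.

```shell
# Hypothetical helper: reads snapshot names on stdin (one per line) and prints
# "label=count" for each zfs-auto-snapshot label found.
# Assumption: names look like pool/dataset@zfs-auto-snap_hourly-2024-01-01-0017
count_auto_snapshots() {
    sed -n 's/.*@zfs-auto-snap_\([a-z]*\)-.*/\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $2 "=" $1 }'
}

# Usage (as root, on a system with ZFS):
#   zfs list -H -t snapshot -o name | count_auto_snapshots
```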

Setting up the ZFS pool

This post has several use cases and examples, and I recommend it highly if you want further details on different commands and ways to configure your pools.

https://www.thegeekdiary.com/zfs-tutorials-creating-zfs-pools-and-file-systems/

In my example there is no resilience, as there is only one attached disk. For me, this is acceptable because I have an additional local backup besides this filesystem.

It is preferable to have a second backup (ideally off-site) than a single one regardless of any added resilience you might set.

I create a single pool with an external drive. Read below for an explanation of the different command flags.

# zpool create -f \
    -o ashift=12 \
    -O compression=lz4 \
    -O acltype=posixacl \
    -O xattr=sa \
    -O relatime=on \
    -O atime=off \
    -O normalization=formD \
    -O canmount=off \
    -O dnodesize=auto \
    -O sync=standard \
    backup_pool scsi-SSeagate_Desktop_NA7HP4VK

Block size / ashift

Of the above values, the most important one by far is ashift.

The ashift property sets the sector size that ZFS uses for the vdev. It can’t be changed once set, and if it isn’t correct, it will cause massive performance issues with the filesystem.

Find out your drive’s optimal block size and match it to ashift.

It is set as a power-of-two exponent: the sector size is 2^ashift bytes.

ashift	sector size
9	512 bytes
10	1 KiB
11	2 KiB
12	4 KiB
13	8 KiB
14	16 KiB
15	32 KiB
16	64 KiB
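
The relationship between sector size and ashift can be worked out with a few lines of shell. This is a minimal sketch: `ashift_for` is a hypothetical helper, and it assumes the sector size passed in is a power of two expressed in bytes.

```shell
# Hypothetical helper: print the ashift value (log2 of the sector size in bytes).
# Assumption: the argument is a power of two, e.g. 512 or 4096.
ashift_for() {
    size=$1
    exponent=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        exponent=$((exponent + 1))
    done
    echo "$exponent"
}

# Usage (the sysfs path assumes the disk is sda):
#   ashift_for "$(cat /sys/block/sda/queue/physical_block_size)"
```

Bear in mind that some drives report 512-byte sectors while using 4 KiB sectors internally, so the reported value is a starting point rather than a guarantee.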

recordsize is another property that affects performance, especially on the Raspberry Pi. Smaller values can improve performance for random access, while larger values give better performance and compression when reading sequential data. On the Raspberry Pi, however, a value of 1M increased the system load until filesystem activity eventually stopped and the system had to be restarted.

The default value (128k) has performed without any noticeable issue.

Compression

lz4 compression offers the best balance between performance and compression ratio. In most cases the storage will even perform faster than with no compression, because less data has to be read from and written to the disk.

ZFS 0.8 doesn’t give many choices regarding compression but bear in mind that you can change the algorithm on a live system.

gzip will impact performance but yields a higher compression rate. It might be worth checking the performance with different compression formats on the Pi 4. With older Raspberry Pi models, the limitation will be the USB / network in most cases.

For reference, on the same amount of data these were the compression ratios I obtained:

gzip-7
backup_pool 1.34x
backup_pool/backintime 1.35x
backup_pool/timecapsule 1.33x

lz4
backup_pool 1.27x
backup_pool/backintime 1.30x
backup_pool/timecapsule 1.33x

All in all, the performance impact and memory consumption didn’t make switching from lz4 worthwhile.
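
To put those ratios in perspective, a compression ratio can be converted into the percentage of space saved with 1 - 1/ratio. The helper below is a hypothetical sketch that uses awk for the floating-point arithmetic.

```shell
# Hypothetical helper: convert a compression ratio (e.g. 1.34) into the
# percentage of space saved, rounded to one decimal place.
savings_pct() {
    awk -v ratio="$1" 'BEGIN { printf "%.1f\n", (1 - 1 / ratio) * 100 }'
}

# Usage:
#   savings_pct 1.34   # gzip-7 on backup_pool
#   savings_pct 1.27   # lz4 on backup_pool
```

For the pool-level figures above this works out to roughly 25% saved with gzip-7 against roughly 21% with lz4, which is a small gain for the extra CPU cost.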

Permissions

acltype=posixacl
xattr=sa

These properties enable POSIX ACLs and store Linux extended attributes in the inodes rather than in separate hidden files.

Access times

Disabling atime (off) is recommended to reduce the number of IOPS.

relatime offers a good compromise between the atime and noatime behaviours, but it only takes effect when atime is on. With atime=off it is ignored.

Normalisation

The normalization property indicates whether a file system should perform a Unicode normalisation of file names whenever two file names are compared and which normalisation algorithm should be used.

formD is the default set by Canonical when setting up a pool. It seems to be a good choice if sharing the volume via NFS with macOS systems, as it avoids files not being displayed because their names use non-ASCII characters.

Additional properties

The pool is configured with the canmount property off so that it can’t be mounted.

This is because I will be creating separate datasets, one for Time Capsule backups, and another two for Backintime, and I don’t want them to mix.

All datasets will share the same pool, but I don’t want the pool root to be mounted. Only datasets will mount.

dnodesize is set to auto, as per several recommendations when datasets are using the xattr=sa property.

sync is set as standard. There is a performance hit for writes, but disabling it comes at the expense of data consistency if there is a power cut or similar.

A brief test showed a lower system load when sync=standard than with sync=disabled. Also, with standard there were fewer spikes. It is likely that the performance is lower, but it certainly causes the system to suffer less.

Encryption

I am not too keen to encrypt physically secure volumes because when doing data recovery, you are adding an additional layer that might hamper and slow things down.

For reference, I am writing down an example of encryption options using an external key for a volume. This might not be appropriate for your particular scenario. Research alternatives if needed.

-O encryption=aes-256-gcm 
-O keylocation=file:///etc/pool_encryption_key 
-O keyformat=raw

Pool options

Automatic trimming of the pool is essential for SSDs:

# zpool set autotrim=on backup_pool

Disabling automatic mount for the pool. (This applies only to the root of the pool, the datasets can still be set to be mountable regardless of this setting.)

# zfs set canmount=off backup_pool

Setting up the ZFS datasets

I will create three separate datasets with assigned quotas for each.

[Create datasets]
# zfs create backup_pool/backintime_tuxedo
# zfs create backup_pool/backintime_ab350
# zfs create backup_pool/timecapsule

[Set mountpoints]
# zfs set mountpoint=/backups/backintime_tuxedo  backup_pool/backintime_tuxedo
# zfs set mountpoint=/backups/backintime_ab350  backup_pool/backintime_ab350
# zfs set mountpoint=/backups/timecapsule  backup_pool/timecapsule

[Set quotas]
# zfs set quota=2T backup_pool/backintime_tuxedo
# zfs set quota=2T backup_pool/backintime_ab350
# zfs set quota=2T backup_pool/timecapsule
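
With more than a handful of datasets, the three steps above become repetitive. The sketch below prints (rather than runs) one `zfs create` command per dataset, setting the mountpoint and quota at creation time with -o. `plan_datasets` is a hypothetical helper, and the /backups prefix is taken from the mountpoints used above.

```shell
# Hypothetical dry-run helper: print a zfs create command for each dataset name.
# Arguments: pool name, quota, then one or more dataset names.
plan_datasets() {
    pool=$1
    quota=$2
    shift 2
    for name in "$@"; do
        echo "zfs create -o mountpoint=/backups/$name -o quota=$quota $pool/$name"
    done
}

# Usage: review the output first, then run the printed commands as root.
#   plan_datasets backup_pool 2T backintime_tuxedo backintime_ab350 timecapsule
```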

Changing compression on a dataset

The default lz4 compression is recommended. gzip consumes a lot of CPU and makes data transfers slower, impacting backup restoration.

If you still want to change the compression for a given dataset:

# zfs set compression=gzip-7 backup_pool/timecapsule

A comparison of compression and decompression using different algorithms with OpenZFS:

https://github.com/openzfs/zfs/pull/9735

Querying pool properties, current compression algorithm and compression ratio

# zfs get all backup_pool
# zfs get compression backup_pool
# zfs get compressratio backup_pool
# zfs get all | grep compressratio

Changing ZFS settings

For reference, below are some examples of properties and settings that can be changed after a pool has already been created.

Renaming pools and datasets

If, for any reason, a dataset was given a name that needs to be changed, this can be done with a command like this:

# zfs rename backup_pool/Test1 backup_pool/backintime_tuxedo

A zpool can be renamed by exporting and importing it.

# zpool export test_pool
# zpool import test_pool backup_pool

Attaching mirror disks

You can attach an additional disk/partition to make the pool redundant in a mirror configuration. Unfortunately, this doesn’t work for converting it to RAID-Z, RAID-Z2 or RAID-Z3.

# zpool attach backup_pool /dev/sda7 /dev/sdb7

Renaming disks in pools

By default, Ubuntu uses device names (such as /dev/sda) for the disks. This should not be an issue, but in some cases, adding or connecting drives might change the device name order and degrade one or more pools.

This is why creating a pool with disk IDs or serials is recommended. You can still fix this if you created your pool using device names.

With the pool unmounted, export it, and reimport pointing to the right path:

# zpool export backup_pool
# zpool import -d /dev/disk/by-id/ backup_pool

There are additional examples in this handy blog post:

https://plantroon.com/changing-disk-identifiers-in-zpool/

ZFS optimisation

ZFS should be running on a system with at least 4GiB of RAM. If you plan to use it on a Raspberry Pi (or any other system with limited resources), reduce the ARC size.

In this case, I am limiting it to 3GiB. It is a change that can be done live:

# echo 3221225472 > /sys/module/zfs/parameters/zfs_arc_max
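
The value is expressed in bytes, so 3GiB is 3 × 1024 × 1024 × 1024 = 3221225472. A small helper avoids mistakes when trying other sizes. `gib_to_bytes` is a hypothetical name, and the sketch assumes a shell with 64-bit arithmetic.

```shell
# Hypothetical helper: convert a whole number of GiB into bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Usage (as root):
#   gib_to_bytes 3 > /sys/module/zfs/parameters/zfs_arc_max
```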

To make it persistent between boots:

# vim /etc/modprobe.d/zfs.conf

[add this line]
options zfs zfs_arc_max=3221225472

# update-initramfs -u

You can check the ARC statistics:

$ less /proc/spl/kstat/zfs/arcstats
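
The raw file is long, so it can help to pull out just a few values. The sketch below is a hypothetical helper that reports the current ARC size, the configured maximum and the hit percentage. It assumes the file uses three whitespace-separated columns (name, type, data) and contains rows named `size`, `c_max`, `hits` and `misses`.

```shell
# Hypothetical helper: summarise an arcstats file given as the first argument.
# Assumption: rows have the form "name type data".
arc_summary() {
    awk '
        $1 == "hits"   { hits = $3 }
        $1 == "misses" { misses = $3 }
        $1 == "size"   { size = $3 }
        $1 == "c_max"  { max = $3 }
        END {
            total = hits + misses
            pct = total ? 100 * hits / total : 0
            printf "size_bytes=%s max_bytes=%s hit_pct=%.1f\n", size, max, pct
        }
    ' "$1"
}

# Usage:
#   arc_summary /proc/spl/kstat/zfs/arcstats
```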

More on ZFS performance

Some other links with interesting points on performance:

https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Workload%20Tuning.html

https://icesquare.com/wordpress/how-to-improve-zfs-performance/

Linux: Initiating a CPU backtrace

Sometimes a process is taking a lot of CPU time and it isn’t clear what the cause is.

For example, at times it is common to see the kworker process consuming a lot of CPU. kworker is a placeholder process for kernel worker threads. These threads perform most of the actual processing for the kernel and you might want to see what device is involved.

You can run a CPU backtrace in Linux that records in dmesg what each one of the CPUs in the system is doing. This can be very useful to determine what specific process is hogging the CPU and in some cases to which module/driver it is related.

This is done using the magic SysRq key. This is a key combination that allows you to communicate directly with the kernel and perform several low level commands regardless of the state of the system. An exception would be a kernel panic, for obvious reasons.

When the magic SysRq key is enabled, you can use the key combination Alt+SysRq+command key. There are many commands, enough to deserve a future article of their own. The Wikipedia article explains some of them.

Because the magic SysRq key provides direct access to the kernel, with deep security implications, it is disabled by default in most distros, if not all.

When the magic SysRq key is disabled and a backtrace is requested, dmesg will not display any backtrace. You can temporarily activate the magic SysRq key with:

# sysctl -w kernel.sysrq=1

or

# echo 1 > /proc/sys/kernel/sysrq

Be aware that this won’t persist between reboots. To turn it off, use:

# sysctl -w kernel.sysrq=0

or

# echo 0 > /proc/sys/kernel/sysrq
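
Besides 0 and 1, kernel.sysrq can hold a bitmask that enables only some groups of commands, which is why a system might report a value such as 176. The sketch below checks whether a given value permits debugging dumps. It assumes, based on my reading of the kernel documentation, that bit value 8 covers the dump commands and that the CPU backtrace belongs to that group. `sysrq_allows_dump` is a hypothetical helper.

```shell
# Hypothetical helper: succeed (exit 0) if the given kernel.sysrq value allows
# debugging dumps. Assumption: 1 enables everything and bit value 8 enables dumps.
sysrq_allows_dump() {
    value=$1
    [ "$value" -eq 1 ] || [ $((value & 8)) -ne 0 ]
}

# Usage:
#   if sysrq_allows_dump "$(cat /proc/sys/kernel/sysrq)"; then
#       echo "dump commands are permitted"
#   fi
```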

You can generate the backtrace with the Alt-SysRq-L key combination. If you need to script the backtrace or are accessing the system remotely, there is a CLI alternative to do the same:

# echo l > /proc/sysrq-trigger

And you can then check the results with:

$ dmesg

[ 3966.375451] Call Trace:
[ 3966.375463]  dump_stack+0x63/0x8b
[ 3966.375468]  nmi_cpu_backtrace+0x94/0xa0
[ 3966.375473]  ? lapic_can_unplug_cpu+0xb0/0xb0
[ 3966.375478]  nmi_trigger_cpumask_backtrace+0xe6/0x130
[ 3966.375482]  arch_trigger_cpumask_backtrace+0x19/0x20
[ 3966.375487]  sysrq_handle_showallcpus+0x17/0x20
[ 3966.375491]  __handle_sysrq+0x9f/0x170
[ 3966.375495]  write_sysrq_trigger+0x34/0x40
[ 3966.375500]  proc_reg_write+0x45/0x70
[ 3966.375504]  __vfs_write+0x1b/0x40
[ 3966.375508]  vfs_write+0xb1/0x1a0
[ 3966.375511]  SyS_write+0x55/0xc0
[ 3966.375517]  do_syscall_64+0x73/0x130
[ 3966.375521]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 3966.375525] RIP: 0033:0x7feca7544154
[ 3966.375528] RSP: 002b:00007ffedf5764b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3966.375532] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007feca7544154
[ 3966.375535] RDX: 0000000000000002 RSI: 000055c1c5642320 RDI: 0000000000000001
[ 3966.375537] RBP: 000055c1c5642320 R08: 000000000000000a R09: 0000000000000001
[ 3966.375539] R10: 000000000000000a R11: 0000000000000246 R12: 00007feca7820760
[ 3966.375542] R13: 0000000000000002 R14: 00007feca781c2a0 R15: 00007feca781b760
[ 3966.375547] Sending NMI from CPU 5 to CPUs 0-4,6-15:
[ 3966.375569] NMI backtrace for cpu 13 skipped: idling at acpi_idle_do_entry+0x19/0x40
[ 3966.375574] NMI backtrace for cpu 12 skipped: idling at acpi_idle_do_entry+0x19/0x40
[...]

The above example is from an idle system, but if the system were busier it would display more activity from other CPUs, system calls and what driver/module is involved.

If the same module, driver or hardware device keeps showing up in the backtraces, you should check it as a possible source of the high CPU utilisation.
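
On systems with many cores the output gets long, so it can help to list only the CPUs that were actually doing something. The sketch below is a hypothetical helper that filters out the "skipped: idling" lines shown in the example above and prints the remaining CPU numbers. It assumes busy CPUs are reported with a line containing "NMI backtrace for cpu N".

```shell
# Hypothetical helper: read dmesg output on stdin and print the numbers of the
# CPUs whose backtrace was not skipped for being idle.
busy_cpus() {
    grep 'NMI backtrace for cpu [0-9]' \
        | grep -v 'skipped: idling' \
        | sed 's/.*NMI backtrace for cpu \([0-9]*\).*/\1/' \
        | sort -un
}

# Usage:
#   dmesg | busy_cpus
```

The CPU that triggered the backtrace (CPU 5 in the example) prints its trace before the NMI is sent, so it may not appear in this list.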