Raspberry Pi: Installing, hardening and optimising Ubuntu 20.04 Server

I have been trying to document the process of configuring a Raspberry Pi as a Time Machine Capsule, but the article became far too long. It covered far too much information and was really hard to read.

I then decided to break the stages into more manageable steps. This has the advantage of allowing the common stages, like setting up the OS, to be shared between different projects.

This is the first entry. Others will follow about how to build different things from this base image.

Selecting the OS

The 64-bit beta release of Raspberry Pi OS I tried didn’t let ZFS install easily. Ubuntu has the advantage of being a like-for-like experience regardless of the platform, so it is my preferred choice. Any experience you gain with it will be easily transferable.

You can download Ubuntu Server images from https://ubuntu.com/download/raspberry-pi. The LTS version is also the preferred one.

The Raspberry Pi model will determine the supported versions of the OS.

Model            32-bit Ubuntu   64-bit Ubuntu
Raspberry Pi 2   Supported       Not supported
Raspberry Pi 3   Supported       Recommended
Raspberry Pi 4   Supported       Recommended
Supported Ubuntu versions.

The Raspberry Pi 3 has limited benefits when using the 64-bit image due to its limited RAM. In addition, it won’t support ZFS for the same reason. The Pi will restart/reset when ZFS volumes are accessed due to a lack of RAM.

If you are going to use a GUI, you should choose a Raspberry Pi 4 with at least 4GB of RAM.

The image can be directly installed on a micro SD card:

[GNU ddrescue needs -f (--force) when the output is a device rather than a regular file]

# ddrescue -f -y -c 4Ki ubuntu-20.04.3-preinstalled-server-arm64+raspi.img /dev/sdxx
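Before writing an image it is worth checking that the download isn't corrupted. The sketch below assumes you also downloaded the SHA256SUMS file that Ubuntu publishes next to its images, and that it lists the exact file you downloaded (usually the compressed .img.xz, not the extracted .img). The function name verify_image is made up for this example.

```shell
# Verify a downloaded image against a SHA256SUMS file before writing it.
# Usage: verify_image <image-file> <sums-file>
# Returns 0 only if the image is listed in the sums file and its hash matches.
verify_image() {
    image=$1
    sums=$2
    # Select the line(s) mentioning this file name, then let sha256sum check
    # them from inside the image's directory. --ignore-missing skips entries
    # for files that are not present (for example a sibling .xz file).
    grep -F -- "$(basename "$image")" "$sums" |
        (cd "$(dirname "$image")" && sha256sum --check --status --ignore-missing -)
}

# Example (paths are placeholders):
#   verify_image ~/Downloads/<image-file> ~/Downloads/SHA256SUMS && echo "Image OK"
```

If the function returns a non-zero status, download the image again before writing it to the card.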

Installing Ubuntu Server on a USB stick

It is possible to boot from a USB stick, which is preferable for several reasons. USB sticks are cheaper, easier to access from another system, and simple to replace.

First, enable USB boot on your Pi.

Model                   USB boot support   Notes
Raspberry Pi 1          Not supported      n/a
Raspberry Pi 2 and 3B   Supported          On Raspberry Pi OS, run "echo program_usb_boot_mode=1 | sudo tee -a /boot/config.txt" and reboot.
Raspberry Pi 3B+        Supported          Supported out of the box.
Raspberry Pi 4          Supported          On Raspberry Pi OS, run "rpi-eeprom-config --edit", set BOOT_ORDER=0xf41 and reboot.
Raspberry Pi models with USB boot support.

You might have to boot from an SD card at least once to configure USB boot. Once enabled, it remains activated.
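On a Pi 4 you can confirm the setting by reading the bootloader configuration back. The sketch below is a small helper, with a made-up name (boot_order), that extracts the BOOT_ORDER value from the text that rpi-eeprom-config prints. BOOT_ORDER is read from right to left: in 0xf41, 1 is the SD card, 4 is USB mass storage and f means start again from the beginning.

```shell
# Print the BOOT_ORDER value found in bootloader config text read from stdin.
boot_order() {
    sed -n 's/^BOOT_ORDER=//p'
}

# On a Pi 4 running Raspberry Pi OS you would feed it the live configuration:
#   rpi-eeprom-config | boot_order
# A sample configuration is used here so the snippet runs anywhere.
sample='[all]
BOOT_UART=0
BOOT_ORDER=0xf41'
printf '%s\n' "$sample" | boot_order
```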

Additional information about the different boot modes for the Raspberry Pi

The following links are provided for reference.

Raspberry Pi booting from USB mass storage https://www.raspberrypi.org/documentation/computers/raspberry-pi.html#booting-from-usb-mass-storage

Raspberry Pi 4 bootloader configuration https://www.raspberrypi.org/documentation/computers/raspberry-pi.html#raspberry-pi-4-bootloader-configuration

Raspberry Pi 4 boot flow https://www.raspberrypi.org/documentation/computers/raspberry-pi.html#raspberry-pi-4-boot-flow

Configuration steps

Once the Pi has been configured to boot from a USB device, write the image to a USB stick in the same way as for the SD card.

# ddrescue -f -y -c 4Ki ubuntu-20.04.3-preinstalled-server-arm64+raspi.img /dev/sdxx

For the image to be bootable, you need to make some changes. I extracted the steps from this Raspberry Pi forum post. You might find it easier to apply changes if you mount it on another system.

There are two options to make the changes:

  • Mount the USB stick on another system, and then issue the commands on the USB device. This other system can be the Raspberry Pi itself booting from the SD card, and accessing the USB device.
  • Or make the changes on the SD card, and then copy the SD card image to the USB device.

Apply the following changes.

1) On the boot partition of the USB device, decompress vmlinuz.

$ cd /media/*/system-boot/
$ zcat vmlinuz > vmlinux

2) Update the config.txt file. The pi4 section is shown in this example, but it has also been tested on a Pi 3. Just enter the information for your Pi model.

$ vim config.txt

The dtoverlay line might be optional for headless systems, but if you have the time and inclination, there is some documentation regarding Raspberry Pi’s device tree parameters.

[pi4]
kernel=vmlinux
max_framebuffers=2
dtoverlay=vc4-fkms-v3d
boot_delay
initramfs initrd.img followkernel

3) Create a script in the boot partition called auto_decompress_kernel with the following content:

#!/bin/bash -e

## Set Variables

BTPATH=/boot/firmware
CKPATH=$BTPATH/vmlinuz
DKPATH=$BTPATH/vmlinux

## Check if decompression needs to be done.

if [ -e "$BTPATH/check.md5" ]; then
	if md5sum --status --ignore-missing -c "$BTPATH/check.md5"; then
		echo -e "\e[32mFiles have not changed, decompression not needed\e[0m"
		exit 0
	else
		echo -e "\e[31mHash failed, kernel will be decompressed\e[0m"
	fi
fi

# Back up the old decompressed kernel
# (tested inside "if" so that bash -e doesn't exit before the message is shown)

if ! mv "$DKPATH" "$DKPATH.bak"; then
	echo -e "\e[31mDECOMPRESSED KERNEL BACKUP FAILED!\e[0m"
	exit 1
else
	echo -e "\e[32mDecompressed kernel backup was successful\e[0m"
fi

# Decompress the new kernel
echo "Decompressing kernel: $CKPATH.............."

if ! zcat "$CKPATH" > "$DKPATH"; then
	echo -e "\e[31mKERNEL FAILED TO DECOMPRESS!\e[0m"
	exit 1
else
	echo -e "\e[32mKernel decompressed successfully\e[0m"
fi

# Hash the new kernel for checking
if ! md5sum "$CKPATH" "$DKPATH" > "$BTPATH/check.md5"; then
	echo -e "\e[31mMD5 GENERATION FAILED!\e[0m"
else
	echo -e "\e[32mMD5 generated successfully\e[0m"
fi

# Exit
exit 0

Normally you would need to mark the script as executable, but unless you modify the partition from its FAT32 default, there is no executable flag to set. So leave it as it is.

If you can mount the root filesystem in the system you are using to edit the files, you can go ahead with steps 4 and 5. Otherwise, you should be able to boot now and manually do these steps after your first boot.

4) Create a script in the /etc/apt/apt.conf.d/ directory and call it 999_decompress_rpi_kernel

# cd /media/*/writable/etc/apt/apt.conf.d/
# vi 999_decompress_rpi_kernel

Fill the file with the following content:

DPkg::Post-Invoke {"/bin/bash /boot/firmware/auto_decompress_kernel"; };

5) Make the script executable.

# chmod 744 999_decompress_rpi_kernel

You can save yourself some time and configure the network at this stage.

In my case, I have a static DHCP lease associated with the Pi MAC address, but if you don’t, you can configure the network with a static IP address by editing the network-config file on the boot partition.

$ cd /media/*/system-boot/
$ vim network-config

An example of a static address entry would be:

version: 2
ethernets:
  eth0:
    dhcp4: no
    addresses: [192.168.1.201/24]
    gateway4: 192.168.1.254
    nameservers:
       addresses: [192.168.1.254]

You can now eject the USB drive, insert it into your Raspberry Pi and boot.

Setting up Ubuntu

The default user name and password are ubuntu / ubuntu.

Upon login, you will be asked to change your password. We will delete this user in the following steps to increase security.

Run an update:

$ sudo su
# apt update
# apt upgrade

Setting up users

Create a new user (or change the name of the existing user).

# adduser <newuser>

Extract the groups for the user ubuntu and compare them with the new user.

# id ubuntu ; echo ; id <newuser>

uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),115(netdev),118(lxd)

uid=1001(newuser) gid=1001(newuser) groups=1001(newuser)

Add the new user to the same groups.

# usermod -a -G adm,dialout,cdrom,floppy,sudo,audio,dip,video,plugdev,netdev,lxd <newuser>
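The group list can also be built from the output of id instead of being typed by hand. This is a minimal sketch: the helper name groups_csv is made up, and it assumes the primary group has the same name as the user, as in the id output above.

```shell
# Turn the space-separated group names printed by `id -nG <user>` into the
# comma-separated list that usermod -G expects, leaving out the primary group.
# Usage: groups_csv <primary-group> <group names...>
groups_csv() {
    primary=$1
    shift
    printf '%s\n' "$@" | tr ' ' '\n' | grep -vx -- "$primary" | paste -sd, -
}

# On the Pi (as root) this could be used as:
#   usermod -a -G "$(groups_csv ubuntu $(id -nG ubuntu))" <newuser>
# Sample input so the snippet runs anywhere:
groups_csv ubuntu "ubuntu adm dialout cdrom sudo"
```

Run id on the new user again afterwards to confirm both accounts list the same groups.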

Hostname

Set the hostname of your choice.

# hostnamectl set-hostname <system_name>

[Check the change]

# hostnamectl
   Static hostname: pi-capsule
         Icon name: computer
        Machine ID: db0a1818241a47e178f229294f6864ae
           Boot ID: 983818fbaa8246348066c36f2237636e
  Operating System: Ubuntu 20.04.2 LTS
            Kernel: Linux 5.4.0-1029-raspi
      Architecture: arm64

Date and time

Set the time zone.

# timedatectl set-timezone Europe/London

Configure the time sources by editing /etc/systemd/timesyncd.conf.

[Time]
NTP=uk.pool.ntp.org
FallbackNTP=ntp.ubuntu.com

Restart the service.

# systemctl restart systemd-timesyncd.service

Check the status and check that the time source is correct.

# systemctl status systemd-timesyncd.service

Finally, check that the time zone is correct.

# timedatectl status
               Local time: Sun 2021-08-29 23:24:49 BST
           Universal time: Sun 2021-08-29 22:24:49 UTC
                 RTC time: n/a                        
                Time zone: Europe/London (BST, +0100) 
System clock synchronized: yes                        
              NTP service: active                     
          RTC in local TZ: no

Customising the MOTD

You can get the MOTD from the login screen manually with the following command.

$ for i in /etc/update-motd.d/* ; do if [ "$i" != "/etc/update-motd.d/98-fsck-at-reboot" ]; then $i; fi; done

To get system information (including temperature):

$ /etc/update-motd.d/50-landscape-sysinfo

You can edit, add and reorder scripts in /etc/update-motd.d/.
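As an illustration of adding a script, the sketch below shows what a small MOTD entry that prints the CPU temperature could look like. The file name 60-temperature and the function name format_temp are hypothetical. It reads the same thermal zone file used by the temp alias later in this post.

```shell
#!/bin/sh
# Hypothetical /etc/update-motd.d/60-temperature script.

# Convert the kernel's millidegree reading (e.g. 45678) into "45.6 C".
format_temp() {
    milli=$1
    echo "$(( milli / 1000 )).$(( (milli % 1000) / 100 )) C"
}

sensor=/sys/class/thermal/thermal_zone0/temp
# Only print something when the sensor file exists and is readable.
if [ -r "$sensor" ]; then
    echo "CPU temperature: $(format_temp "$(cat "$sensor")")"
fi
```

Scripts in /etc/update-motd.d/ must be executable, and they run in the order given by the number at the start of the name.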

Configuring SSH

SSH will be enabled by default. Test access with the newly created account.

By default, only the password is required to access the server. We will additionally require an SSH key, and limit access to authorised IP addresses only.

If you haven’t generated a public and private key pair on your system (the one used to log into the Pi), you will need to do it (explained below).

A brief note on encryption. Elliptic curve cryptography (ECC) uses smaller keys and is faster than RSA. A small ECC key provides a level of security that RSA only reaches with a much bigger key:

ECC key size   RSA equivalent
160 bits       1024 bits
224 bits       2048 bits
256 bits       3072 bits
384 bits       7680 bits
512 bits       15360 bits
ECC uses smaller keys for an equivalent level of security.

You can use either ECDSA or ED25519 keys. ED25519 isn’t as universally implemented yet due to being quite new, so some clients might not support it, but it is the fastest and most secure one.

For ECDSA, it is recommended to use the biggest key size, which is 521 bits (note that 521 isn’t a typo). ED25519 keys have a fixed length of 256 bits, so there is no size to choose.
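You can check the type and size of an existing key with ssh-keygen -l, which prints the size in bits, the fingerprint, the comment and the key type in brackets. The helper below, with the made-up name key_summary, reduces that line to the type and size. It assumes that output layout, and the key path in the comment is only an example.

```shell
# Reduce a line of `ssh-keygen -l` output (read from stdin) to "<TYPE> <bits>".
key_summary() {
    # The first field is the size in bits; the last field is "(TYPE)".
    awk '{ gsub(/[()]/, "", $NF); print $NF, $1 }'
}

# Example, using a key path that may differ on your system:
#   ssh-keygen -l -f ~/.ssh/id_ecdsa.pub | key_summary
```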

When issuing ssh-keygen, use the -o option. This forces the use of the new OpenSSH format (instead of PEM) when saving your private key. It increases resistance to a known brute-force attack. It breaks compatibility with OpenSSH versions older than 6.5, but this version of Ubuntu runs version 8.2, so this isn’t an issue.

More information about SSH key generation is available here: https://www.ssh.com/ssh/keygen/

The steps are:

Create a suitable key pair with:

$ ssh-keygen -o -t ed25519

[or]

$ ssh-keygen -o -t ecdsa -b 521

Copy the public key to the Ubuntu server. It can be done manually, but it is best to use the appropriate tool:

$ ssh-copy-id -i ~/.ssh/<myprivatekey> <user>@<remotehost>

Note that you use the -i flag with your private key, and ssh-copy-id will send the public key for storage on the remote host.

SSH can be configured on the server side to allow only password logins, only key logins, or to require both.

# vim /etc/ssh/sshd_config

“PasswordAuthentication no” will only accept the key. “PasswordAuthentication yes” on its own accepts either the key or the password; to require both, also set “AuthenticationMethods publickey,password”. Obviously, requiring both is safer.

We also disable the option to allow root to log in via SSH. The root account is disabled on the image by default, but ensure SSH has been configured correctly anyway.

PermitRootLogin no
PasswordAuthentication yes
AuthenticationMethods publickey,password

# systemctl restart sshd
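A typo in sshd_config can lock you out, so it helps to check the file. sshd -t tests the configuration for errors, and sshd -T prints the effective settings with the option names in lower case. The sketch below is a small filter, with the made-up name sshd_option, for reading one value out of that dump.

```shell
# Print the value of one option from `sshd -T` output read from stdin.
# sshd -T prints option names in lower case, e.g. "permitrootlogin no".
sshd_option() {
    awk -v key="$1" '$1 == key { print $2 }'
}

# On the server, as root:
#   sshd -t && sshd -T | sshd_option permitrootlogin
# Sample input so the snippet runs anywhere:
printf 'permitrootlogin no\npasswordauthentication yes\n' | sshd_option permitrootlogin
```

Keep your current session open until a login from a second terminal has been confirmed to work.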

SSH from another terminal with the new user account, and ensure that the access is working.

If it works, delete the old ubuntu account.

# userdel -r ubuntu

Activate and configure the firewall

Set default rules (deny all incoming, allow all outgoing).

# ufw status

# ufw default allow outgoing

# ufw default deny incoming

UFW requires IPv6 to be enabled. It can be made to work with it disabled, but how to achieve that is out of the scope of this post.

# vim /etc/default/ufw

IPV6=yes

Allow SSH.

# ufw allow ssh

[but preferably allow only specific clients:]

# ufw allow proto tcp from <SOURCE> to <SERVER> port 22

And limit the allowed connection attempts to thwart brute force attacks:

# ufw limit ssh

Enable the firewall and check the rules:

# ufw enable

# ufw status

[List rules with numbers]

# ufw status numbered

Remember that if you are using IPv6, you might need to edit rules accordingly.
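If several clients need SSH access, the per-client rule shown earlier has to be repeated for each one. The sketch below only prints the ufw commands so that they can be reviewed before anything is changed. The function name ssh_rules and the client addresses are made up; the server address is the static one used earlier in this post.

```shell
# Print (not run) one "ufw allow" rule per client address.
# Usage: ssh_rules <server-address> <client-address>...
ssh_rules() {
    server=$1
    shift
    for source in "$@"; do
        echo "ufw allow proto tcp from $source to $server port 22"
    done
}

# Hypothetical clients; review the output first:
ssh_rules 192.168.1.201 192.168.1.10 192.168.1.11
```

Once the list looks right, the output can be piped to a root shell (for example by appending "| sudo sh") to apply the rules.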

Install log2ram

To reduce the number of writes on the USB drive/SD card, you can use the RAM disk utility log2ram.

https://github.com/azlux/log2ram

Not only that, it will speed up the performance of the Raspberry Pi in exchange for a small amount of RAM.

Install:

# echo "deb http://packages.azlux.fr/debian/ buster main" | sudo tee /etc/apt/sources.list.d/azlux.list

# wget -qO - https://azlux.fr/repo.gpg.key | sudo apt-key add -

# apt update

# apt install log2ram

Configure the service. The SIZE entry depends on your system; 256M is a lot for a Pi with only 1GB of RAM.

# vim /etc/log2ram.conf

SIZE=256M
USE_RSYNC=true
MAIL=true
PATH_DISK="/var/log"

And restart.

# reboot

Check that the service is working:

$ systemctl status log2ram
$ df -h | grep log2ram
log2ram         256M  106M  151M  42% /var/log

Installing additional utilities

Install your choice of applications.

# apt install mosh tmux pydf vim-nox glances iotop

Mosh might require some ports to be opened in the firewall.

The range of ports goes from 60001 to 60999, but if you are expecting few connections, you can make the range smaller.

# ufw allow proto udp from <SOURCE> to <SERVER> port 60001:60010

# ufw limit 60001:60010/udp

Install Cockpit

# apt install -y cockpit
# ufw allow proto tcp from <SOURCE> to <SERVER> port 9090

# ufw limit 9090/tcp

The system can now be reached from a web browser on port 9090:

https://<hostname/IP>:9090

Other customisation

Argon Fan HAT configuration

If you have an Argon fan HAT, you can configure it as follows.

$ curl https://download.argon40.com/argonfanhat.sh -o argonfanhat.sh
$ bash argonfanhat.sh
[...]
Use argonone-config to configure fan
Use argonone-uninstall to uninstall

I have configured it with the following triggers.

  • 30 ºC -> 0%
  • 60 ºC -> 10%
  • 65 ºC -> 25%
  • 70 ºC -> 55%
  • 75 ºC -> 100%
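To make the trigger table concrete, the sketch below maps a temperature to the fan speed it would select. It assumes each trigger means "at or above this temperature" and that the highest matching trigger wins. This is an illustration only, not the way the Argon scripts are implemented, and the function name fan_speed is made up.

```shell
# Return the fan speed (in %) for a whole-degree temperature in Celsius,
# using the triggers listed above.
fan_speed() {
    temp=$1
    if   [ "$temp" -ge 75 ]; then echo 100
    elif [ "$temp" -ge 70 ]; then echo 55
    elif [ "$temp" -ge 65 ]; then echo 25
    elif [ "$temp" -ge 60 ]; then echo 10
    else echo 0
    fi
}

fan_speed 62
```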

Aliases

On Ubuntu, and most distros, there will be an entry in ~/.bashrc that will look like this:

if [ -f ~/.bash_aliases ]; then
. ~/.bash_aliases
fi

This entry can be added manually if not present. This allows all of the aliases to be grouped in ~/.bash_aliases.

$ vim ~/.bash_aliases
# Show free RAM
alias showfreeram="free -m | sed -n '2 p' | awk '{print \$4}'"

# Release and free up RAM
# alias freeram='free -m && sync && echo 3 | sudo tee /proc/sys/vm/drop_caches && free -m'

# Show temperature
alias temp='cat /sys/class/thermal/thermal_zone0/temp | head -c -4 && echo " C"'

# Show ZFS datasets compress ratios
alias ratio='sudo zfs get all | grep " compressratio "'

This creates a base image with a decent level of security. I will likely cover adding Fail2Ban in the future to improve security even further.




Ubuntu: Installing/fixing TP-Link AC1200 (T4UH 1.0) drivers in Ubuntu 20.04 LTS

I wrote an entry about this adapter and Ubuntu 18.04.

This week my 20.04 LTS installation started to freeze randomly. I suspected several things, but through a process of elimination it ended up pointing to the Wi-Fi adapter.

I can’t rule out a hardware issue yet, but the new driver has been very stable and no freezes have happened so far. This started happening after the last Ubuntu upgrade I ran, and to be fair, the Wi-Fi adapter’s DKMS driver I was using was quite dated.

First check the hardware

Unplug and re-plug the adapter. Remember that it will only work on USB 3.0 ports, and it won’t be recognised by USB 3.1 ports. Check the output of:

$ dmesg

The following commands will also help in showing if the adapter is correctly detected.

$ lsusb
Bus 004 Device 002: ID 4791:205a G-Technology ArmorATD
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 002: ID 2357:0103 TP-Link Archer T4UH wireless Realtek 8812AU
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

The Bus 003 Device 002: ID 2357:0103 entry above is the one with the USB Wi-Fi adapter on my system. You can remove the adapter and issue the command again and compare results to help you identify yours.

For non-USB adapters you can use:

$ lspci

More detailed information about the device can be obtained with the lshw command.

$ lshw -C network
WARNING: you should run this program as super-user.
  *-network                 
       description: Ethernet interface
       product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
       vendor: Realtek Semiconductor Co., Ltd.
       [output truncated]
    *-network:2
       description: Wireless interface
       physical id: 4
       bus info: usb@3:1
       logical name: enx18d6c70fbacc
       serial: 18:d6:c7:a1:22:ab
       capabilities: ethernet physical wireless
       configuration: broadcast=yes driver=rtl8812au ip=192.168.x.2 multicast=yes wireless=IEEE 802.11AC
WARNING: output may be incomplete or inaccurate, you should run this program as super-user.

This last command is particularly useful because it shows which driver to use.

In this case, the chipset and driver are identified by the string driver=rtl8812au. We already knew this in any case. If yours is a different driver/adapter, this solution is unlikely to work for you.
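The driver names can be pulled out of the lshw output directly. This is a minimal sketch with a made-up helper name (lshw_drivers); it relies only on the driver=... key shown in the configuration line above.

```shell
# Print every driver name found in lshw output read from stdin.
lshw_drivers() {
    grep -o 'driver=[^ ]*' | cut -d= -f2
}

# On a real system:
#   sudo lshw -C network | lshw_drivers
# Sample line taken from the output above:
printf '%s\n' 'configuration: broadcast=yes driver=rtl8812au ip=192.168.x.2 multicast=yes' | lshw_drivers
```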

Checking the drivers

Now check that the driver is loaded. You need to look for a string that is similar to the driver string above.

$ lsmod | grep 8812
8812au                999424  0

If the module isn’t loaded you can manually load it:

# modprobe 8812au

Installing updated drivers

If all of the above seems to work but the Wi-Fi adapter isn’t detected, you can install the drivers manually.

The drivers installed this way are newer than the ones provided via apt.

Uninstall the system provided drivers

From the GUI:

  • Go to Software & Updates
  • Select Additional Drivers
  • Find the entry for the Wi-Fi adapter (rtl8812-au) and select Do not use the device

Or from the CLI:

[find the installed driver]

# apt list rtl8812au*

[and uninstall it]

# apt purge rtl8812au-dkms

Install alternative driver

Get the updated drivers from GitHub:

$ git clone https://github.com/gordboy/rtl8812au-5.9.3.2

Move the source code to /usr/src so that DKMS can automatically build the driver when the kernel is updated.

# mv rtl8812au-5.9.3.2/ /usr/src/

Build and install the drivers:

# dkms add -m rtl8812au -v 5.9.3.2
# dkms build -m rtl8812au -v 5.9.3.2
# dkms install -m rtl8812au -v 5.9.3.2

Check that the driver is installed correctly:

# dkms status
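If you want to check the result from a script, the output of dkms status can be filtered. The sketch below assumes each line names the module and ends with the state "installed"; the exact layout varies between DKMS versions, so treat the pattern as an assumption. The helper name dkms_installed is made up.

```shell
# Succeed if dkms status output (stdin) has a line for the given module
# that reports it as installed.
dkms_installed() {
    grep -- "$1" | grep -q 'installed'
}

# On a real system, as root:
#   dkms status | dkms_installed rtl8812au && echo "driver is installed"
```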

Additionally:

[Make sure that in /etc/NetworkManager/NetworkManager.conf]

# vim /etc/NetworkManager/NetworkManager.conf

[The following entry is inserted]

[device]
wifi.scan-rand-mac-address=no

If the driver is recognised you can configure the wireless network as normal. Restart to make sure everything works and remains persistent.

Uninstall

If you ever need to uninstall the driver you can do it with:

# dkms remove -m rtl8812au -v 5.9.3.2 --all

If you edited /etc/modules you will need to revert the changes. In the previous tutorial for Ubuntu 18.04 the module had to be added manually. It isn’t the case for this version.




VirtualBox/KVM: Reduce VM sizes

There are two utilities that can help discard unused blocks so that VMs can be shrunk.

zerofree finds unused blocks with non-zero content in ext2, ext3 and ext4 filesystems and fills them with zeros. The volume can’t be mounted read-write while it runs, which makes the process of running it a bit convoluted.

fstrim will discard unused blocks on a mounted filesystem. It is best and preferred when working with SSD drives and thinly provisioned storage. It will work with more filesystems, and it won’t hammer your SSD with unnecessary writes.

It is recommended to use fstrim and only use zerofree if unavoidable.

CentOS 7/8

fstrim

# fstrim -va

zerofree (ext2, ext3, ext4)

# yum install epel-release
# yum install zerofree

[Reboot]
Press e on GRUB menu
Go to line that starts with 'linux'
Add init=/bin/bash
Ctrl-X

[Find which disk to trim]
# df
# zerofree -v /dev/mapper/centos_centos7-root

[Shutdown machine]

Zero-filling with dd (xfs)

[zerofree doesn’t support xfs, so there is nothing to install; dd is used instead]

[Reboot]
Press e on GRUB menu
Go to line that starts with 'linux'
Change ro to rw
Add init=/bin/bash
Ctrl-X

[Find the partition/filesystem to trim]
# df

[Fill the filesystem with zeros. This will work with any filesystem but it will write a lot of data on your drives.]
# dd if=/dev/zero of=/tmp/dd bs=$((1024*1024)); rm /tmp/dd
# sync
# exit

[Shutdown machine]

Debian 9/10

fstrim

[Debian 9]
# fstrim -va

[Debian 10]
# fstrim -vA

zerofree

# apt install zerofree

[Reboot]
Press e on GRUB menu
Go to line that starts with 'linux'
Add init=/bin/bash
Ctrl-X

[Find disk to trim]
# df
# zerofree -v /dev/sda1

[Shutdown machine]

Ubuntu 18.04/20.04

[Ubuntu 18.04]
# fstrim -va

[Ubuntu 20.04]
# fstrim -vA

Be aware that if you are using ZFS on Ubuntu (or any other distro) the above commands won’t work. In fact, they will generate a lot of extra writes on the filesystem.

Just ensure that ZFS is using compression, or avoid it in the guest system.

Reducing the image size

Virtualbox

[List all disks]
$ vboxmanage list hdds

[Just the paths]
$ vboxmanage list hdds | grep  'Location.*.vdi' | awk '{$1=""}1'

[Compress one image]
$ vboxmanage modifymedium disk --compact /home/user/Virtualbox/Kali-Linux-2021.1/Kali-Linux-2020.4-vbox-amd64-disk001.vdi

[List all image paths, wrapped in quotes]
$ vboxmanage list hdds | grep  'Location.*.vdi' | awk '{$1=""}1' | sed 's/^ /"/;s/$/"/'

I wish I knew the syntax to automatise compressing all the images with one line. I might revisit it in the future with a script.
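One possible way to compact every registered image in one go is sketched below. It assumes the "Location:" line format printed by vboxmanage list hdds, and that the VMs are powered off. The function names vdi_paths and compact_all are made up for this example.

```shell
# Print the path of every .vdi found in `vboxmanage list hdds` output (stdin).
vdi_paths() {
    sed -n 's/^Location:[[:space:]]*\(.*\.vdi\)$/\1/p'
}

# Compact every registered .vdi image.
# Reading line by line keeps paths that contain spaces intact.
compact_all() {
    vboxmanage list hdds | vdi_paths | while IFS= read -r vdi; do
        echo "Compacting: $vdi"
        vboxmanage modifymedium disk --compact "$vdi"
    done
}

# Uncomment to run it:
# compact_all
```

Run "vboxmanage list hdds | vdi_paths" on its own first to check which images would be processed.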

KVM

# qemu-img convert -O qcow2 originalfile compressedfile

I have a script to do all of the files in one go:

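#!/bin/sh

# All qcow2 images in the current directory
for file_name in *.qcow2
do
	# Skip the unexpanded pattern when there are no images
	[ -e "$file_name" ] || continue
	echo
	echo ==================
	echo "Image: $file_name"
	echo "Old $(qemu-img info "$file_name" | grep 'disk size')"
	mv "$file_name" "$file_name.tmp"
	# Only delete the original once the conversion has succeeded
	if qemu-img convert -O qcow2 "$file_name.tmp" "$file_name"; then
		rm "$file_name.tmp"
	else
		echo "Conversion failed, restoring $file_name"
		mv "$file_name.tmp" "$file_name"
	fi
	echo "New $(qemu-img info "$file_name" | grep 'disk size')"
	echo ==================
done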



Ubuntu 20.04: Virtualbox not running after the last upgrade

When launching a VM in Virtualbox I got an error saying that it can’t be started because a required module isn’t loaded. It suggests loading it manually.

# modprobe vboxdrv

[This outputs an error message]

modprobe: FATAL: Module vboxdrv not found in directory /lib/modules/5.8.0-34-generic

Re-installing Virtualbox also fails because the virtualbox-dkms package can’t be configured.

# apt install virtualbox virtualbox-dkms
[...]
Removing old virtualbox-6.1.10 DKMS files...

------------------------------
Deleting module version: 6.1.10
completely from the DKMS tree.
------------------------------
Done.
Loading new virtualbox-6.1.10 DKMS files...
Building for 5.8.0-34-generic 5.8.0-36-generic
Building initial module for 5.8.0-34-generic
ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/virtualbox-dkms.0.crash'
Error! Bad return status for module build on kernel: 5.8.0-34-generic (x86_64)
Consult /var/lib/dkms/virtualbox/6.1.10/build/make.log for more information.
[..]
E: Sub-process /usr/bin/dpkg returned an error code (1)

The last system update upgraded the kernel from 5.4 to 5.8, and there is something in the new kernel that breaks Virtualbox.

There are two solutions:

Installing Virtualbox from source or downgrading to the previous kernel.

I have chosen the latter as I expect this to be a temporary issue, and a fix to be released soon.

The process to revert is simple.

Reboot and in the GRUB screen select Advanced Options.

Ubuntu 20.04.1 LTS
*Advanced options for Ubuntu 20.04.1 LTS
History for Ubuntu 20.04.1 LTS
UEFI Firmware Settings

Select a trusted 5.4 version to boot from. Most likely the 3rd option in the list. Your exact version numbers might differ from mine.

Ubuntu 20.04.1 LTS, with Linux 5.8.0-34-generic
Ubuntu 20.04.1 LTS, with Linux 5.8.0-34-generic (recovery mode)
*Ubuntu 20.04.1 LTS, with Linux 5.4.0-59-generic
Ubuntu 20.04.1 LTS, with Linux 5.4.0-59-generic (recovery mode)
Ubuntu 20.04.1 LTS, with Linux 5.4.0-54-generic
Ubuntu 20.04.1 LTS, with Linux 5.4.0-54-generic (recovery mode)

After the boot, check that you are running 5.4.

$ uname -r
5.4.0-59-generic

See which versions of 5.8 you have installed in your system.

$ apt list --installed | grep  linux-image

Make a note of the 5.8 versions listed (or use grep again), and remove them manually.

# apt remove linux-image-unsigned-5.8.0-34-generic
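If several 5.8 packages are installed, a small filter saves copying the names by hand. This sketch assumes the "name/suite version arch [status]" line layout printed by apt list. The function name kernel_images is made up.

```shell
# From `apt list --installed` output (stdin), print the names of the
# linux-image packages that belong to one kernel series, e.g. 5.8.
kernel_images() {
    # Keep the package name (the text before "/"), then only kernel images,
    # then only names that contain "-<series>." as a literal string.
    cut -d/ -f1 | grep '^linux-image' | grep -F -- "-$1."
}

# On a real system:
#   apt list --installed 2>/dev/null | kernel_images 5.8
```

Review the list before passing any of the names to apt remove.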

Virtualbox should be working.

Linux 5.8 seems to have been removed for the time being, so if you run any updates you are safe.




Ubuntu: ZFS bpool is full and not running snapshots during apt updates

When running apt to update my system I kept seeing a message saying that bpool had less than 20% space free and that the automatic snapshotting would not run.

What I didn’t realise is that this would apply to the rpool even if it had plenty of free space. They are run together and have to match. Checking the snapshots it seems they had stopped running for several months. Yikes!

You can list the current snapshots in several ways:

[List existing snapshots with their names and creation date.]

$ zsysctl show
Name:           rpool/ROOT/ubuntu_dd5xf4
ZSys:           true
Last Used:      current
History:        
  - Name:       rpool/ROOT/ubuntu_dd5xf4@autozsys_qfi5pz
    Created on: 2021-01-12 23:35:01
  - Name:       rpool/ROOT/ubuntu_dd5xf4@autozsys_1osqbq
    Created on: 2021-01-12 23:33:22

You can also use the zfs commands for the same purpose.

List existing snapshots with default properties information
(name, used, references, mountpoint)

$ zfs list -t snapshot

You can also list the creation date asking for the creation property.

$ zfs list -t snapshot -o name,creation

It should list them in creation order, but if not, you can use the -s option to sort them.

$ zfs list -t snapshot -o name,creation -s creation

Deciding which snapshots to delete will vary. You might want to get rid of the older ones, or maybe the ones that are consuming the most space.

My snapshots were a few months old so there wasn’t much point in keeping them. I deleted all with the following one-liner:

[-H removes headers]
[-o name displays the name of the filesystem]
[-t snapshot displays only snapshots]

# zfs list -H -o name -t snapshot | grep auto | xargs -n1 zfs destroy

I can’t stress enough how important it is that whatever zfs destroy command you issue, especially if doing several automatic iterations, only applies to the snapshots you want to remove.

zfs destroy can delete filesystems, volumes and snapshots. Deleting snapshots isn’t an issue. Deleting a filesystem is a pretty big one.

Please, ensure that the command lists only snapshots you want to remove before running it. You have been warned.
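A safer variant of the one-liner is sketched below. It matches only names that contain "@autozsys_", the prefix seen in the zsysctl output above, and it uses the -n (dry run) and -v (verbose) options of zfs destroy to preview what would be removed. The helper name auto_snapshots is made up.

```shell
# Keep only automatic zsys snapshots from a list of dataset names on stdin.
# Requiring the "@" means a filesystem or volume name can never match.
auto_snapshots() {
    grep '@autozsys_'
}

# Preview only; -n makes zfs destroy a dry run and -v lists what it would do:
#   zfs list -H -o name -t snapshot | auto_snapshots | xargs -n1 zfs destroy -nv
# Once the preview looks right, repeat the command without the -n.
```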




Ubuntu: System freezing for a few seconds with iwlwifi microcode sw error

For a few months now my main system would momentarily freeze or stall (usually about 20-30 seconds) and then continue working. It was something that started after one system update and wasn’t fixed with any further updates.

I opened a bug with Debian without much luck.

The system would notify that one of the CPU cores timed out and for a few moments the computer would stall or freeze before resuming as if nothing had happened.

dmesg was showing timeouts related to iwlwifi:

[ 2313.312941] Timeout waiting for hardware access (CSR_GP_CNTRL 0x0c04000c)
[ 2313.312995] WARNING: CPU: 4 PID: 1424 at drivers/net/wireless/intel/iwlwifi/pcie/trans.c:2066 iwl_trans_pcie_grab_nic_access+0x1f9/0x230 [iwlwifi]

iwlwifi is the kernel driver for several Intel based wireless adapters.

It is possible to install a different version of the driver manually but I don’t like to deviate too much from a standard installation. It can complicate maintenance and troubleshooting in the future.

The issue would happen several times throughout the day. The truth is that with some of the updates it became less frequent, but it was still happening often enough and filling the syslog with errors.

Sep 8 14:02:53 tuxedo kernel: [15317.424052] iwlwifi 0000:08:00.0: Microcode SW error detected. Restarting 0x0.
Sep 8 14:02:53 tuxedo kernel: [15317.467034] WARNING: CPU: 4 PID: 1350 at drivers/net/wireless/intel/iwlwifi/mvm/../iwl-trans.h:1180 iwl_mvm_dump_lmac_error_log+0x51d/0x570 [iwlmvm]
Sep 8 14:02:53 tuxedo kernel: [15317.467096] RIP: 0010:iwl_mvm_dump_lmac_error_log+0x51d/0x570 [iwlmvm]
Sep 8 14:02:53 tuxedo kernel: [15317.467120] iwl_mvm_dump_nic_error_log+0x20/0x70 [iwlmvm]
Sep 8 14:02:53 tuxedo kernel: [15317.467126] iwl_mvm_nic_error+0x35/0x40 [iwlmvm]
Sep 8 14:02:53 tuxedo kernel: [15317.467146] iwl_pcie_irq_handle_error+0xb3/0x110 [iwlwifi]
Sep 8 14:02:53 tuxedo kernel: [15317.467172] iwlwifi 0000:08:00.0: HW error, resetting before reading
Sep 8 14:02:53 tuxedo kernel: [15317.474328] iwlwifi 0000:08:00.0: Start IWL Error Log Dump:
Sep 8 14:02:53 tuxedo kernel: [15317.474407] iwlwifi 0000:08:00.0: Start IWL Error Log Dump:
Sep 8 14:02:53 tuxedo kernel: [15317.474510] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | FSEQ_ERROR_CODE
Sep 8 14:02:53 tuxedo kernel: [15317.476063] iwlwifi 0000:08:00.0: FW error in SYNC CMD STATISTICS_CMD
[...]
Sep 8 16:38:15 tuxedo kernel: [24620.634090] iwlwifi 0000:08:00.0: Microcode SW error detected. Restarting 0x0.
Sep 8 16:38:15 tuxedo kernel: [24620.677672] WARNING: CPU: 4 PID: 1350 at drivers/net/wireless/intel/iwlwifi/mvm/../iwl-trans.h:1180 iwl_mvm_dump_lmac_error_log+0x51d/0x570 [iwlmvm]
Sep 8 16:38:15 tuxedo kernel: [24620.677672] Modules linked in: ccm btrfs xor zstd_compress raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs rfcomm vboxnetadp(OE) vboxnetflt(OE) xfrm_user xfrm_algo vboxdrv(OE) cmac algif_hash algif_skcipher af_alg bnep nls_iso8859_1 snd_hda_codec_hdmi snd_sof_pci snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda snd_sof_intel_byt snd_sof_intel_ipc snd_sof snd_sof_xtensa_dsp snd_hda_ext_core snd_hda_codec_realtek snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec_generic ledtrig_audio snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi mei_hdcp snd_seq_midi_event intel_rapl_msr snd_rawmidi iwlmvm mac80211 x86_pkg_temp_thermal btusb intel_powerclamp btrtl snd_seq uvcvideo btbcm libarc4 btintel kvm_intel snd_seq_device bluetooth videobuf2_vmalloc snd_timer videobuf2_memops kvm videobuf2_v4l2 videobuf2_common videodev ecdh_generic rapl mc ecc intel_cstate input_leds snd iwlwifi joydev
Sep 8 16:38:15 tuxedo kernel: [24620.677771] CPU: 4 PID: 1350 Comm: irq/149-iwlwifi Tainted: P W OEL 5.4.0-47-generic #51-Ubuntu
Sep 8 16:38:15 tuxedo kernel: [24620.677828] iwl_pcie_irq_handle_error+0xb3/0x110 [iwlwifi]
Sep 8 16:38:15 tuxedo kernel: [24620.677834] iwl_pcie_irq_msix_handler+0x180/0x4a0 [iwlwifi]
Sep 8 16:38:15 tuxedo kernel: [24620.677859] iwlwifi 0000:08:00.0: HW error, resetting before reading
Sep 8 16:38:15 tuxedo kernel: [24620.684849] iwlwifi 0000:08:00.0: Start IWL Error Log Dump:
Sep 8 16:38:15 tuxedo kernel: [24620.684853] iwlwifi 0000:08:00.0: Status: 0x00000040, count: -1156803901
Sep 8 16:38:15 tuxedo kernel: [24620.684854] iwlwifi 0000:08:00.0: Loaded firmware version: 46.6bf1df06.0
Sep 8 16:38:15 tuxedo kernel: [24620.684855] iwlwifi 0000:08:00.0: 0x7EEF56BF | ADVANCED_SYSASSERT
Sep 8 16:38:15 tuxedo kernel: [24620.684857] iwlwifi 0000:08:00.0: 0x570D67A4 | trm_hw_status0
Sep 8 16:38:15 tuxedo kernel: [24620.684858] iwlwifi 0000:08:00.0: 0xA73BAA53 | trm_hw_status1
Sep 8 16:38:15 tuxedo kernel: [24620.684858] iwlwifi 0000:08:00.0: 0x514FE288 | branchlink2
Sep 8 16:38:15 tuxedo kernel: [24620.684860] iwlwifi 0000:08:00.0: 0xF788EFED | interruptlink1
Sep 8 16:38:15 tuxedo kernel: [24620.684861] iwlwifi 0000:08:00.0: 0x6A36568B | interruptlink2
Sep 8 16:38:15 tuxedo kernel: [24620.684862] iwlwifi 0000:08:00.0: 0xED77E0BF | data1
Sep 8 16:38:15 tuxedo kernel: [24620.684863] iwlwifi 0000:08:00.0: 0x02150D02 | data2
Sep 8 16:38:15 tuxedo kernel: [24620.684864] iwlwifi 0000:08:00.0: 0xF7226FBF | data3
Sep 8 16:38:15 tuxedo kernel: [24620.684864] iwlwifi 0000:08:00.0: 0xAA60560C | beacon time
Sep 8 16:38:15 tuxedo kernel: [24620.684865] iwlwifi 0000:08:00.0: 0xC6D77EF3 | tsf low
Sep 8 16:38:15 tuxedo kernel: [24620.684867] iwlwifi 0000:08:00.0: 0x6007359C | tsf hi
Sep 8 16:38:15 tuxedo kernel: [24620.684868] iwlwifi 0000:08:00.0: 0x19FB8975 | time gp1
Sep 8 16:38:15 tuxedo kernel: [24620.684869] iwlwifi 0000:08:00.0: 0x007831E0 | time gp2
Sep 8 16:38:15 tuxedo kernel: [24620.684870] iwlwifi 0000:08:00.0: 0x63B3FF9E | uCode revision type
Sep 8 16:38:15 tuxedo kernel: [24620.684871] iwlwifi 0000:08:00.0: 0x050B9CF9 | uCode version major
Sep 8 16:38:15 tuxedo kernel: [24620.684872] iwlwifi 0000:08:00.0: 0xF195AF3A | uCode version minor
Sep 8 16:38:15 tuxedo kernel: [24620.684873] iwlwifi 0000:08:00.0: 0xFC8B56B7 | hw version
Sep 8 16:38:15 tuxedo kernel: [24620.684874] iwlwifi 0000:08:00.0: 0xDF4DF7DF | board version
Sep 8 16:38:15 tuxedo kernel: [24620.684875] iwlwifi 0000:08:00.0: 0xFFFFF764 | hcmd
Sep 8 16:38:15 tuxedo kernel: [24620.684876] iwlwifi 0000:08:00.0: 0x61242F92 | isr0
Sep 8 16:38:15 tuxedo kernel: [24620.684877] iwlwifi 0000:08:00.0: 0x6BF774FF | isr1
Sep 8 16:38:15 tuxedo kernel: [24620.684878] iwlwifi 0000:08:00.0: 0x2BF9DCB6 | isr2
Sep 8 16:38:15 tuxedo kernel: [24620.684879] iwlwifi 0000:08:00.0: 0xFFFFB9ED | isr3
Sep 8 16:38:15 tuxedo kernel: [24620.684880] iwlwifi 0000:08:00.0: 0xE50361AD | isr4
Sep 8 16:38:15 tuxedo kernel: [24620.684881] iwlwifi 0000:08:00.0: 0x7F5B434C | last cmd Id
Sep 8 16:38:15 tuxedo kernel: [24620.684882] iwlwifi 0000:08:00.0: 0x6235DCC3 | wait_event
Sep 8 16:38:15 tuxedo kernel: [24620.684883] iwlwifi 0000:08:00.0: 0xFDBBEBCF | l2p_control
Sep 8 16:38:15 tuxedo kernel: [24620.684884] iwlwifi 0000:08:00.0: 0xB018A732 | l2p_duration
Sep 8 16:38:15 tuxedo kernel: [24620.684885] iwlwifi 0000:08:00.0: 0x637FAB7A | l2p_mhvalid
Sep 8 16:38:15 tuxedo kernel: [24620.684886] iwlwifi 0000:08:00.0: 0xA4283522 | l2p_addr_match
Sep 8 16:38:15 tuxedo kernel: [24620.684887] iwlwifi 0000:08:00.0: 0xF6CBFCCA | lmpm_pmg_sel
Sep 8 16:38:15 tuxedo kernel: [24620.684888] iwlwifi 0000:08:00.0: 0x172A8B18 | timestamp
Sep 8 16:38:15 tuxedo kernel: [24620.684889] iwlwifi 0000:08:00.0: 0xF97EB79D | flow_handler
Sep 8 16:38:15 tuxedo kernel: [24620.684942] iwlwifi 0000:08:00.0: Start IWL Error Log Dump:
Sep 8 16:38:15 tuxedo kernel: [24620.684943] iwlwifi 0000:08:00.0: Status: 0x00000040, count: -179974165
Sep 8 16:38:15 tuxedo kernel: [24620.684944] iwlwifi 0000:08:00.0: 0x587791CE | ADVANCED_SYSASSERT
Sep 8 16:38:15 tuxedo kernel: [24620.684945] iwlwifi 0000:08:00.0: 0xEFE6FFBD | umac branchlink1
Sep 8 16:38:15 tuxedo kernel: [24620.684946] iwlwifi 0000:08:00.0: 0x68924F38 | umac branchlink2
Sep 8 16:38:15 tuxedo kernel: [24620.684947] iwlwifi 0000:08:00.0: 0xADDFFAAA | umac interruptlink1
Sep 8 16:38:15 tuxedo kernel: [24620.684948] iwlwifi 0000:08:00.0: 0x5F1F8880 | umac interruptlink2
Sep 8 16:38:15 tuxedo kernel: [24620.684949] iwlwifi 0000:08:00.0: 0xF9D3AF77 | umac data1
Sep 8 16:38:15 tuxedo kernel: [24620.684950] iwlwifi 0000:08:00.0: 0x1CFAD4EC | umac data2
Sep 8 16:38:15 tuxedo kernel: [24620.684951] iwlwifi 0000:08:00.0: 0x4ED9F44D | umac data3
Sep 8 16:38:15 tuxedo kernel: [24620.684952] iwlwifi 0000:08:00.0: 0x8B07B711 | umac major
Sep 8 16:38:15 tuxedo kernel: [24620.684954] iwlwifi 0000:08:00.0: 0x7BDAEF3C | umac minor
Sep 8 16:38:15 tuxedo kernel: [24620.684955] iwlwifi 0000:08:00.0: 0xA2D50203 | frame pointer
Sep 8 16:38:15 tuxedo kernel: [24620.684956] iwlwifi 0000:08:00.0: 0xF35DCF8D | stack pointer
Sep 8 16:38:15 tuxedo kernel: [24620.684956] iwlwifi 0000:08:00.0: 0xCA5E1060 | last host cmd
Sep 8 16:38:15 tuxedo kernel: [24620.684957] iwlwifi 0000:08:00.0: 0xFBF4D7E7 | isr status reg
Sep 8 16:38:15 tuxedo kernel: [24620.684981] iwlwifi 0000:08:00.0: Fseq Registers:
Sep 8 16:38:15 tuxedo kernel: [24620.685050] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | FSEQ_ERROR_CODE
Sep 8 16:38:15 tuxedo kernel: [24620.685185] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | FSEQ_TOP_INIT_VERSION
Sep 8 16:38:15 tuxedo kernel: [24620.685338] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | FSEQ_CNVIO_INIT_VERSION
Sep 8 16:38:15 tuxedo kernel: [24620.685490] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | FSEQ_OTP_VERSION
Sep 8 16:38:15 tuxedo kernel: [24620.685629] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | FSEQ_TOP_CONTENT_VERSION
Sep 8 16:38:15 tuxedo kernel: [24620.685764] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | FSEQ_ALIVE_TOKEN
Sep 8 16:38:15 tuxedo kernel: [24620.685899] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | FSEQ_CNVI_ID
Sep 8 16:38:15 tuxedo kernel: [24620.686035] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | FSEQ_CNVR_ID
Sep 8 16:38:15 tuxedo kernel: [24620.686170] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | CNVI_AUX_MISC_CHIP
Sep 8 16:38:15 tuxedo kernel: [24620.686305] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | CNVR_AUX_MISC_CHIP
Sep 8 16:38:15 tuxedo kernel: [24620.686455] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | CNVR_SCU_SD_REGS_SD_REG_DIG_DCDC_VTRIM
Sep 8 16:38:15 tuxedo kernel: [24620.686593] iwlwifi 0000:08:00.0: 0xA5A5A5A2 | CNVR_SCU_SD_REGS_SD_REG_ACTIVE_VDIG_MIRROR
Sep 8 16:38:15 tuxedo kernel: [24620.686616] iwlwifi 0000:08:00.0: Collecting data: trigger 2 fired.
Sep 8 16:38:43 tuxedo kernel: [24649.245319] Modules linked in: ccm btrfs xor zstd_compress raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs rfcomm vboxnetadp(OE) vboxnetflt(OE) xfrm_user xfrm_algo vboxdrv(OE) cmac algif_hash algif_skcipher af_alg bnep nls_iso8859_1 snd_hda_codec_hdmi snd_sof_pci snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda snd_sof_intel_byt snd_sof_intel_ipc snd_sof snd_sof_xtensa_dsp snd_hda_ext_core snd_hda_codec_realtek snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec_generic ledtrig_audio snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi mei_hdcp snd_seq_midi_event intel_rapl_msr snd_rawmidi iwlmvm mac80211 x86_pkg_temp_thermal btusb intel_powerclamp btrtl snd_seq uvcvideo btbcm libarc4 btintel kvm_intel snd_seq_device bluetooth videobuf2_vmalloc snd_timer videobuf2_memops kvm videobuf2_v4l2 videobuf2_common videodev ecdh_generic rapl mc ecc intel_cstate input_leds snd iwlwifi joydev
Sep 8 16:38:43 tuxedo kernel: [24649.245403] Workqueue: events iwl_fw_error_dump_wk [iwlwifi]
Sep 8 16:38:43 tuxedo kernel: [24649.245420] iwl_trans_pcie_release_nic_access+0x61/0x70 [iwlwifi]
Sep 8 16:38:43 tuxedo kernel: [24649.245424] iwl_trans_pcie_read_mem+0x94/0xc0 [iwlwifi]
Sep 8 16:38:43 tuxedo kernel: [24649.245429] iwl_fw_dump_mem.isra.0.part.0+0x50/0x90 [iwlwifi]
Sep 8 16:38:43 tuxedo kernel: [24649.245434] iwl_fw_error_dump_file.isra.0+0x438/0xfe0 [iwlwifi]
Sep 8 16:38:43 tuxedo kernel: [24649.245437] iwl_fw_dbg_collect_sync+0xe7/0x310 [iwlwifi]
Sep 8 16:38:43 tuxedo kernel: [24649.245444] iwl_fw_error_dump_wk+0x59/0x80 [iwlwifi]
Sep 8 16:38:44 tuxedo kernel: [24649.716905] iwlwifi 0000:08:00.0: Failing on timeout while stopping DMA channel 8 [0xa5a5a5a2]
Sep 8 16:38:44 tuxedo kernel: [24649.775318] iwlwifi 0000:08:00.0: Applying debug destination EXTERNAL_DRAM
Sep 8 16:38:44 tuxedo kernel: [24649.890159] iwlwifi 0000:08:00.0: Applying debug destination EXTERNAL_DRAM
Sep 8 16:38:44 tuxedo kernel: [24649.954830] iwlwifi 0000:08:00.0: FW already configured (0) - re-configuring
Sep 8 16:38:44 tuxedo kernel: [24649.972183] iwlwifi 0000:08:00.0: BIOS contains WGDS but no WRDS

After some reading, it seems that the Intel wireless drivers have some known issues and that there isn’t much that can be done from that side. It is very likely that even newer drivers and firmware would behave the same.

But ignoring recurring issues is never good practice!

The Debian wiki has a note about how to disable driver options for troubleshooting. On the Arch Linux forum, the user mkdy created a modprobe file to do that after experiencing similar freezes.

I tried his workaround and it also works on my Ubuntu system.

Create /etc/modprobe.d/iwl.conf and add the following content:

options iwlwifi 11n_disable=1 swcrypto=0 bt_coex_active=0 power_save=0
options iwlmvm power_scheme=1 
options iwlwifi d0i3_disable=1 
options iwlwifi uapsd_disable=1 
options iwlwifi lar_disable=1

After rebooting the system the errors and freezes stopped. It could be that not all of the options are needed. If I have time I will experiment and try to determine if one in particular is the one responsible for my freezes.
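
To confirm after the reboot that the options were actually picked up, the kernel exposes the current value of each module parameter under /sys/module/<module>/parameters. The following is a minimal sketch rather than a definitive tool: the function name show_module_params and its optional second argument (an alternative base directory, handy for trying it out) are my own, and parameters that a driver does not export to sysfs simply won’t be listed.

```shell
# Print "name=value" for every parameter a loaded kernel module exposes.
# $1: module name (e.g. iwlwifi)
# $2: base directory (hypothetical extra argument, defaults to /sys/module)
show_module_params() {
    params_dir="${2:-/sys/module}/$1/parameters"
    [ -d "$params_dir" ] || { echo "no parameters found for $1" >&2; return 1; }
    for param in "$params_dir"/*; do
        [ -f "$param" ] || continue
        # Some parameters are only readable by root; skip those silently.
        value=$(cat "$param" 2>/dev/null) || continue
        printf '%s=%s\n' "$(basename "$param")" "$value"
    done
    return 0
}
```

Running show_module_params iwlwifi and show_module_params iwlmvm should then list the values currently in effect.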

Addendum

A few weeks after I applied the patch, and with no more entries in the log, I started to notice lag online. I then saw that my log had some new entries related to iwlwifi and, to top it off, I realised that the above settings made the connection slower.

I don’t know whether this was caused by the latest kernel and system updates, but it could be. The system is on 5.4.0-47, which is currently the latest release on Ubuntu 20.04.

I ended up testing the different options in /etc/modprobe.d/iwl.conf. This entry seems to remove the syslog iwlwifi entries, the random freezes, the lag and the slow connection.

options iwlwifi bt_coex_active=0 swcrypto=0 power_save=0
options iwlmvm power_scheme=1

I am leaving the full set of options above for reference in case you are trying to troubleshoot a similar issue.

I will keep an eye on future updates to make sure it doesn’t break again.
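
One way to keep an eye on it is to count the firmware error messages in the log from time to time. This is only a sketch: the function name is mine, and the two message patterns are simply the ones that appear in the log excerpt above, so other failure messages would not be counted.

```shell
# Count iwlwifi firmware error messages in a log file.
# $1: log file (defaults to /var/log/syslog)
count_iwlwifi_errors() {
    log_file="${1:-/var/log/syslog}"
    [ -r "$log_file" ] || { echo "cannot read $log_file" >&2; return 1; }
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c -E 'iwlwifi .*(Microcode SW error detected|FW error in SYNC CMD)' "$log_file" || true
}
```

A result that stays at 0 after a kernel or firmware update suggests the workaround is still holding.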




Ubuntu/Linux: systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.

On one of my systems, the system log was reporting the following error message every 2-3 minutes:

Sep  3 13:43:57 tux1 systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Sep  3 13:45:34 tux1 systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Sep  3 13:48:58 tux1 systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Sep  3 13:50:34 tux1 systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Sep  3 13:53:56 tux1 systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.

This was caused by a mismatch between the systemd configuration and /etc/resolv.conf.

/etc/resolv.conf should be a symbolic link pointing to the systemd DNS configuration in /run/systemd/resolve/resolv.conf

You can check if this is in place just by listing the file.

$ ls -l /etc/resolv.conf
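
The same check can be scripted. A minimal sketch, assuming readlink -f is available (it is on Ubuntu); the function name resolv_link_ok is mine and the default paths are the ones used in this post:

```shell
# Succeed only if $1 is a symbolic link that resolves to the same file as $2.
# $1: link to check (defaults to /etc/resolv.conf)
# $2: expected target (defaults to /run/systemd/resolve/resolv.conf)
resolv_link_ok() {
    link="${1:-/etc/resolv.conf}"
    expected="${2:-/run/systemd/resolve/resolv.conf}"
    # A regular file (not a symlink) counts as a mismatch.
    [ -L "$link" ] || return 1
    [ "$(readlink -f "$link")" = "$(readlink -f "$expected")" ]
}
```

For example: resolv_link_ok && echo "link is fine" || echo "link needs fixing".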

If it isn’t pointing to the right file (and you are using systemd) you can fix it:

# rm /etc/resolv.conf
# ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf

The errors stopped after this fix. You can confirm it by searching the system log:

$ cat /var/log/syslog | grep -i error | grep -i dns
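
To compare against the time you applied the fix, it can be handier to print only the most recent occurrence of the message. A small sketch (the function name is my own):

```shell
# Print the most recent DVE-2018-0001 message logged by systemd-resolved, if any.
# $1: log file (defaults to /var/log/syslog)
last_dve_message() {
    grep 'systemd-resolved.*DVE-2018-0001' "${1:-/var/log/syslog}" | tail -n 1
}
```

If the timestamp printed by last_dve_message is older than the fix, no new errors have been logged since.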



Ubuntu/GRUB: Error: invalid environment block

After a recent system update I got the following error message:

Error: invalid environment block

Press any key to continue...

Luckily the system would still boot, but ignoring errors isn’t best practice. This error is caused by a faulty GRUB2 environment block, a file located at /boot/grub/grubenv.

You can easily regenerate it with the following commands. It is advisable to make a backup copy of the file first, just in case you need to revert.

# grub-editenv /boot/grub/grubenv create
# grub-editenv /boot/grub/grubenv set default=
# grub-editenv /boot/grub/grubenv list
# update-grub

After rebooting the message should have disappeared.
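
Before rebooting, a quick sanity check of the regenerated file can be sketched as follows. It relies on two things I believe to be true of a block created by grub-editenv, but treat them as assumptions: the file is exactly 1024 bytes long and its first line is "# GRUB Environment Block". The function name is mine.

```shell
# Succeed if $1 looks like a freshly created GRUB environment block:
# exactly 1024 bytes (assumed default size), starting with the GRUB signature line.
# $1: environment block file (defaults to /boot/grub/grubenv)
grubenv_looks_valid() {
    env_file="${1:-/boot/grub/grubenv}"
    [ -f "$env_file" ] || return 1
    [ "$(wc -c < "$env_file")" -eq 1024 ] || return 1
    [ "$(head -n 1 "$env_file")" = "# GRUB Environment Block" ]
}
```

For example: grubenv_looks_valid && echo "grubenv looks sane".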

If you can’t boot from your system drive you can use a Live CD and then mount your system’s boot partition and apply the same commands.

I haven’t tested this part personally, but maybe the commands will help as a reference. Details are scarce on purpose; check what the commands do before running anything.

# mount /dev/sda1 /mnt/boot/efi
or
# mount /dev/sda1 /mnt/boot/

# grub-editenv /mnt/boot/grub/grubenv create
# grub-editenv /mnt/boot/grub/grubenv set default=
# grub-editenv /mnt/boot/grub/grubenv list
# grub-mkconfig -o /mnt/boot/grub/grub.cfg

The other approach, also untested by me, could involve chroot.

# mount /dev/sda2 /target
# mount --bind /dev /target/dev
# mount --bind /dev/pts /target/dev/pts
# mount --bind /sys /target/sys
# mount --bind /proc /target/proc
# mount /dev/sda1 /target/boot

# chroot /target

# grub-editenv /boot/grub/grubenv create
# grub-editenv /boot/grub/grubenv set default=
# grub-editenv /boot/grub/grubenv list
# update-grub



Ubuntu 20.04: Install Ubuntu with ZFS and encryption

Ubuntu 20.04 offers the option of installing with ZFS as the root filesystem. This has lots of advantages. My favourite is being able to revert the system and home partitions (simultaneously or individually) to a previous state through the boot menu.

One major drawback for me is the lack of an option to encrypt the filesystem during the installation.

You have the option to use LUKS and ext4 but there isn’t an encryption option in the installer for ZFS.

Some people have used LUKS and ZFS in the past, but that solution didn’t quite work for me. The tutorials I saw were using LUKS1 instead of LUKS2 and it also felt that the approach was cumbersome now that ZFS on Linux supports native encryption.

The more you deviate from a standard installation the more complicated it will be to do any troubleshooting if anything breaks in the future. Keep it simple.

The ZFS on Linux version included with the 20.04 installer is 0.8.3.

The installation of Ubuntu 20.04 on ZFS will create two pools: bpool and rpool.

bpool contains the boot partition and rpool all the other mountpoints in several datasets.

In a very security-minded world both pools should be encrypted, but I prefer not to encrypt the boot partition. Adding that extra layer of security might make a system recovery that much more difficult or impossible.

The default partitioning during the install creates four partitions and two ZFS pools, using all the storage in the installation disk:

/boot/efi 512MiB EFI System Partition (vfat)
SWAP 2GiB Linux Swap Partition (swap)
bpool 2GiB ZFS/Solaris boot partition (zfs)
rpool all remaining space ZFS/Solaris root partition (zfs)

To encrypt the rpool we will need to edit the installation script.

Steps

  • Click the “Try Ubuntu” button.
  • Open a terminal window.
  • Edit /usr/share/ubiquity/zsys-setup
# vim /usr/share/ubiquity/zsys-setup

This script is responsible for setting up ZFS. We can modify the default options for rpool.

  • Edit the rpool section from this:
# Pools
        # rpool
        zpool create -f \
                -o ashift=12 \
                -O compression=lz4 \
                -O acltype=posixacl \
                -O xattr=sa \
                -O relatime=on \
                -O normalization=formD \
                -O mountpoint=/ \
                -O canmount=off \
                -O dnodesize=auto \
                -O sync=disabled \
                -O mountpoint=/ -R "${target}" rpool "${partrpool}"

to this:

# Pools
        # rpool
        echo PASSWORD | zpool create -f \
                -o ashift=12 \
                -O compression=lz4 \
                -O acltype=posixacl \
                -O xattr=sa \
                -O relatime=on \
                -O normalization=formD \
                -O mountpoint=/ \
                -O canmount=off \
                -O dnodesize=auto \
                -O sync=disabled \
                -O recordsize=1M \
                -O encryption=aes-256-gcm \
                -O keylocation=prompt \
                -O keyformat=passphrase \
                -O mountpoint=/ -R "${target}" rpool "${partrpool}"
  • Replace PASSWORD with the encryption password you want to use. You will be prompted to type this at boot time.
  • Save the changes to the file and exit.
  • Launch the installer:
# ubiquity
  • Install Ubuntu as you would.
    In the storage section:
  • Select “Use entire disk”
  • Select ZFS (Experimental)

The system will be installed with the encryption options set in the script, and on boot it will prompt you for the password you set up.


Some comments on the options for reference:

-o ashift=12
This is the default setting. It tells ZFS that your disk’s block size is 4,096 bytes (2^12 = 4,096). Valid values are:

0 for autodetect sector size
9 for 512 bytes
10 for 1,024 bytes
11 for 2,048
12 for 4,096
13 for 8,192
14 for 16,384
15 for 32,768
16 for 65,536

You can output the physical sector size with lsblk -t, although values of 512 might be simulated. If the drive is an SSD, you should check its specifications.

Alternative ways to retrieve physical sector sizes are:

$ cat /sys/block/sd*/queue/physical_block_size
# hdparm -I /dev/sda | grep Sector

A value of 12 will work just fine, even on drives with 512-byte sectors, which is likely the reason Canonical set it as the default.

Setting ashift too low can have a huge negative impact on performance.
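
Since ashift is just the base-2 logarithm of the sector size, the matching value can be worked out from whatever size the commands above report. A minimal sketch (the function name is mine) that accepts only the sizes listed above, 512 to 65,536 bytes:

```shell
# Print the ashift value for a sector size given in bytes (ashift = log2(size)).
# Fails if the size is not a power of two between 512 and 65536.
ashift_for_sector_size() {
    size=$1
    # Reject empty or non-numeric input.
    case "$size" in ''|*[!0-9]*) return 1 ;; esac
    shift_bits=0
    remaining=$size
    # Halve the size until it reaches 1, counting the steps.
    while [ "$remaining" -gt 1 ]; do
        [ $((remaining % 2)) -eq 0 ] || return 1
        remaining=$((remaining / 2))
        shift_bits=$((shift_bits + 1))
    done
    [ "$shift_bits" -ge 9 ] && [ "$shift_bits" -le 16 ] || return 1
    echo "$shift_bits"
}
```

For example, ashift_for_sector_size "$(cat /sys/block/sda/queue/physical_block_size)" prints the value for the first disk. Keep in mind that a reported 512 may be emulated, so 12 can still be the better choice.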

-O recordsize=1M
Other tutorials suggest adding this option. According to Oracle’s documentation this parameter is used for databases, and I have read that it can also be used for certain types of VMs.

The default value is 128k. You can tune it for your individual use by changing the record size of an existing pool. Any new files created will use the new record size value. You can cp/rm files to force them to be rewritten with the new value.

You can change this value later on with:

# zfs set recordsize=128k rpool

or

# zfs set recordsize=128k rpool/filesystem

-O encryption=aes-256-gcm
AES with key lengths of 128, 192 and 256 bits in CCM and GCM operation modes are supported natively.
0.8.4 comes with a fix that improves performance with AES-GCM and should hopefully be included in an update to Ubuntu soon.

-O keylocation=prompt
Valid options are prompt or file://</absolute/file/path>

prompt will ask you to type the password, in this case during boot.
file:// will point to the location of the decryption key, but on a portable system keeping the key on the disk would defeat the purpose.

-O keyformat=passphrase
Options are raw, hex or passphrase.
When using passphrase the password can be between 8 and 512 bytes in length.
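
Because the installer script is edited by hand, it can be worth checking the length of the passphrase before pasting it in. A small sketch using the 8 to 512 byte limits mentioned above; the function name is my own, and note that the length is counted in bytes, so accented characters count more than once:

```shell
# Succeed if the passphrase is between 8 and 512 bytes long.
# printf '%s' avoids counting a trailing newline; wc -c counts bytes, not characters.
passphrase_length_ok() {
    length=$(printf '%s' "$1" | wc -c)
    [ "$length" -ge 8 ] && [ "$length" -le 512 ]
}
```

For example: passphrase_length_ok "$my_passphrase" || echo "passphrase length out of range", where my_passphrase is a hypothetical variable holding your passphrase.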


Additional information

Reference sites
Debian ZFS site
Ubuntu ZFS reference
FreeBSD ZFS reference

ZFS on Linux website / Admin documentation
ZFS on Linux manpage
OpenZFS System Administration
OpenZFS FAQ

Oracle ZFS Admin guide (not necessarily in line with ZFS on Linux)
Archlinux ZFS wiki
Alpine Linux with root on ZFS with native encryption wiki

Ars Technica intro to ZFS

Interesting articles on ZFS tuning:
Tuning ZFS recordsize (Oracle blog)
ZFS record size (Joyent blog)
OpenZFS performance tuning wiki




Ubuntu: Ubuntu 20.04 installing NVIDIA drivers breaks built-in audio on laptop

On a new laptop I couldn’t get the external HDMI monitor to work with the nouveau drivers, so I installed the NVIDIA drivers (version 440).

The NVIDIA drivers worked perfectly and the external monitor could be configured, but it didn’t take too long to notice that the built-in audio wasn’t working.

Audio would only play through HDMI. Disconnecting the monitor wouldn’t make the built-in audio work again.

No audio from the built-in speakers or headphones on a laptop isn’t good.

Reverting to the nouveau drivers wouldn’t fix the audio. I have to say that I have become a fan of the ZFS rollback feature on 20.04 in a flash. You can revert the system to how it was before any update that borks things. You can try different troubleshooting solutions and go back if needed. Big fan.

So, how to get the audio to work again?

First, find the audio driver being used. There are several ways to do this:

$ inxi -iF
[...]
Audio:
  Device-1: Intel Cannon Lake PCH cAVS driver: snd_hda_intel 
  Device-2: NVIDIA TU106 High Definition Audio driver: snd_hda_intel 
  Sound Server: ALSA v: k5.4.0-28-generic 
[...]
$ lshw -c multimedia
  *-multimedia              
       description: Audio device
       product: TU106 High Definition Audio Controller
       vendor: NVIDIA Corporation
       physical id: 0.1
       bus info: pci@0000:01:00.1
       version: a1
       width: 32 bits
       clock: 33MHz
       capabilities: bus_master cap_list
       configuration: driver=snd_hda_intel latency=0
       resources: irq:17 memory:b4000000-b4003fff
  *-multimedia
       description: Audio device
       product: Cannon Lake PCH cAVS
       vendor: Intel Corporation
       physical id: 1f.3
       bus info: pci@0000:00:1f.3
       version: 10
       width: 64 bits
       clock: 33MHz
       capabilities: bus_master cap_list
       configuration: driver=snd_hda_intel latency=32
       resources: irq:150 memory:b4618000-b461bfff memory:b4200000-b42fffff
$ lspci -v
[...]
00:1f.3 Audio device: Intel Corporation Cannon Lake PCH cAVS (rev 10)
        Subsystem: CLEVO/KAPOK Computer Cannon Lake PCH cAVS
        Flags: bus master, fast devsel, latency 32, IRQ 150
        Memory at b4618000 (64-bit, non-prefetchable) [size=16K]
        Memory at b4200000 (64-bit, non-prefetchable) [size=1M]
        Capabilities: <access denied>
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel, snd_sof_pci
[...]
01:00.1 Audio device: NVIDIA Corporation TU106 High Definition Audio Controller (rev a1)
        Subsystem: CLEVO/KAPOK Computer TU106 High Definition Audio Controller
        Flags: bus master, fast devsel, latency 0, IRQ 17
        Memory at b4000000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
[...]

From the above we can see that the audio driver being used is snd_hda_intel, which is quite common.

Second, find out the audio codecs in use:

$ cat /proc/asound/card0/codec*

The output after installing the NVIDIA drivers that stopped the audio from working shows a lot of UNKNOWN and N/A entries:

Codec: Realtek ALC1220
Address: 0
AFG Function Id: 0x1 (unsol 1)
Vendor Id: 0x10ec1220
Subsystem Id: 0x155896e1
Revision Id: 0x100003
No Modem Function Group found
Default PCM:
N/A
Default Amp-In caps: N/A
Default Amp-Out caps: N/A
State of AFG node 0x01:
  Power: setting=UNKNOWN, actual=UNKNOWN, Error, Clock-stop-OK, Setting-reset
Invalid AFG subtree
Codec: Intel Kabylake HDMI
Address: 2
AFG Function Id: 0x1 (unsol 0)
Vendor Id: 0x8086280b
Subsystem Id: 0x80860101
Revision Id: 0x100000
No Modem Function Group found
Default PCM:
N/A
Default Amp-In caps: N/A
Default Amp-Out caps: N/A
State of AFG node 0x01:
  Power: setting=UNKNOWN, actual=UNKNOWN, Error, Clock-stop-OK, Setting-reset
Invalid AFG subtree

But a normal/working output would be similar to this:

Codec: Realtek ALC1220
Address: 0
AFG Function Id: 0x1 (unsol 1)
Vendor Id: 0x10ec1220
Subsystem Id: 0x155896e1
Revision Id: 0x100003
No Modem Function Group found
Default PCM:
    rates [0x5f0]: 32000 44100 48000 88200 96000 192000
    bits [0xe]: 16 20 24
    formats [0x1]: PCM
Default Amp-In caps: N/A
Default Amp-Out caps: N/A
State of AFG node 0x01:
  Power states: D0 D1 D2 D3 D3cold CLKSTOP EPSS
  Power: setting=D0, actual=D0
GPIO: io=8, o=0, i=0, unsolicited=1, wake=0
  IO[0]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[1]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[2]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[3]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[4]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[5]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[6]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[7]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
Node 0x02 [Audio Output] wcaps 0x41d: Stereo Amp-Out
  Control: name="Line Out Playback Volume", index=0, device=0
 [...]

From all of the above we can determine that the audio driver used is snd_hda_intel and that the codec is Realtek ALC1220.
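
The codec dumps are long, so two small helpers can pull out just the parts that matter here. This is only a sketch: the function names are mine, and the "broken" test simply looks for the two markers seen in the failing output above.

```shell
# Print the codec names found in the given codec dump files.
list_codecs() {
    [ "$#" -gt 0 ] || return 1
    grep -h '^Codec:' "$@" | sed 's/^Codec: *//'
}

# Succeed if any of the given dumps shows the broken state seen above.
codec_dump_broken() {
    [ "$#" -gt 0 ] || return 1
    grep -q -e 'Invalid AFG subtree' -e 'setting=UNKNOWN' "$@"
}
```

For example, list_codecs /proc/asound/card0/codec* prints one codec name per line, and codec_dump_broken /proc/asound/card0/codec* && echo "codec not initialised" flags the failure.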

It is very likely that your driver will be the same, but the codec might vary. If using snd_hda_intel, you can look up which model variant you need by searching for the codec name in this list:

https://www.infradead.org/~mchehab/rst_conversion/sound/hd-audio/models.html

For the ALC1220 the model name to use seems to be dual-codecs.

Edit your ALSA configuration file:

# vim /etc/modprobe.d/alsa-base.conf

and add this to the end of the file:

# Manual entry to allow audio via headphones because NVIDIA drivers break the built-in audio
options snd-hda-intel model=clevo-p950
options snd-hda-intel probe_mask=0x1

I used the wrong model name by mistake. I meant to use dual-codecs but I used the model name just below in the list: clevo-p950. It worked and as it worked I haven’t gone back to edit it.
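
If you prefer to add the lines from the shell rather than in an editor, a small helper can append each line only when it is missing, so running it twice doesn’t leave duplicates. A sketch; the helper name append_line_once is mine:

```shell
# Append a line to a file unless an identical line is already present.
# $1: file, $2: line to add
append_line_once() {
    # Make sure the file exists so grep has something to read.
    touch "$1" || return 1
    # -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}
```

Run as root, for example: append_line_once /etc/modprobe.d/alsa-base.conf 'options snd-hda-intel probe_mask=0x1'.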

After updating your ALSA configuration file, reboot.

Just be more careful than me and choose the model name that matches your system.

After rebooting, the audio from the built-in speakers and headphones was working.

You can change the output being used from your settings or using PulseAudio’s volume control.