Ubuntu: ZFS bpool is full and not running snapshots during apt updates
When running apt to update my system I kept seeing a message saying that bpool had less than 20% space free and that the automatic snapshotting would not run.
What I didn’t realise is that this also blocks the snapshots of rpool, even when rpool has plenty of free space. The two pools are snapshotted together and have to match. When I checked the snapshots, they had not been taken for several months. Yikes!
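To see how close each pool is to that limit, the capacity column of zpool list can be checked. The following is only a sketch: the 20% threshold comes from the apt message, and the helper name pools_below_free is made up for illustration.

```shell
# pools_below_free MIN_FREE
# Reads "name<TAB>capacity" lines on stdin, as printed by
#   zpool list -H -o name,capacity
# and prints every pool with less than MIN_FREE percent free.
# (Hypothetical helper, not part of ZFS or zsys.)
pools_below_free() (
    min_free=${1:-20}
    while read -r name cap; do
        used=${cap%\%}              # strip the trailing % sign
        free=$((100 - used))
        if [ "$free" -lt "$min_free" ]; then
            echo "$name: ${free}% free"
        fi
    done
)

# Usage on a live system:
#   zpool list -H -o name,capacity | pools_below_free 20
```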
You can list the current snapshots in several ways:
[List existing snapshots with their names and creation date.]
$ zsysctl show
Name: rpool/ROOT/ubuntu_dd5xf4
ZSys: true
Last Used: current
History:
- Name: rpool/ROOT/ubuntu_dd5xf4@autozsys_qfi5pz
Created on: 2021-01-12 23:35:01
- Name: rpool/ROOT/ubuntu_dd5xf4@autozsys_1osqbq
Created on: 2021-01-12 23:33:22
You can also use the zfs commands for the same purpose.
List existing snapshots with default properties information
(name, used, references, mountpoint)
$ zfs list -t snapshot
You can also show the creation date by requesting the creation property.
$ zfs list -t snapshot -o name,creation
It should list them in creation order, but if not, you can use the -s option to sort them.
$ zfs list -t snapshot -o name,creation -s creation
Deciding which snapshots to delete will vary. You might want to get rid of the older ones, or maybe the ones that are consuming the most space.
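If space is the deciding factor, the used property can be listed and sorted. A small sketch, assuming the parsable output of zfs list -Hp (tab-separated, sizes in bytes); the helper name largest_snapshots is hypothetical. Note that used only counts the space unique to each snapshot.

```shell
# largest_snapshots COUNT
# Reads "name<TAB>used_bytes" lines on stdin and prints the COUNT
# largest entries, biggest first. (Hypothetical helper.)
largest_snapshots() (
    count=${1:-5}
    sort -t "$(printf '\t')" -k2,2nr | head -n "$count"
)

# Usage on a live system:
#   zfs list -Hp -t snapshot -o name,used | largest_snapshots 5
# zfs can also sort by itself, in descending order, with -S:
#   zfs list -t snapshot -o name,used -S used
```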
My snapshots were a few months old so there wasn’t much point in keeping them. I deleted all with the following one-liner:
[-H removes headers]
[-o name displays the name of the filesystem]
[-t snapshot displays only snapshots]
# zfs list -H -o name -t snapshot | grep auto | xargs -n1 zfs destroy
I can’t stress enough how important it is that whatever zfs destroy command you issue, especially when running it over several items automatically, only applies to the snapshots you want to remove.
You can delete filesystems, volumes and snapshots with the above command. Deleting snapshots isn’t an issue. Deleting the filesystem is a pretty big one.
Please ensure that the command lists only the snapshots you want to remove before running it. You have been warned.
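One way to check the list first is to filter more strictly than grep auto and to use the dry-run flags of zfs destroy. This is a sketch under the assumption that the automatic snapshots all carry the @autozsys_ marker seen in the zsysctl output; the helper name auto_snapshots is made up.

```shell
# auto_snapshots
# Keeps only names that contain "@autozsys_". Anything without an "@"
# (filesystems, volumes) is dropped, so it can never reach zfs destroy.
# (Hypothetical helper.)
auto_snapshots() {
    grep '@autozsys_' || true
}

# Usage on a live system, as root:
#   zfs list -H -o name -t snapshot | auto_snapshots
#   zfs list -H -o name -t snapshot | auto_snapshots | xargs -n1 zfs destroy -nv
# With -n nothing is deleted and -v prints what would be destroyed.
# Remove -n only once the list is exactly what you expect.
```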
Ubuntu: System freezing for a few seconds with iwlwifi microcode sw error
For a few months now my main system would momentarily freeze or stall (usually about 20-30 seconds) and then continue working. It was something that started after one system update and wasn’t fixed with any further updates.
The system would notify that one of the CPU cores timed out and for a few moments the computer would stall or freeze before resuming as if nothing had happened.
dmesg was showing timeouts related to iwlwifi:
[ 2313.312941] Timeout waiting for hardware access (CSR_GP_CNTRL 0x0c04000c)
[ 2313.312995] WARNING: CPU: 4 PID: 1424 at drivers/net/wireless/intel/iwlwifi/pcie/trans.c:2066 iwl_trans_pcie_grab_nic_access+0x1f9/0x230 [iwlwifi]
iwlwifi is the kernel driver for several Intel based wireless adapters.
It is possible to install a different version of the driver manually but I don’t like to deviate too much from a standard installation. It can complicate maintenance and troubleshooting in the future.
The issue would happen several times throughout the day. The truth is that with some of the updates it became less frequent, but it was still happening often enough and filling the syslog with errors.
Doing some reading it seems that Intel wireless drivers have some known issues. It seems that there isn’t much that can be done from that side. It is very likely that even newer drivers and firmware would behave the same.
But ignoring recurring issues is never good practice!
There was a note on the Debian wiki about how to disable driver options for troubleshooting. On the Arch Linux forum the user mkdy created a modprobe file to do that after experiencing similar freezes.
I tried his workaround and it also works on my Ubuntu system.
Create /etc/modprobe.d/iwl.conf and add the following content:
After rebooting the system the errors and freezes stopped. It could be that not all of the options are needed. If I have time I will experiment and try to determine if one in particular is the one responsible for my freezes.
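When experimenting with individual options it helps to see which values the loaded driver is actually using. A minimal sketch, assuming the usual sysfs layout under /sys/module/<module>/parameters; the helper name show_module_params is made up.

```shell
# show_module_params [DIR]
# Prints "name=value" for every parameter file in DIR, which defaults
# to the iwlwifi parameters directory in sysfs. (Hypothetical helper.)
show_module_params() (
    dir=${1:-/sys/module/iwlwifi/parameters}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
)

# Usage on a live system (some values may need root to read):
#   show_module_params
# The parameters the driver accepts, with descriptions:
#   modinfo -p iwlwifi
```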
Addendum
A few weeks after I applied the workaround, with no further entries in the log, I noticed lag online. I then found new iwlwifi entries in my log and, to top it off, realised that the above settings made the connection slower.
I don’t know if this was caused by the last kernel and system updates, it could be. The system is on 5.4.0-47 which is currently the latest release on Ubuntu 20.04.
I ended up testing the different options on /etc/modprobe.d/iwl.conf. This entry seems to remove the syslog iwlwifi entries, the random freezes, the lag and the slow connection.
I am leaving all of the above for reference in case you are trying to troubleshoot a similar issue.
I will keep an eye on future updates to make sure it doesn’t break again.
Ubuntu/Linux: systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
In one of my systems the system log was reporting every 2-3 minutes the following error message:
Sep 3 13:43:57 tux1 systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Sep 3 13:45:34 tux1 systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Sep 3 13:48:58 tux1 systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Sep 3 13:50:34 tux1 systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Sep 3 13:53:56 tux1 systemd-resolved[2344]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
This was caused by a mismatch between the systemd configuration and /etc/resolv.conf.
/etc/resolv.conf should be a symbolic link pointing to the systemd DNS configuration in /run/systemd/resolve/resolv.conf
You can check if this is in place just by listing the file.
$ ls -l /etc/resolv.conf
If it isn’t pointing to the right file (and you are using systemd) you can fix it:
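A sketch of such a check and fix, assuming the target path given above; the helper name resolv_link_ok is made up, and the ln command is one common way to recreate the link rather than the only one.

```shell
# resolv_link_ok LINK TARGET
# Succeeds when LINK is a symbolic link that resolves to TARGET.
# (Hypothetical helper.)
resolv_link_ok() {
    [ -L "$1" ] && [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}

# Usage on a live system:
#   resolv_link_ok /etc/resolv.conf /run/systemd/resolve/resolv.conf \
#       || echo "/etc/resolv.conf does not point to the systemd file"
# Recreate the link as root (this replaces the existing file):
#   ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf
```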
After a recent system update I got the following error message:
Error: invalid environment block
Press any key to continue...
Luckily the system would boot up but ignoring errors isn’t best practice. This error is caused by a faulty GRUB2 environment block. This is a file located in /boot/grub/grubenv.
You can easily regenerate it with the following commands. Run them from /boot/grub, because grubenv is given as a relative file name. It is advisable to make a backup copy of the file first in case you need to revert.
# grub-editenv grubenv create
# grub-editenv grubenv set default=
# grub-editenv grubenv list
# update-grub
After rebooting the message should have disappeared.
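The backup mentioned above could be taken along these lines before regenerating the file. It is only a sketch: the helper name backup_file and the .bak-timestamp suffix are made up.

```shell
# backup_file PATH
# Copies PATH to PATH.bak-YYYYMMDD-HHMMSS, keeping mode and timestamps,
# and prints the name of the copy. (Hypothetical helper.)
backup_file() (
    src=$1
    dst="$src.bak-$(date +%Y%m%d-%H%M%S)"
    cp -p "$src" "$dst" && echo "$dst"
)

# Usage as root, before running grub-editenv:
#   backup_file /boot/grub/grubenv
# To revert, copy the backup over /boot/grub/grubenv again.
```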
If you can’t boot from your system drive you can use a Live CD and then mount your system’s boot partition and apply the same commands.
I haven’t tested this part personally but maybe the commands will help as a reference. Details are scarce on purpose, check what the commands do before doing anything.
# mount /dev/sda1 /mnt/boot/efi
or
# mount /dev/sda1 /mnt/boot/
# grub-editenv /mnt/boot/grub/grubenv create
# grub-editenv /mnt/boot/grub/grubenv set default=
# grub-editenv /mnt/boot/grub/grubenv list
# grub-mkconfig -o /mnt/boot/grub/grub.cfg
The other approach, also untested by me, could involve chroot.
# mount /dev/sda2 /target
# mount --bind /dev /target/dev
# mount --bind /dev/pts /target/dev/pts
# mount --bind /sys /target/sys
# mount --bind /proc /target/proc
# mount /dev/sda1 /target/boot
# chroot /target
# grub-editenv grubenv create
# grub-editenv grubenv set default=
# grub-editenv grubenv list
# update-grub
Ubuntu 20.04: Install Ubuntu with ZFS and encryption
Ubuntu 20.04 offers installing ZFS as the default filesystem. This has lots of advantages. My favourite is being able to revert the system and home partitions (simultaneously or individually) to a previous state through the boot menu.
One major drawback for me is the lack of an option to encrypt the filesystem during the installation.
You have the option to use LUKS and ext4 but there isn’t an encryption option in the installer for ZFS.
Some people have used LUKS and ZFS in the past, but that solution didn’t quite work for me. The tutorials I saw were using LUKS1 instead of LUKS2 and it also felt that the approach was cumbersome now that ZFS on Linux supports native encryption.
The more you deviate from a standard installation the more complicated it will be to do any troubleshooting if anything breaks in the future. Keep it simple.
The ZFS on Linux version included with the 20.04 installer is 0.8.3.
The installation of Ubuntu 20.04 on ZFS will create two pools: bpool and rpool.
bpool contains the boot partition and rpool all the other mountpoints in several datasets.
In a very security-minded world both pools should be encrypted, but I prefer not to encrypt the boot partition. Adding that extra layer of security might make a system recovery that much more difficult or impossible.
The default partitioning during the install creates four partitions and two ZFS pools, using all the storage in the installation disk:
/boot/efi   512MiB                EFI System Partition (vfat)
SWAP        2GiB                  Linux Swap Partition (swap)
bpool       2GiB                  ZFS/Solaris boot partition (zfs)
rpool       all remaining space   ZFS/Solaris root partition (zfs)
To encrypt the rpool we will need to edit the installation script.
Replace PASSWORD with the encryption password you want to use. You will be prompted to type this at boot time.
Save the changes to the file and exit.
Launch the installer:
# ubiquity
Install Ubuntu as you normally would. In the storage section:
Select “Use entire disk”
Select ZFS (Experimental)
The system will be installed with the encryption options set in the script, and on boot it will prompt you for the password you set up.
Some comments on the options for reference:
-o ashift=12 This is the default setting that means that your disk’s block size is 4,096 bytes (2^12=4,096). Valid values are:
0 for autodetect sector size
9 for 512 bytes
10 for 1,024 bytes
11 for 2,048 bytes
12 for 4,096 bytes
13 for 8,192 bytes
14 for 16,384 bytes
15 for 32,768 bytes
16 for 65,536 bytes
You can output the physical sector size with lsblk -t, although a value of 512 might be emulated by the drive. If the drive is an SSD, check its specifications.
Alternative ways to retrieve physical sector sizes are:
A value of 12 will work just fine even on drives with 512-byte sectors, which is likely why Canonical set it as the default.
If set too low this can have a huge and negative impact on performance.
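The relation between sector size and ashift is just a base-2 logarithm, so it can be worked out from the value the kernel reports. A minimal sketch, assuming a power-of-two sector size; the helper name ashift_for and the device name sda are only examples.

```shell
# ashift_for BYTES
# Prints log2 of a power-of-two sector size, for example 4096 -> 12.
# (Hypothetical helper.)
ashift_for() (
    size=$1
    shift_val=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        shift_val=$((shift_val + 1))
    done
    echo "$shift_val"
)

# Usage, assuming the installation disk is sda:
#   ashift_for "$(cat /sys/block/sda/queue/physical_block_size)"
```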
-O recordsize=1M Other tutorials suggest creating this entry. According to Oracle’s documentation this parameter is used for databases and I have read that it can also be used for certain types of VMs.
The default value is 128k. You can tune it for your individual use by changing the record size of an existing pool. Any new files created will use the new record size value. You can cp/rm files to force them to be rewritten with the new value.
You can change this value later on with:
# zfs set recordsize=128k rpool
or
# zfs set recordsize=128k rpool/filesystem
-O encryption=aes-256-gcm AES with key lengths of 128, 192 and 256 bits in CCM and GCM operation modes are supported natively. 0.8.4 comes with a fix that improves performance with AES-GCM and should hopefully be included in an update to Ubuntu soon.
-O keylocation=prompt Valid options are prompt or file://</absolute/file/path>
Prompt will ask you to type the password, in this case during boot. File points to the location of the decryption key, but on a portable system that would defeat the purpose.
-O keyformat=passphrase Options are raw, hex or passphrase. When using passphrase the password can be between 8 and 512 bytes in length.
Ubuntu/Debian: Not enough free space on disk ‘/boot’ when updating the OS
My /boot partition is only 512MB and I get this error message every now and then when updating:
Not enough free space
The upgrade needs a total of xx.x M free space on disk ‘/boot’. Please free at least an additional xx.x M of disk space on ‘/boot’. You can remove old kernels using ‘sudo apt autoremove’, and you could set COMPRESS=xz in /etc/initramfs-tools/initramfs.conf to reduce the size of your initramfs.
The obvious fix is to expand /boot to at least 1GB and to be more careful in the future when partitioning during the OS installation.
Luckily there are a couple of things to try before repartitioning.
Try cleaning old kernels automatically:
# apt autoremove
Compress your initramfs by editing /etc/initramfs-tools/initramfs.conf
# vim /etc/initramfs-tools/initramfs.conf
and change the COMPRESS entry to:
COMPRESS=xz
You might need to rebuild your initramfs for the compression to start applying.
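The edit and the rebuild can also be scripted. This is a sketch, assuming GNU sed and the standard initramfs-tools layout; the helper name set_compress is made up.

```shell
# set_compress FILE METHOD
# Rewrites the COMPRESS= line in FILE, or appends one if it is missing.
# (Hypothetical helper, relies on GNU sed -i.)
set_compress() (
    file=$1
    method=$2
    if grep -q '^COMPRESS=' "$file"; then
        sed -i "s/^COMPRESS=.*/COMPRESS=$method/" "$file"
    else
        echo "COMPRESS=$method" >> "$file"
    fi
)

# Usage as root:
#   set_compress /etc/initramfs-tools/initramfs.conf xz
#   update-initramfs -u -k all   # rebuild the initramfs of every installed kernel
```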
If after doing the above you still don’t have enough free space you can manually delete old kernels.
First check which Linux kernel you are on:
# uname -r
4.15.0-76-generic
In the example above the current kernel is 4.15.0-76. It is really important that the kernel currently in use is left untouched on the system. Under no circumstances should it be removed.
Check which kernels are on your system:
# dpkg -l | grep linux-image
rc linux-image-4.15.0-55-generic 4.15.0-55.60 amd64 Signed kernel image generic
rc linux-image-4.15.0-58-generic 4.15.0-58.64 amd64 Signed kernel image generic
rc linux-image-4.15.0-60-generic 4.15.0-60.67 amd64 Signed kernel image generic
rc linux-image-4.15.0-62-generic 4.15.0-62.69 amd64 Signed kernel image generic
rc linux-image-4.15.0-64-generic 4.15.0-64.73 amd64 Signed kernel image generic
rc linux-image-4.15.0-65-generic 4.15.0-65.74 amd64 Signed kernel image generic
rc linux-image-4.15.0-66-generic 4.15.0-66.75 amd64 Signed kernel image generic
rc linux-image-4.15.0-69-generic 4.15.0-69.78 amd64 Signed kernel image generic
rc linux-image-4.15.0-70-generic 4.15.0-70.79 amd64 Signed kernel image generic
ii linux-image-4.15.0-72-generic 4.15.0-72.81 amd64 Signed kernel image generic
ii linux-image-4.15.0-74-generic 4.15.0-74.84 amd64 Signed kernel image generic
ii linux-image-4.15.0-76-generic 4.15.0-76.86 amd64 Signed kernel image generic
ii linux-image-generic 4.15.0.76.78 amd64 Generic Linux kernel image
The first column of the output provides a 2-3 letter code with useful information on the status of each package.
For reference this is their meaning:
First letter. Desired package state:
u ... unknown
i ... install
r ... remove/deinstall
p ... purge (remove including config files)
h ... hold
Second letter. Current package state:
n ... not-installed
i ... installed
c ... config-files (only config files are installed)
U ... unpacked
F ... half-configured (configuration failed for some reason)
h ... half-installed (installation failed for some reason)
W ... triggers-awaited (package is waiting for a trigger from another package)
t ... triggers-pending (package has been triggered)
Third letter. Error state:
R ... reinstallation-required (package broken, reinstallation required)
From the previous output we know that there are some config files left around (rc header), and that several kernel images are still installed (ii header).
The ii ones are the ones consuming the space we need to free up. We need to remove some of those.
We have to keep the current kernel version and at least one or two previous versions as good practice.
This will free up enough space on /boot until you repartition.
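Picking the candidates can be sketched as a filter over the dpkg -l output. It assumes GNU sort and head, and that the package names follow the linux-image-<version> pattern shown above; the helper name removable_kernels is made up. It only prints names, so the list can be reviewed before anything is purged.

```shell
# removable_kernels RUNNING [KEEP]
# Reads `dpkg -l` output on stdin and prints the installed (ii) kernel
# image packages that could be removed: never the RUNNING kernel, and
# never the KEEP newest of the remaining ones (default 2).
# (Hypothetical helper.)
removable_kernels() (
    running=$1
    keep=${2:-2}
    awk '$1 == "ii" && $2 ~ /^linux-image-[0-9]/ { print $2 }' |
        grep -v -x -F "linux-image-$running" |
        sort -V |
        head -n "-$keep"
)

# Usage on a live system:
#   dpkg -l | removable_kernels "$(uname -r)" 1
# After reviewing the list, as root:
#   apt purge <package names printed above>
```

With the package list shown earlier and KEEP set to 1, only linux-image-4.15.0-72-generic would be printed.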
Linux: Configure locale and keyboard layout when remotely accessing from a Mac
At work I have to remote into several different Linux systems from a Mac and there is always the pain of having to handle different keyboard layouts if using Synergy or VMs.
Keys from a Mac keyboard layout don’t translate correctly when the Linux system has the keyboard configured as a PC.
The XKBOPTIONS I have here are for Synergy to keep the Control and Alt keys on the Mac working the same on the Linux systems. You might not need or want it. Just remove it from the commands if that is the case.
You can also do a text GUI configuration of the keyboard with:
# dpkg-reconfigure keyboard-configuration
If your environment isn’t in English the menus won’t be either. You can force the language output of the application launched to be in the default one. That would be English in most cases. The same command as above but forcing the output to be in English:
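One common way to do this, assuming the program honours the standard locale environment variables, is to set LC_ALL=C for that single command. The wrapper name run_in_english below is made up.

```shell
# run_in_english COMMAND [ARGS...]
# Runs COMMAND with LC_ALL=C so its messages use the untranslated
# default locale. (Hypothetical helper.)
run_in_english() {
    env LC_ALL=C "$@"
}

# Usage as root:
#   run_in_english dpkg-reconfigure keyboard-configuration
# which is the same as typing:
#   LC_ALL=C dpkg-reconfigure keyboard-configuration
```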
You might need to restart if editing i18n, but the change should be automatic with loadkeys.
CentOS 7
Edit the following file:
/etc/locale.conf
with the following:
LANG="en_GB.UTF-8"
Or type the following command:
# localectl set-locale LANG=en_GB.UTF-8
Set the keymap:
# localectl set-x11-keymap gb macintosh mac lv3:alt_switch
In CentOS 7 it isn’t necessary to reboot, the above command automatically loads the key mappings.
CentOS 8
The same commands used for CentOS 7 fail. I suspect that there is a file or folder with the keyboard mappings that has been moved. It might be a bug or a deprecated feature.
Server and minimal installs are normally headless and have no graphical interface.
If needed you can add a GUI manually. The process is slightly different depending on the distro.
RedHat / CentOS 7.x
# yum update
# yum groupinstall "Server with GUI"
RedHat / CentOS 8.x
# dnf update
# dnf groupinstall workstation
Ubuntu 18.04.x LTS
# apt update
[Install minimum GNOME desktop]
# apt install --no-install-recommends ubuntu-desktop
[Install full desktop with associated applications]
(Long process and too many extras installed)
# apt install ubuntu-desktop
[There are other alternative desktops and installations possible:]
[Generic Gnome desktop]
# apt install vanilla-gnome-desktop
[Mate]
# apt install ubuntu-mate-desktop
[Xfce]
# apt install xubuntu-desktop
[KDE]
# apt install kubuntu-desktop
[LightDM]
# apt install --no-install-recommends lightdm
Debian 9.x
# apt update
# apt install gnome-core
Debian 10.x
# apt update
# apt install gnome-core
All the above distros use systemd as their init system and you set the default run level with the same set of commands.
[Enable run level 5 by default]
# systemctl set-default graphical.target
[Enable run level 3 by default]
# systemctl set-default multi-user.target
Even with systemd you can still use init to start the graphical interface without having to reboot.
# init 5
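For reference, the classic run levels map onto systemd targets as sketched below. The function name target_for_runlevel is made up; the mapping itself follows the runlevelN.target aliases that systemd provides.

```shell
# target_for_runlevel N
# Prints the systemd target that corresponds to SysV run level N.
# (Hypothetical helper.)
target_for_runlevel() {
    case "$1" in
        0) echo poweroff.target ;;
        1) echo rescue.target ;;
        2|3|4) echo multi-user.target ;;
        5) echo graphical.target ;;
        6) echo reboot.target ;;
        *) echo "unknown run level: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   systemctl get-default                          # show the current default
#   systemctl isolate "$(target_for_runlevel 5)"   # systemd way of doing init 5
```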
Ubuntu: Change default language/dictionary in Firefox
If you install Firefox via repositories it might not install with your preferred language/settings. If your distro is in English it will default to American English as the main language and other English variations as additional spelling dictionaries.
You can change this from the CLI. The examples below will leave British English as the default but can easily be adapted to your needs.
Be warned that the spellchecker is shared with other applications like LibreOffice.
Check which spellchecker is installed on your system. Older versions used myspell and current ones use hunspell.
$ apt list --installed | grep myspell
$ apt list --installed | grep hunspell
I have had this USB wireless adapter working fine on Ubuntu 18.04 LTS for a while. A system update stopped it from working.
Re-installing the OS provided drivers (Software & Updates / Additional Drivers) made no difference. I tested the adapter in other operating systems and it worked fine.
It seems that from kernel version 4.15 onwards the drivers provided with Ubuntu no longer work. The GUI shows the driver as correctly installed, it can see wireless networks and it even tries to connect to them, but the connection invariably fails.
Others have encountered and solved this issue before me:
Find below the steps to troubleshoot similar issues and a summary of the steps to install the correct driver as per the above links.
Check the hardware
Unplug and re-plug the adapter and check the output of:
$ dmesg
The following commands will also help in showing if the adapter is correctly detected.
$ lsusb
Bus 004 Device 002: ID 1058:25e1 Western Digital Technologies, Inc.
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 002: ID 2357:0103
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 002: ID 093a:2510 Pixart Imaging, Inc. Optical Mouse
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
The Bus 003 Device 002: ID 2357:0103 entry is the one that is the USB wifi adapter on my system even if it isn’t showing an identifier. You can remove the adapter and issue the command again and compare results to help you identify it.
For non-USB adapters you can use:
$ lspci
More detailed information about the device can be obtained with the lshw command.
$ lshw -C network
WARNING: you should run this program as super-user.
*-network
description: Ethernet interface
product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
vendor: Realtek Semiconductor Co., Ltd.
[output truncated]
*-network:2
description: Wireless interface
physical id: 4
bus info: usb@3:1
logical name: enx18d6c70fbacc
serial: 18:d6:c7:a1:22:ab
capabilities: ethernet physical wireless
configuration: broadcast=yes driver=rtl8812au ip=192.168.x.2 multicast=yes wireless=IEEE 802.11AC
WARNING: output may be incomplete or inaccurate, you should run this program as super-user.
This last command is particularly useful because it shows which driver to use.
In this case the chipset and driver are identified by the string driver=rtl8812au.
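Pulling that string out of the lshw output can be sketched with sed. The helper name driver_from_lshw is made up, and it assumes the driver=<name> format shown in the configuration line above.

```shell
# driver_from_lshw
# Reads lshw output on stdin and prints the value of every
# "driver=" field it finds, one per line. (Hypothetical helper.)
driver_from_lshw() {
    sed -n 's/.*driver=\([^ ]*\).*/\1/p'
}

# Usage as root:
#   lshw -C network | driver_from_lshw
```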
Check the drivers
Now check that the driver is loaded. Look for a module name similar to the driver string above.
$ lsmod | grep 8812
8812au 999424 0
If the module isn’t loaded you can use modprobe modulename to load it.
# modprobe 8812au
Installing updated drivers
In my case all of the above was correct but the card would still not work. This was caused by an incompatibility between the drivers provided with Ubuntu and the updated kernel.
I should have checked the system logs earlier as I believe there was an entry there indicating a problem.
Uninstall the system provided drivers from the GUI.
Go to Software & Updates
Select Additional Drivers
Find the entry for the wifi adapter (rtl8812-au) and select Do not use the device
You can do the same from the CLI:
[find the one you have installed]
# apt list rtl8812au*
[and uninstall]
# apt purge rtl8812au-dkms
Get the updated drivers from github:
$ git clone https://github.com/gnab/rtl8812au.git
Install the drivers with one of these two commands. They will work as long as you are pointing to the directory generated by the previous git command.
At the time of writing the latest release of the drivers is 4.2.3. Your output might vary.
And finally add the module to autoload during boot.
# echo 8812au | tee -a /etc/modules
You should now be able to join your wireless network without problems. As the driver is installed via dkms, a kernel update will automatically recompile the driver for the new release.
If you ever need to uninstall the driver you can do it with:
# dkms remove -m 8812au -v 4.2.3 --all
You will also need to edit out the entry in /etc/modules.
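Adding and removing the entry can be sketched as two small helpers that never touch the other lines of the file. The function names add_module and remove_module are made up for illustration.

```shell
# add_module FILE MODULE
# Appends MODULE to FILE unless an identical line is already there.
# (Hypothetical helper.)
add_module() (
    file=$1
    module=$2
    grep -q -x -F "$module" "$file" 2>/dev/null || echo "$module" >> "$file"
)

# remove_module FILE MODULE
# Removes every line that is exactly MODULE and keeps everything else.
# grep -v exits non-zero when nothing is left, hence the "|| true".
# (Hypothetical helper.)
remove_module() (
    file=$1
    module=$2
    grep -v -x -F "$module" "$file" > "$file.tmp" || true
    cat "$file.tmp" > "$file" && rm -f "$file.tmp"
)

# Usage as root:
#   add_module /etc/modules 8812au
#   remove_module /etc/modules 8812au
```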
One additional thing that caught me out: the adapter won’t work if it is connected to a USB 3.1 port. USB 3.0 ports are fine.