On Ubuntu 20.04 after installing the NVIDIA driver 510 metapackage the system stopped booting.
It would either hang on a black screen with a blinking cursor in the top-left corner or show the following error message:
[FAILED] Failed to start Mark current ZSYS boot as successful.
See 'systemctl status zsys-commit.service' for details.
[ OK ] Stopped User Manager for UID 1000.
Attempting to revert from a snapshot ended with the same error message. This wasn’t the case on a separate system that had received the same upgrade.
The “20.04 zsys-commit.service fails” message is quite interesting, and it seems that the overall cause is a mismatch between the userland and kernel ZFS components.
These are the steps I followed to fix it. Many thanks to Lockszmith for his research in identifying the issue and finding a fix. He created two posts raising it, links provided here.
Restart Ubuntu and boot in recovery mode
[In GRUB]
*Advanced options for Ubuntu 20.04.3 LTS
[Select the first recovery option in the menu]
*Ubuntu 20.04.3 LTS, with Linux 5.xx.x-xx-generic (recovery mode)
[Wait for the system to load the menu and select:]
root
[Press Enter for Maintenance to get the CLI]
Check the reason for the error.
# systemctl status zsys-commit.service
[...]
Feb 17 11:11:24 ab350 systemd zsysctl: level=error msg="couldn't commit: couldn't promote dataset "rpool/ROOT/ubuntu_733qyk": couldn't promote "rpool/ROOT/ubuntu_733qyk": not a cloned filesystem"
[...]
Attempting to promote it manually fails:
# zfs promote rpool/ROOT/ubuntu_733qyk
cannot promote `rpool/ROOT/ubuntu_733qyk` : not a cloned filesystem
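To see why the promote is refused, it can help to look at the dataset's `origin` property: a clone reports the snapshot it was created from, while a regular filesystem reports `-`. The following is a minimal sketch, not part of the original fix; the helper name `is_clone` is made up for illustration, and the dataset name is the one from the error above.

```shell
#!/bin/sh
# Sketch: decide whether a dataset is a clone from its `origin` property.

# is_clone ORIGIN_VALUE
# Succeeds if the value names a snapshot (contains "@"), fails for "-" or empty.
is_clone() {
    case "$1" in
        ""|"-") return 1 ;;
        *@*)    return 0 ;;
        *)      return 1 ;;
    esac
}

# Dataset name taken from the error message; pass another one as $1.
dataset="${1:-rpool/ROOT/ubuntu_733qyk}"

# Only query ZFS when the tools are actually available.
if command -v zfs >/dev/null 2>&1; then
    origin="$(zfs get -H -o value origin "$dataset" 2>/dev/null)"
    if is_clone "$origin"; then
        echo "$dataset is a clone of $origin"
    else
        echo "$dataset is not a clone (origin: ${origin:--})"
    fi
fi
```

If the userland tools and the kernel module disagree, the value reported here may itself be unreliable, so treat it as a hint rather than proof.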
Uninstall the NVIDIA drivers.
# dkms uninstall nvidia/510.47.03
# dkms remove nvidia/510.47.03 --all
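After removing the module it is worth confirming that DKMS no longer lists it. This is a small sketch built on `dkms status`; the helper name `nvidia_modules_left` is hypothetical, and it assumes each status line starts with the module name, which may differ between DKMS versions.

```shell
#!/bin/sh
# Sketch: look for leftover NVIDIA entries in `dkms status` output.

# nvidia_modules_left
# Reads `dkms status` output on stdin, prints the lines that start with
# "nvidia" (case-insensitive) and succeeds only if at least one was found.
nvidia_modules_left() {
    grep -i '^nvidia'
}

# Only run the check when dkms is installed.
if command -v dkms >/dev/null 2>&1; then
    if dkms status | nvidia_modules_left; then
        echo "NVIDIA modules are still registered with DKMS"
    else
        echo "No NVIDIA modules registered with DKMS"
    fi
fi
```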
Make sure you can connect to the internet. You can temporarily assign a DHCP address to one of the network interfaces.
# dhclient -v eno1
# ip address
Update the system and install a 3rd party ZFS set of tools.
# apt update
# apt upgrade
# apt autoremove
[Add 3rd party PPA for zfstools]
# add-apt-repository ppa:jonathonf/zfs
# apt update
[Upgrade ZFS]
# apt upgrade
[If ZFS isn't upgraded, do it manually]
# apt install zfs-initramfs zfs-zed zfsutils-linux
It might take a bit to update. Reboot normally.
It should boot normally.
If this doesn’t work for you, reboot in recovery mode again and promote the filesystem manually.
# zfs promote rpool/ROOT/ubuntu_733qyk
Sort graphical drivers
Revert to NVIDIA metapackage 470 (if this is what broke your system). Reboot, and fix resolution settings.
Upgrading back to 510 will bring the error back and make it even more difficult to fix. Don’t!
Things will only work if the zfs (userland) and zfs-kmod (kernel module) versions match. In the output below they do not:
$ zfs --version
zfs-0.8.3-1ubuntu12.13
zfs-kmod-2.0.6-1ubuntu2
[boot in recovery mode]
# apt reinstall zfs-initramfs zfs-zed zfsutils-linux
# zfs promote rpool/ROOT/ubuntu_733qyk
[reboot in normal mode]
[Configure the 470 drivers]
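The version comparison can be scripted so it is easy to repeat after every driver or ZFS change. The sketch below assumes `zfs --version` prints one `zfs-<version>` line and one `zfs-kmod-<version>` line, as in the output shown earlier, and compares only the upstream part before the first dash of the package revision. The function name `zfs_versions_match` is made up for this example.

```shell
#!/bin/sh
# Sketch: compare userland and kernel-module versions from `zfs --version`.

# zfs_versions_match
# Reads `zfs --version` output on stdin and succeeds only if the upstream
# version of the userland tools equals that of the kernel module.
zfs_versions_match() {
    awk '
        /^zfs-kmod-/ { sub(/^zfs-kmod-/, ""); sub(/-.*/, ""); kmod = $0; next }
        /^zfs-/      { sub(/^zfs-/, "");      sub(/-.*/, ""); user = $0 }
        # Appending "" forces a string comparison.
        END { exit !(user != "" && (user "") == (kmod "")) }
    '
}

# Only run the check when the zfs command is available.
if command -v zfs >/dev/null 2>&1; then
    if zfs --version | zfs_versions_match; then
        echo "zfs and zfs-kmod versions match"
    else
        echo "zfs and zfs-kmod versions DIFFER"
    fi
fi
```

Comparing only the upstream version is a deliberate simplification: the Ubuntu package revisions of the two components can legitimately differ even when both are built from the same ZFS release.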
Reverting to previous ZFS version
The system should now be back to normal, but you might want to revert to the mainline ZFS version despite the bug. After all, this was a hack to promote the filesystem and get the system working again.
# add-apt-repository --remove ppa:jonathonf/zfs
[Check that it has been removed]
$ apt policy
# apt update
[Pray]
# apt remove zfs-initramfs zfs-zed zfsutils-linux
# apt install zfs-initramfs zfs-zed zfsutils-linux
[Check the right version is installed]
# apt list --installed | grep zfs
# apt autoremove
[Pray harder]
# reboot
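Before that final reboot, the PPA check can also be done per package. This sketch scans `apt-cache policy` output for the PPA name; the helper `ppa_still_referenced` is hypothetical, and it only tells you whether apt still sees the PPA as a source, not which build is currently installed.

```shell
#!/bin/sh
# Sketch: check whether apt still lists the jonathonf/zfs PPA for a package.

# ppa_still_referenced
# Reads `apt-cache policy <package>` output on stdin and succeeds if any
# line mentions the PPA.
ppa_still_referenced() {
    grep -q 'jonathonf/zfs'
}

# Only run the check on systems that have apt-cache.
if command -v apt-cache >/dev/null 2>&1; then
    for pkg in zfs-initramfs zfs-zed zfsutils-linux; do
        if apt-cache policy "$pkg" | ppa_still_referenced; then
            echo "$pkg: PPA still listed as a source"
        else
            echo "$pkg: PPA not listed as a source"
        fi
    done
fi
```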
With that I managed to bring my system back to a working condition, but updating the drivers a second time made it fail again and I couldn’t fix it. A clean install of 20.04.3 doesn’t seem to exhibit this problem. I’m not sure what the reason behind it is, but there are a few bugs open with Canonical regarding this.
I hope that 22.04 will bring a better ZFS version.