ZFS ‘Failed to start Mark current ZSYS boot as successful’ fix

On Ubuntu 20.04, after installing the NVIDIA driver 510 metapackage, the system stopped booting.

It would either hang with a black screen and a blinking cursor in the top-left corner, or show the following error message:

[FAILED] Failed to start Mark current ZSYS boot as successful.
See 'systemctl status zsys-commit.service' for details.
[  OK  ] Stopped User Manager for UID 1000.

Attempting to revert to a snapshot ended with the same error message. This wasn’t the case on a separate system that had received the same upgrade.

The “20.04 zsys-commit.service fails” message is quite interesting, and it seems that the underlying cause is a mismatch between the userland and kernel ZFS components.

These are the steps I followed to fix it. Many thanks to Lockszmith for his research in identifying the issue and finding a fix. He created two posts raising it, links provided here.

https://askubuntu.com/users/720005/lockszmith

https://askubuntu.com/questions/1388997/zsys-commit-service-fails-with-couldnt-promote-dataset-not-a-cloned-filesy

Fix

Restart Ubuntu and boot in recovery mode

[In GRUB]

*Advanced options for Ubuntu 20.04.3 LTS

[Select the first recovery option in the menu]
*Ubuntu 20.04.3 LTS, with Linux 5.xx.x-xx-generic (recovery mode)

[Wait for the system to load the menu and select:]
root

[Press Enter for Maintenance to get the CLI]

Check the reason for the error.

# systemctl status zsys-commit.service
[...]
 Feb 17 11:11:24 ab350 systemd[1] zsysctl[4068]: level=error msg="couldn't commit: couldn't promote dataset "rpool/ROOT/ubuntu_733qyk": couldn't promote "rpool/ROOT/ubuntu_733qyk": not a cloned filesystem"
 [...]

Attempting to promote it manually fails:

# zfs promote rpool/ROOT/ubuntu_733qyk

cannot promote `rpool/ROOT/ubuntu_733qyk` : not a cloned filesystem
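The error suggests that the dataset has no origin snapshot, which is what makes a filesystem a clone. A minimal sketch for checking that from the recovery shell follows. The helper name origin_is_clone is made up for this example, and the dataset name is the one from my log, so yours will differ. A dataset that is not a clone reports "-" as its origin.

```shell
# origin_is_clone VALUE
# Hypothetical helper: succeeds when VALUE (a dataset's "origin" property)
# names a snapshot, i.e. contains "@". Non-clones report "-".
origin_is_clone() {
    case "$1" in
        ""|"-") return 1 ;;
        *@*)    return 0 ;;
        *)      return 1 ;;
    esac
}

# On the affected machine (as root):
#   origin=$(zfs get -H -o value origin rpool/ROOT/ubuntu_733qyk)
#   if origin_is_clone "$origin"; then
#       echo "clone of $origin"
#   else
#       echo "not a clone, so zfs promote will refuse it"
#   fi
```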

Uninstall the NVIDIA drivers.

# dkms uninstall nvidia/510.47.03
# dkms remove nvidia/510.47.03 --all
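To confirm that the module is really gone, the output of dkms status can be filtered for leftover nvidia entries. This is only a sketch: the helper name is made up, and it assumes each status line starts with the module name followed by "," or "/", which differs between dkms versions.

```shell
# nvidia_dkms_entries: reads `dkms status` output on stdin and prints
# any line that belongs to the nvidia module. Prints nothing when clean.
nvidia_dkms_entries() {
    grep -E '^nvidia[,/]' || true
}

# On the affected machine (as root):
#   dkms status | nvidia_dkms_entries
```

No output means DKMS no longer tracks any NVIDIA module.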

Make sure you can connect to the internet. You can temporarily assign a DHCP address to one of the network interfaces.

# dhclient -v eno1
# ip address
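eno1 is the interface name on my machine and yours is likely to differ. A small sketch for listing the candidate names, assuming the one-line-per-interface format printed by ip -o link show (list_ifaces is a made-up helper name):

```shell
# list_ifaces: reads `ip -o link show` output on stdin and prints the
# interface names, skipping the loopback device.
list_ifaces() {
    awk -F': ' '$2 != "lo" { sub(/@.*/, "", $2); print $2 }'
}

# On the affected machine:
#   ip -o link show | list_ifaces
```

Pick the wired interface from the list and pass it to dhclient.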

Update the system and install a third-party set of ZFS tools.

# apt update
# apt upgrade
# apt autoremove

[Add 3rd party PPA for zfstools]
# add-apt-repository ppa:jonathonf/zfs
# apt update 

[Upgrade ZFS]
# apt upgrade

[If ZFS isn't upgraded, do it manually]
# apt install zfs-initramfs zfs-zed zfsutils-linux

It might take a bit to update. Reboot normally.

# reboot

It should boot normally.

If this doesn’t work for you, reboot in recovery mode again and promote the filesystem manually.

# zfs promote rpool/ROOT/ubuntu_733qyk

Sort graphical drivers

Revert to the NVIDIA 470 metapackage (if the 510 upgrade is what broke your system). Reboot and fix the resolution settings.

Upgrading back to 510 will bring the error back and make it even more difficult to fix. Don’t!

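Things will only work if the zfs and zfs-kmod versions match. If they don’t, as in the output below, reinstall the ZFS packages from recovery mode and promote the filesystem again.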

$ zfs --version
zfs-0.8.3-1ubuntu12.13
zfs-kmod-2.0.6-1ubuntu2

[boot in recovery mode]
# apt reinstall zfs-initramfs zfs-zed zfsutils-linux
# zfs promote rpool/ROOT/ubuntu_733qyk

[reboot in normal mode]
[Configure the 470 drivers]
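The version check can be scripted. The sketch below assumes the two-line format of zfs --version shown above and compares only the upstream part of each version (everything before the first "-" of the packaging suffix). zfs_versions_match is a made-up name.

```shell
# zfs_versions_match: reads `zfs --version` output on stdin.
# Prints "match X" and returns 0 when userland and kernel module agree,
# otherwise prints both versions and returns 1.
zfs_versions_match() {
    awk '
        /^zfs-kmod-/ { v = $1; sub(/^zfs-kmod-/, "", v); sub(/-.*/, "", v); kmod = v; next }
        /^zfs-/      { v = $1; sub(/^zfs-/, "", v);      sub(/-.*/, "", v); user = v }
        END {
            if (user != "" && user == kmod) { print "match " user; exit 0 }
            print "mismatch userland=" user " kmod=" kmod
            exit 1
        }'
}

# On the affected machine:
#   zfs --version | zfs_versions_match
```

Fed the output above, it reports a mismatch between 0.8.3 and 2.0.6.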

Reverting to previous ZFS version

The system should now be back to normal, but you might want to revert to the stock Ubuntu ZFS version despite the bug. After all, the PPA was only a hack to promote the filesystem and get the system working again.

# add-apt-repository --remove ppa:jonathonf/zfs

[Check that it has been removed]
$ apt policy

# apt update

[Pray]
# apt remove zfs-initramfs zfs-zed zfsutils-linux
# apt install zfs-initramfs zfs-zed zfsutils-linux

[Check the right version is installed]
# apt list --installed | grep zfs

# apt autoremove

[Pray harder]
# reboot
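Before that last reboot it may be worth confirming that the three packages ended up on the same version, since they are built from the same source. A sketch, assuming input lines of the form "package version" such as dpkg-query can print (same_version is a made-up name):

```shell
# same_version: reads "package version" lines on stdin, echoes them,
# and returns 0 only if every package reports the same version.
same_version() {
    awk '
        NF >= 2 { seen[$2] = 1; print }
        END { n = 0; for (v in seen) n++; exit (n == 1 ? 0 : 1) }'
}

# On the affected machine:
#   dpkg-query -W -f='${Package} ${Version}\n' \
#       zfs-initramfs zfs-zed zfsutils-linux | same_version \
#       || echo "ZFS packages are on different versions"
```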

With that I managed to bring my system back to a working condition, but updating the drivers a second time made it fail again, and that time I couldn’t fix it. A clean install of 20.04.3 doesn’t seem to exhibit this problem. I’m not sure what the reason behind it is, but there are a few bugs open with Canonical regarding this.

I hope that 22.04 will bring a better ZFS version.




Ubuntu: Ubuntu 20.04 installing NVIDIA drivers breaks built-in audio on laptop

On a new laptop I couldn’t get the external HDMI monitor to work with the nouveau drivers, so I installed the NVIDIA drivers (version 440).

The NVIDIA drivers worked perfectly and the external monitor could be configured, but it didn’t take too long to notice that the built-in audio wasn’t working.

Audio would only play through HDMI. Disconnecting the monitor wouldn’t make the built-in audio work again.

No audio from the built-in speakers or headphones on a laptop isn’t good.

Reverting to the nouveau drivers wouldn’t fix the audio. I have to say that I have become a fan of the ZFS rollback feature on 20.04 in a flash. You can revert the system to how it was before any update that borks things. You can try different troubleshooting solutions and go back if needed. Big fan.

So, how to get the audio to work again?

First, find the audio driver in use. There are several ways to do this:

$ inxi -iF
[...]
Audio:
  Device-1: Intel Cannon Lake PCH cAVS driver: snd_hda_intel 
  Device-2: NVIDIA TU106 High Definition Audio driver: snd_hda_intel 
  Sound Server: ALSA v: k5.4.0-28-generic 
[...]

$ lshw -c multimedia
  *-multimedia              
       description: Audio device
       product: TU106 High Definition Audio Controller
       vendor: NVIDIA Corporation
       physical id: 0.1
       bus info: pci@0000:01:00.1
       version: a1
       width: 32 bits
       clock: 33MHz
       capabilities: bus_master cap_list
       configuration: driver=snd_hda_intel latency=0
       resources: irq:17 memory:b4000000-b4003fff
  *-multimedia
       description: Audio device
       product: Cannon Lake PCH cAVS
       vendor: Intel Corporation
       physical id: 1f.3
       bus info: pci@0000:00:1f.3
       version: 10
       width: 64 bits
       clock: 33MHz
       capabilities: bus_master cap_list
       configuration: driver=snd_hda_intel latency=32
       resources: irq:150 memory:b4618000-b461bfff memory:b4200000-b42fffff

$ lspci -v
[...]
00:1f.3 Audio device: Intel Corporation Cannon Lake PCH cAVS (rev 10)
        Subsystem: CLEVO/KAPOK Computer Cannon Lake PCH cAVS
        Flags: bus master, fast devsel, latency 32, IRQ 150
        Memory at b4618000 (64-bit, non-prefetchable) [size=16K]
        Memory at b4200000 (64-bit, non-prefetchable) [size=1M]
        Capabilities: <access denied>
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel, snd_sof_pci
[...]
01:00.1 Audio device: NVIDIA Corporation TU106 High Definition Audio Controller (rev a1)
        Subsystem: CLEVO/KAPOK Computer TU106 High Definition Audio Controller
        Flags: bus master, fast devsel, latency 0, IRQ 17
        Memory at b4000000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
[...]

From the above we can see that the audio driver being used is snd_hda_intel, which is quite common.
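If you only want the driver names without reading the whole listing, the lspci output can be filtered. This is a sketch that assumes the layout shown above, with device lines starting at column 0 and an indented "Kernel driver in use:" line. audio_drivers is a made-up name.

```shell
# audio_drivers: reads `lspci -v` (or `lspci -k`) output on stdin and
# prints "<slot> <driver>" for every device described as an audio device.
audio_drivers() {
    awk '
        /^[0-9a-f]/ { dev = ($0 ~ /Audio device/) ? $1 : "" }
        dev != "" && /Kernel driver in use:/ { print dev, $NF }'
}

# On the affected machine:
#   lspci -v | audio_drivers
```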

Second, find out the audio codecs in use:

$ cat /proc/asound/card0/codec*

The output after installing the NVIDIA drivers that stopped the audio from working shows a lot of UNKNOWN and N/A entries:

Codec: Realtek ALC1220
Address: 0
AFG Function Id: 0x1 (unsol 1)
Vendor Id: 0x10ec1220
Subsystem Id: 0x155896e1
Revision Id: 0x100003
No Modem Function Group found
Default PCM:
N/A
Default Amp-In caps: N/A
Default Amp-Out caps: N/A
State of AFG node 0x01:
  Power: setting=UNKNOWN, actual=UNKNOWN, Error, Clock-stop-OK, Setting-reset
Invalid AFG subtree
Codec: Intel Kabylake HDMI
Address: 2
AFG Function Id: 0x1 (unsol 0)
Vendor Id: 0x8086280b
Subsystem Id: 0x80860101
Revision Id: 0x100000
No Modem Function Group found
Default PCM:
N/A
Default Amp-In caps: N/A
Default Amp-Out caps: N/A
State of AFG node 0x01:
  Power: setting=UNKNOWN, actual=UNKNOWN, Error, Clock-stop-OK, Setting-reset
Invalid AFG subtree

But a normal/working output would be similar to this:

Codec: Realtek ALC1220
Address: 0
AFG Function Id: 0x1 (unsol 1)
Vendor Id: 0x10ec1220
Subsystem Id: 0x155896e1
Revision Id: 0x100003
No Modem Function Group found
Default PCM:
    rates [0x5f0]: 32000 44100 48000 88200 96000 192000
    bits [0xe]: 16 20 24
    formats [0x1]: PCM
Default Amp-In caps: N/A
Default Amp-Out caps: N/A
State of AFG node 0x01:
  Power states: D0 D1 D2 D3 D3cold CLKSTOP EPSS
  Power: setting=D0, actual=D0
GPIO: io=8, o=0, i=0, unsolicited=1, wake=0
  IO[0]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[1]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[2]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[3]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[4]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[5]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[6]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
  IO[7]: enable=0, dir=0, wake=0, sticky=0, data=0, unsol=0
Node 0x02 [Audio Output] wcaps 0x41d: Stereo Amp-Out
  Control: name="Line Out Playback Volume", index=0, device=0
 [...]
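The difference between the two dumps can be checked mechanically. The sketch below treats a codec as broken when its section contains "Invalid AFG subtree" or a power setting of UNKNOWN. Those two markers are simply what I saw in my own output, so this is a heuristic rather than an official test, and codec_health is a made-up name.

```shell
# codec_health: reads /proc/asound/card*/codec* dumps on stdin and prints
# one line per codec: "<name>: BROKEN" or "<name>: ok".
codec_health() {
    awk '
        /^Codec:/ { if (name != "") report(); name = substr($0, 8); bad = 0 }
        /Invalid AFG subtree/ || /setting=UNKNOWN/ { bad = 1 }
        END { if (name != "") report() }
        function report() { print name ": " (bad ? "BROKEN" : "ok") }'
}

# On the affected machine:
#   cat /proc/asound/card0/codec* | codec_health
```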

From all of the above we can determine that the audio driver used is snd_hda_intel and that the codec is Realtek ALC1220.

It is very likely that your driver will be the same, but the codec might vary. If you are using snd_hda_intel, you can look up the model variant you need by searching for the codec name in this list:

https://www.infradead.org/~mchehab/rst_conversion/sound/hd-audio/models.html

For the ALC1220 the model name to use seems to be dual-codecs.

Edit your ALSA configuration file:

# vim /etc/modprobe.d/alsa-base.conf

and add this to the end of the file:

# Manual entry to allow audio via headphones because NVIDIA drivers break the built-in audio
options snd-hda-intel model=clevo-p950
options snd-hda-intel probe_mask=0x1

I used the wrong model name by mistake. I meant to use dual-codecs, but I used the model name just below it in the list: clevo-p950. It worked, and as it worked I haven’t gone back to edit it.

Just be more careful than me and choose the model name that matches your system.

After updating your ALSA configuration file, reboot.

After rebooting, audio from the built-in speakers and headphones was working.
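If the audio still doesn’t work, it is worth checking that the options were written as intended. The sketch below lists the snd-hda-intel options found in a modprobe.d file and accepts the module name spelled with either dashes or underscores. hda_intel_options is a made-up name, and the sysfs path in the comment is an assumption about how the running kernel exposes the module parameters.

```shell
# hda_intel_options: reads a modprobe.d file on stdin and prints each
# option set for snd-hda-intel, one per line. Comment lines are ignored.
hda_intel_options() {
    awk '$1 == "options" && ($2 == "snd-hda-intel" || $2 == "snd_hda_intel") {
             for (i = 3; i <= NF; i++) print $i
         }'
}

# On the affected machine:
#   hda_intel_options < /etc/modprobe.d/alsa-base.conf
# After the reboot, the value the loaded module received should be visible in:
#   cat /sys/module/snd_hda_intel/parameters/model
```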

You can change the output in use from your settings or with PulseAudio’s volume control.