Immutable root with atomic upgrades

From Alpine Linux
Revision as of 14:13, 20 April 2021 by Mvsn (talk | contribs)
This page is a work in progress ...

This page is still being developed.

What?

This article provides a basic guide to setting up a read-only-root-based Alpine Linux system with several boot environments and atomic upgrades using rEFInd and btrfs.

Why?

Read-only root and atomic upgrades with ability to easily rollback or boot previous configurations is a concept that got some popularity recently. Distributions providing and promoting such features, for example, are Fedora Silverblue, Opensuse MicroOS, NixOS and GNU Guix. While Alpine Linux has it's killer features it lacks mentioned above ones on default setup. This is a proof of concept that it's possible to implement them in a minimal way on a minimal system.

Partitioning disks

The first step is creating partition table on target device /dev/sda:

# apk add gptfdisk
# gdisk /dev/sda
> o ↵
> y ↵
> w ↵
> y ↵

Now we can define the partitions:

# cgdisk /dev/sda

Partition creation process consists of several steps:

  1. Start sector - you can safely use default value by pressing ↵
  2. Size
  3. Type (as hex code) - EFI is ef00, Linux filesystem is 8300, Swap is 8200.

Result table:

Part.     #     Size        Partition Type            Partition Name
----------------------------------------------------------------
1               200.0 MiB   EFI System                EFI
2               200.0 GiB   Linux filesystem          ROOT
3               32.0 GiB    Linux swap                SWAP

ROOT partition name will later be used in rEFInd configuration to identify boot volume. Next step is creating filesystems:

# mkfs.vfat -F32 /dev/sda1
# mkfs.btrfs /dev/sda2
# mkswap /dev/sda3

Now we can mount our root volume:

# mount -t btrfs /dev/sda2 /mnt

File system structure

Now we should create file structure that would provide reliable atomic system upgrades.
Start with following directories:
# mkdir /mnt/commons stores common non-snapshotting subvolumes.
# mkdir /mnt/next stores next current link.
Latter is necessary due to how busybox mv does atomic link replacement.
Next, most important directories:
# mkdir /mnt/snapshots stores directories containing snapshots belonging to one generation.
# mkdir /mnt/links stores generations of directories containing links to snapshot generations.
Let's create first generation and populate it with one OS root snapshot @:

# NEWSNAPSHOTS="$(date -u +"%Y%m%d%H%M%S")$(cat /dev/urandom | tr -dc 'a-zA-Z' | fold -w 8 | head -n 1)"
# mkdir "/mnt/$NEWSNAPSHOTS"
# btrfs subvolume create /mnt/$NEWSNAPSHOTS/@

Populate links:

# NEWLINKS="$(date -u +"%Y%m%d%H%M%S")$(cat /dev/urandom | tr -dc 'a-zA-Z' | fold -w 8 | head -n 1)"
# mkdir "/mnt/links/$NEWLINKS"
# ln -s "../../snapshots/$NEWSNAPSHOTS" "/mnt/links/$NEWLINKS/0"
# ln -s "../../snapshots/$NEWSNAPSHOTS" "/mnt/links/$NEWLINKS/1"
# ln -s "../../snapshots/$NEWSNAPSHOTS" "/mnt/links/$NEWLINKS/2"
# ln -s "../../snapshots/$NEWSNAPSHOTS" "/mnt/links/$NEWLINKS/3"

You can have as many links as you like, just apply changes to rEFInd config and upgrade scripts described below accordingly.
Link that will point to latest links generation:

# ln -s "./links/$NEWLINKS" /mnt/current

This setup allows us to just have static rEFInd config that points to /current/0, /current/1, etc. while the actual underlying boot environment will change with each upgrade.
But how will fs mounting services know which snapshot generation is currently loaded?
The answer is common fstab:

WIP
|--mnt
| |--commons
| |--current
| |--fstab
| |--links
| | |--20210411213742qwrXAJBz
| | | |--0
| | | |--1
| | | |--2
| | | |--3
| |--next
| |--snapshots
| | |--20210411212549sdBXyLxg
| | | |--@