Kata Containers is now effectively the main way to run containers inside an isolated virtual machine for extra security. My work requires me to run container images that I can't always fully trust. I used to use a VirtualBox virtual machine together with Docker Machine for this, but since Docker Machine has not been developed since 2021, I decided to look at replacement options.
I was very surprised to find out that openSUSE Tumbleweed, which I migrated to after 16 years with Ubuntu, doesn't have any ready-made packages for Kata Containers. It does, however, have packages for the Firecracker microVM, which AWS created specifically to run Linux containers in an isolated environment. But since those packages come without the firecracker-containerd runtime, I went back to the idea of trying Kata Containers.
In 2024, the official Kata Containers documentation states that pre-built packages are only available in the repositories of the Fedora and CentOS distributions, and for Ubuntu via Snap packages. In other cases, such as mine, the only options are to build from source or to install pre-built binaries from GitHub using the special kata-manager.sh script. The developers themselves suggest the second option in the documentation.
The kata-manager.sh script itself simply unpacks the selected release archive from GitHub into /opt/kata, creates configs in /etc/kata-containers/ and sets up symlinks in /usr/bin/ for kata-runtime, kata-collect-data.sh and containerd-shim-kata-v2. This is already enough for docker run --runtime io.containerd.run.kata.v2 to create a container in a QEMU virtual machine (the default hypervisor) with 1 vCPU and 2 GB of RAM. Interesting fact: containerd derives the shim binary name from the runtime name, so io.containerd.run.kata.v2 turns into a search for containerd-shim-kata-v2 in PATH, without having to configure anything.
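The lookup rule can be sketched in a few lines of shell. This is only an illustration of the naming scheme as I understand it (the last two dot-separated parts of the runtime name become the name and the version of the shim); runtime_to_shim is a hypothetical helper, not something containerd or Kata provides:

```shell
# Map a containerd runtime name to the shim binary name searched for in PATH.
# Assumption: the runtime name has the form <prefix>.<name>.<version>.
runtime_to_shim() {
    rt_version=${1##*.}      # text after the last dot, e.g. "v2"
    rt_rest=${1%.*}          # everything before the last dot
    rt_name=${rt_rest##*.}   # last remaining component, e.g. "kata"
    printf 'containerd-shim-%s-%s\n' "$rt_name" "$rt_version"
}

runtime_to_shim io.containerd.run.kata.v2
runtime_to_shim io.containerd.run.kata_3_8_0.v2
```

The second call shows why a versioned runtime name should not contain dots inside the version: kata_3_8_0 stays a single component. Running command -v "$(runtime_to_shim io.containerd.run.kata.v2)" then tells you whether that shim is actually reachable through PATH.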
What makes this installation worse than installing packages from your Linux distribution: at the moment Kata Containers ships with its own complete set of hypervisors (QEMU, Firecracker, etc.), boot images and other necessary files, and does not depend on external components. All of these components need to be updated for security reasons, and in this case you update at your own risk, completely replacing the files of the previous version with untested new ones. For example, in my case Firecracker did not work out of the box in version 3.6.0. With distribution packages there is at least a chance that they have been tested properly. Another consequence of an update is the potential need to restart all running containers with the new version, but that is true for package installations as well.
Of course, a much better solution is to install each version in its own separate path. So, let's install the version we need from GitHub:
VERSION=3.8.0
DIR="/opt/kata_$VERSION"
PACKAGE="kata-static-$VERSION-$(uname -m | sed -e 's/x86_64/amd64/' -e 's/aarch64\|arm64/arm64/' -e 's/ppc64le/ppc64le/' -e 's/s390x/s390x/').tar.xz"
curl -LO https://github.com/kata-containers/kata-containers/releases/download/$VERSION/$PACKAGE
sudo mkdir -p "$DIR"
sudo tar -xJf "$PACKAGE" --strip-components=3 -C "$DIR"
The contents should unpack successfully:
$ ll $DIR
total 20
drwxr-xr-x 1 root root 110 Aug 25 12:24 ./
drwxr-xr-x 1 root root 150 Aug 25 12:15 ../
drwxr-xr-x 1 root root 490 Aug 21 17:53 bin/
drwxr-xr-x 1 root root 50 Aug 15 20:52 include/
drwxr-xr-x 1 root root 70 Aug 15 20:52 lib/
drwxr-xr-x 1 root root 30 Aug 9 12:45 libexec/
drwxr-xr-x 1 root root 6 Aug 21 18:01 runtime-rs/
drwxr-xr-x 1 root root 154 Aug 21 18:02 share/
-rw-r--r-- 1 root root 29 Aug 21 18:08 VERSION
-rw-r--r-- 1 root root 13678 Aug 21 18:08 versions.yaml
There are many hypervisor configurations to choose from; the default is QEMU:
$ ll $DIR/share/defaults/kata-containers/
total 420
drwxr-xr-x 1 root root 906 Aug 21 18:02 ./
drwxr-xr-x 1 root root 30 Aug 21 18:01 ../
-rw-r--r-- 1 root root 10930 Aug 21 18:02 configuration-acrn.toml
-rw-r--r-- 1 root root 19799 Aug 21 18:02 configuration-clh.toml
-rw-r--r-- 1 root root 16708 Aug 21 18:02 configuration-fc.toml
-rw-r--r-- 1 root root 29756 Aug 21 18:02 configuration-qemu-coco-dev.toml
-rw-r--r-- 1 root root 28990 Aug 21 18:02 configuration-qemu-nvidia-gpu-snp.toml
-rw-r--r-- 1 root root 28964 Aug 21 18:02 configuration-qemu-nvidia-gpu-tdx.toml
-rw-r--r-- 1 root root 29631 Aug 21 18:02 configuration-qemu-nvidia-gpu.toml
-rw-r--r-- 1 root root 28098 Aug 21 18:02 configuration-qemu-se.toml
-rw-r--r-- 1 root root 27635 Aug 21 18:02 configuration-qemu-sev.toml
-rw-r--r-- 1 root root 29032 Aug 21 18:02 configuration-qemu-snp.toml
-rw-r--r-- 1 root root 28798 Aug 21 18:02 configuration-qemu-tdx.toml
-rw-r--r-- 1 root root 29649 Aug 21 18:02 configuration-qemu.toml
-rw-r--r-- 1 root root 13910 Aug 21 18:02 configuration-remote.toml
-rw-r--r-- 1 root root 17523 Aug 21 18:02 configuration-stratovirt.toml
lrwxrwxrwx 1 root root 23 Aug 21 18:02 configuration.toml -> configuration-qemu.toml
-rw-r--r-- 1 root root 10232 Aug 21 18:03 genpolicy-settings.json
-rw-r--r-- 1 root root 36237 Aug 21 18:03 rules.rego
drwxr-xr-x 1 root root 280 Aug 21 18:01 runtime-rs/
You can read more about the configuration here and here.
Now all we have to do is create the necessary symlinks:
sudo ln -s $DIR /opt/kata
sudo ln -s /opt/kata/share/defaults/kata-containers /etc/kata-containers
sudo ln -s /opt/kata/bin/containerd-shim-kata-v2 /usr/bin/containerd-shim-kata-v2
sudo ln -s /opt/kata/bin/kata-runtime /usr/bin/kata-runtime
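Since the shim and the runtime are found through PATH, it can be worth confirming that the new symlinks resolve before starting a container. A small sketch; missing_commands is a hypothetical helper name:

```shell
# Print every command name from the arguments that cannot be found in PATH.
missing_commands() {
    for mc_name in "$@"; do
        command -v "$mc_name" >/dev/null 2>&1 || printf '%s\n' "$mc_name"
    done
}

missing_commands containerd-shim-kata-v2 kata-runtime
```

No output means both names resolve. After that, kata-runtime check can be used to test whether the host is able to run Kata Containers at all.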
Verify that everything worked by running the busybox container:
$ sudo docker run --runtime io.containerd.run.kata.v2 busybox uname -a
Linux 88c1b982e983 6.1.62 #1 SMP Wed Jul 17 13:00:20 UTC 2024 x86_64 GNU/Linux
To make Kata the default runtime, add "default-runtime": "io.containerd.run.kata.v2" to /etc/docker/daemon.json:
$ cat /etc/docker/daemon.json
{
"default-runtime": "io.containerd.run.kata.v2"
}
And make Docker reload its configuration:
sudo systemctl reload docker
Check it out:
$ sudo docker run busybox uname -a
Linux 88c1b982e983 6.1.62 #1 SMP Wed Jul 17 13:00:20 UTC 2024 x86_64 GNU/Linux
In the future, when installing a new version, all we need to do is change the symlink:
sudo rm -f /opt/kata
sudo ln -s $DIR /opt/kata
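The rm followed by ln leaves a short window in which /opt/kata does not exist, and nothing checks that the new directory is really there. As a sketch, assuming an ln that accepts -n (GNU and BSD ln both do), the two steps can be folded into one guarded command; switch_kata is a hypothetical helper name, not part of Kata:

```shell
# Point a symlink (default: /opt/kata) at another versioned install directory.
# Must be run as root when the link lives under /opt.
switch_kata() {
    sk_target=$1
    sk_link=${2:-/opt/kata}
    if [ ! -d "$sk_target" ]; then
        echo "switch_kata: no such directory: $sk_target" >&2
        return 1
    fi
    # -n: treat an existing link to a directory as a plain file, so it is
    #     replaced instead of a new link being created inside the old target.
    ln -sfn "$sk_target" "$sk_link"
}

# Example (as root): switch_kata /opt/kata_3.8.0
```

Without -n, ln would follow the existing /opt/kata link and create the new link inside the old version's directory, which is an easy mistake to miss.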
If we want to be able to use different versions of the Kata runtime without switching the /opt/kata symlink, we have to use a couple of hacks. The difficulty is that the /opt/kata path, as well as two configuration file paths, are hardwired into the Kata executables:
$ kata-runtime --show-default-config-paths
/etc/kata-containers/configuration.toml
/opt/kata/share/defaults/kata-containers/configuration.toml
The config path can be overridden using the KATA_CONF_FILE environment variable. In addition, containerd-shim-kata-v2 looks for kata-runtime in PATH. Both can be handled with a wrapper script.
First, though, we need to update all the text files in the new directory to use our versioned path instead of /opt/kata:
grep -rlI '/opt/kata' "$DIR" | sudo xargs sed -i "s|/opt/kata|$DIR|g"
Unfortunately, this won't solve the problem with QEMU's hardcoded path for finding its BIOS, so we still need the /opt/kata symlink. If you haven't created it yet, now is the time:
sudo ln -s $DIR /opt/kata
The good news is that, thanks to the config changes, kata-runtime will use the images and binaries from its own versioned directory rather than the default ones in /opt/kata.
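One caveat with the sed command above: because the new directory name itself starts with /opt/kata, running it a second time would rewrite /opt/kata_3.8.0 into /opt/kata_3.8.0_3.8.0. Here is a sketch of a repeatable variant, assuming every reference in the text files has the form /opt/kata/... (with a trailing slash) and that GNU sed is available; rewrite_kata_prefix is a hypothetical helper name:

```shell
# Replace the "/opt/kata/" prefix with "<dir>/" in all text files under a tree.
# Matching the trailing slash keeps an already rewritten path such as
# /opt/kata_3.8.0/ from being matched again on a second run.
rewrite_kata_prefix() {
    rk_dir=$1            # versioned install directory, e.g. /opt/kata_3.8.0
    rk_tree=${2:-$1}     # tree to search (defaults to the install directory)
    find "$rk_tree" -type f -exec sh -c '
        prefix=$1; shift
        for f in "$@"; do
            # -I makes grep treat binary files as non-matching
            if grep -qI "/opt/kata/" "$f"; then
                sed -i "s|/opt/kata/|$prefix/|g" "$f"
            fi
        done
    ' sh "$rk_dir" {} +
}

# Example (as root): rewrite_kata_prefix "$DIR"
```

Because find is restricted to regular files, symlinks such as configuration.toml are left as they are and only the files they point to get edited.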
What remains is to create the wrapper scripts, first for containerd-shim-kata-v2:
cat <<EOF | sudo tee "/usr/bin/containerd-shim-kata_$(echo "$VERSION" | sed 's/\./_/g')-v2" > /dev/null
#!/bin/sh
export KATA_CONF_FILE="$DIR/share/defaults/kata-containers/configuration.toml"
export PATH="$DIR/bin:\$PATH"
exec $DIR/bin/containerd-shim-kata-v2 "\$@"
EOF
sudo chmod +x "/usr/bin/containerd-shim-kata_$(echo "$VERSION" | sed 's/\./_/g')-v2"
And for kata-runtime:
cat <<EOF | sudo tee "/usr/bin/kata-runtime-$VERSION" > /dev/null
#!/bin/sh
export KATA_CONF_FILE="$DIR/share/defaults/kata-containers/configuration.toml"
export PATH="$DIR/bin:\$PATH"
exec $DIR/bin/kata-runtime "\$@"
EOF
sudo chmod +x "/usr/bin/kata-runtime-$VERSION"
Check it out:
$ echo "io.containerd.run.kata_$(echo "$VERSION" | sed 's/\./_/g').v2"
io.containerd.run.kata_3_8_0.v2
$ sudo docker run --runtime io.containerd.run.kata_3_8_0.v2 busybox uname -a
Linux 88c1b982e983 6.1.62 #1 SMP Wed Jul 17 13:00:20 UTC 2024 x86_64 GNU/Linux
By default, the Kata runtime starts a virtual machine with 1 vCPU and 2 GB of RAM. To change this, pass docker run the --cpus option (-c didn't work for me; it is the short form of --cpu-shares, a different option) and -m or --memory. The values given are added to the defaults, i.e. --cpus 1 --memory 512m results in 2 vCPUs and 2.5 GB of memory. The host's memory is not reserved up front for these 2.5 GB; it is used as needed.
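The additive sizing can be written down as simple arithmetic. The defaults below are the 1 vCPU and 2 GB (2048 MiB) mentioned above; kata_vm_size is a hypothetical name and only whole CPUs are handled:

```shell
DEFAULT_VCPUS=1
DEFAULT_MEMORY_MIB=2048

# Estimate the VM size for given `docker run --cpus N --memory Mm` values.
kata_vm_size() {
    echo "$((DEFAULT_VCPUS + $1)) vCPU, $((DEFAULT_MEMORY_MIB + $2)) MiB"
}

kata_vm_size 1 512   # prints "2 vCPU, 2560 MiB" (2560 MiB = 2.5 GB)
```

To see what the guest actually received, running nproc and free -m inside a container started with those options should give a comparable picture.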
Everything else, like networking, should work by default. For example, we can get a response from nginx:
$ sudo docker run -d --name nginx nginx
8663b33cad7e5b820be85468aa760e6ea34fc870adf7d924922788266041c898
$ sudo docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' nginx
172.17.0.2
$ curl 172.17.0.2
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
Mounting also works fine:
$ sudo docker run --runtime io.containerd.run.kata_3_8_0.v2 -v /tmp:/tmp busybox ls /tmp
dbus-1b94OvKJfW
sddm-auth-ad8346f2-fbc8-4b74-9c03-e33d1136248e
systemd-private-a2ad86d46c374394957b41737f945998-ModemManager.service-tsH78Z
systemd-private-a2ad86d46c374394957b41737f945998-bluetooth.service-Bct8TG
systemd-private-a2ad86d46c374394957b41737f945998-dbus-broker.service-mpkAbN
systemd-private-a2ad86d46c374394957b41737f945998-fwupd.service-1Uiyrr
systemd-private-a2ad86d46c374394957b41737f945998-iio-sensor-proxy.service-1IKcZf
systemd-private-a2ad86d46c374394957b41737f945998-irqbalance.service-VvFZzU
systemd-private-a2ad86d46c374394957b41737f945998-polkit.service-58al8n
systemd-private-a2ad86d46c374394957b41737f945998-power-profiles-daemon.service-U13kul
systemd-private-a2ad86d46c374394957b41737f945998-systemd-logind.service-r2T6Xu
systemd-private-a2ad86d46c374394957b41737f945998-upower.service-aUQ6HX
tmp.CERJypVFNb
tmp.Lk54RYD5Tl
tmp.OWS9PKcNm8
tmp.cUYADP9m4f
tmp.hTZnG0skQ4
tmp.k3XnzrTg2k
tmp.mFpNhgIVQT
tmp.mvGoSlep7Z
tmp.nIDfmAGIo2
tmp.yQNEJyMj86
$ sudo docker run --runtime io.containerd.run.kata_3_8_0.v2 -v /tmp:/tmp busybox mount
none on / type virtiofs (rw,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/devices type cgroup (ro,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (ro,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (ro,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (ro,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (ro,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/freezer type cgroup (ro,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/pids type cgroup (ro,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/blkio type cgroup (ro,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/memory type cgroup (ro,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/cpuset type cgroup (ro,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
shm on /dev/shm type tmpfs (rw,relatime)
none on /tmp type virtiofs (rw,relatime)
kataShared on /etc/resolv.conf type virtiofs (rw,relatime)
kataShared on /etc/hostname type virtiofs (rw,relatime)
kataShared on /etc/hosts type virtiofs (rw,relatime)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755)
proc on /proc/bus type proc (ro,relatime)
proc on /proc/fs type proc (ro,relatime)
proc on /proc/irq type proc (ro,relatime)
proc on /proc/sys type proc (ro,relatime)