init.d/cgroups: move cgroup2 root processes into rc.init #1014

Open
pva wants to merge 3 commits into OpenRC:master from pva:fix-cgroup2-init-procs

pva commented May 29, 2026

Before OpenRC enables cgroup v2 subtree controllers, move all PIDs out of the cgroup v2 root into a dedicated rc.init cgroup.

This is needed because a delegated cgroup v2 hierarchy, such as the one seen inside an Incus/LXC container, is still subject to the no-internal-processes rule. If PID 1 or other early processes remain in the cgroup namespace root, writes to cgroup.subtree_control fail with EBUSY, and child cgroups do not receive delegated controllers. This breaks resource accounting for nested runtimes such as Docker/containerd.

The patch creates rc.init under the cgroup v2 mount point, moves all currently listed root cgroup.procs entries there, and only then enables subtree controllers.
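The order of operations described above can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the function names and the `root` argument (the cgroup v2 mount point, for example /sys/fs/cgroup) are assumptions, and kernel threads are not handled here.

```shell
# Hypothetical helpers illustrating the sequence; not the patch itself.
# $1 is assumed to be the cgroup v2 mount point.

cgroup2_move_root_procs() {
	root=$1
	mkdir -p "$root/rc.init" || return 1
	# cgroup.procs takes one PID per write, so migrate them one at a time.
	while read -r pid; do
		# Simplification for this sketch: ignore failures, e.g. for a
		# process that exited between the read and the write.
		printf '%s\n' "$pid" 2>/dev/null > "$root/rc.init/cgroup.procs" || :
	done < "$root/cgroup.procs"
}

cgroup2_enable_controllers() {
	root=$1
	[ -r "$root/cgroup.controllers" ] || return 1
	controllers=
	read -r controllers < "$root/cgroup.controllers" || :
	enable=
	# Turn "cpu memory io" into "+cpu +memory +io".
	for c in $controllers; do
		enable="${enable:+$enable }+$c"
	done
	[ -n "$enable" ] || return 0
	printf '%s\n' "$enable" > "$root/cgroup.subtree_control"
}
```

Calling `cgroup2_move_root_procs` before `cgroup2_enable_controllers` is the point of the change: once the root holds no member processes, the write to cgroup.subtree_control no longer trips over the no-internal-processes rule.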

This is done as part of the normal cgroup v2 setup rather than as a container-specific workaround because OpenRC cannot reliably distinguish a delegated container cgroup namespace root from the real host root. It also matches the general model used by systemd's init.scope: init and early processes should not live in the cgroup used as a parent for delegated children.

The fix is applied to both unified and hybrid cgroup modes. It only affects the cgroup v2 hierarchy.

Fixes #1013

Testing:

  • The patch itself has not yet been tested in the affected Incus setup.
  • Performing the same steps manually as a workaround fixed the issue there.

Use cgroup_path consistently for cgroup2_find_path() results, and replace
generic controller loop variables with more descriptive names.

No functional change.

pva commented May 29, 2026

I have run some tests, and the patch needs more work. I'll leave it up for now since it illustrates the idea, and I'll push an updated fix later.

When OpenRC runs in a delegated cgroup v2 hierarchy, processes can remain
in the cgroup namespace root. This prevents enabling domain controllers in
cgroup.subtree_control due to the "no internal processes" rule (man 7
cgroups).

Create an rc.init cgroup and move existing root cgroup processes there before
enabling subtree controllers in unified and hybrid mode. Do this for cgroup v2
directly instead of trying to detect containers, since OpenRC cannot reliably
distinguish a delegated namespace root from the real cgroup root.

Use /rc.init as the marker for initialized OpenRC cgroup v2 setup. Do not create
per-service cgroups before rc.init exists, and move transient OpenRC service
runner processes back to rc.init when removing temporary service cgroups, since
the cgroup v2 root becomes parent-only after subtree controllers are enabled.

Fixes OpenRC#1013.
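
A rough sketch of the rc.init marker check described in this commit message, under stated assumptions: the helper names, the `root` argument (the cgroup v2 mount point) and the service cgroup name passed by the caller are hypothetical and not taken from the patch.

```shell
# Hypothetical helpers; names and arguments are assumptions for illustration.

# Succeeds once the rc.init cgroup exists under the cgroup v2 mount point $1.
cgroup2_is_initialized() {
	[ -d "$1/rc.init" ]
}

# Creates the per-service cgroup $2 under $1, but only after rc.init exists.
cgroup2_create_service_cgroup() {
	root=$1
	name=$2
	cgroup2_is_initialized "$root" || return 1
	mkdir -p "$root/$name"
}
```

Refusing to create service cgroups before the marker exists keeps early services from racing the migration of root processes.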
pva force-pushed the fix-cgroup2-init-procs branch from 3417eb5 to e6aa66e on May 29, 2026 20:09

pva commented May 29, 2026

I've now checked: inside a container everything works correctly. I'll test on a host later.

pva commented May 30, 2026

Well, now I have tested this both in an Incus container and on a real host boot.

Testing on a real host exposed one more issue: the cgroup v2 root contains kernel threads. Those cannot be moved into rc.init, and the script must not try, since writing a kernel thread PID to a child cgroup's cgroup.procs fails with EINVAL:

/etc/init.d/cgroups: line 96: printf: write error: Invalid argument

I fixed that in the latest commit by skipping kernel threads before migration.

For the kernel-thread detection I followed the same approach systemd uses: read /proc/$pid/stat, take the flags field, and check PF_KTHREAD.
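
For illustration, a check along those lines might look like the sketch below. It is not the code from the commit: the function names are hypothetical, and it assumes the flags value is field 9 of /proc/pid/stat, as documented in proc(5), with PF_KTHREAD equal to 0x00200000 as in the kernel's include/linux/sched.h.

```shell
# Hypothetical helpers; a sketch of the PF_KTHREAD check, not the patch.
PF_KTHREAD=0x00200000

# Succeeds if the given /proc/<pid>/stat line describes a kernel thread.
stat_line_is_kthread() {
	# comm (field 2) may contain spaces and ")", so strip everything
	# through the LAST ") " before splitting into fields.
	rest=${1##*) }
	# Remaining fields: state ppid pgrp session tty_nr tpgid flags ...
	set -- $rest
	[ $# -ge 7 ] || return 1
	[ $(( $7 & $PF_KTHREAD )) -ne 0 ]
}

# Succeeds if PID $1 is a kernel thread; fails if it is not or has exited.
pid_is_kthread() {
	line=
	read -r line 2>/dev/null < "/proc/$1/stat" || return 1
	stat_line_is_kthread "$line"
}
```

A migration loop could then call `pid_is_kthread "$pid" && continue` before writing the PID to rc.init/cgroup.procs.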

Now I think this patchset is complete.

Kernel threads can appear in the cgroup v2 root on a real host, but they
cannot be moved into a child cgroup. Skip such pids before writing to
rc.init/cgroup.procs to avoid EINVAL during boot.

Detect kernel threads from /proc/pid/stat flags using PF_KTHREAD, matching
the approach used by systemd.
pva force-pushed the fix-cgroup2-init-procs branch from ef8417e to 751e27b on May 30, 2026 18:45

Development

Successfully merging this pull request may close these issues.

OpenRC breaks cgroups resource accounting with cgroups v2 for an Incus/LXC container
