interface/jail: don't hijack a host interface when the jailed device is missing #83

Closed
dangowrt wants to merge 2 commits into openwrt:master from dangowrt:jail-fixes

@dangowrt dangowrt commented Jun 1, 2026

Background

When an interface carries option jail '<name>', host netifd moves that
interface's device into the container's network namespace when the
container is created. procd's ujail triggers this through the
network.netns_updown ubus call, which lands in interface_start_jail()
-> system_link_netns_move().

This is the mechanism behind the bug reported in
openwrt/procd#38, where bringing up a netifd-enabled (uxc) container
either left the container with no network device (the in-jail udhcpc
logged udhcpc: SIOCGIFINDEX: No such device /
can't bind to interface <dev>: No such device) or, in the worst case,
pulled the host's own eth0 into the container and knocked the host off
the network.

What was broken

system_link_netns_move() built its RTM_NEWLINK message from whatever
system_if_resolve() returned for the source device:

index = system_if_resolve(dev);
msg = __system_ifinfo_msg(AF_UNSPEC, index, target_ifname, RTM_NEWLINK, 0);
nla_put_u32(msg, IFLA_NET_NS_FD, netns_fd);

A jailed interface is forced to autostart = false (see
interface_alloc() / interface_change_config()), so host netifd never
brings it up and never instantiates its device on its own. A veth declared
in a config device section is only created once its primary end is actually
claimed by something (e.g. a bridge port). If the operator's config never
causes the veth to be created on the host before the container starts, the
jailed interface's device simply does not exist at interface_start_jail()
time, and system_if_resolve() returns 0.
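
To make the "returns 0" case concrete, here is a minimal sketch built on the POSIX if_nametoindex() call. The helper name resolve_ifindex is hypothetical and this is not netifd's system_if_resolve(); it only shows the same convention, where 0 means "no such device" and must not be passed on as a real ifindex.

```c
#include <net/if.h>
#include <stdbool.h>

/*
 * Illustrative helper (hypothetical name, not netifd code): returns the
 * ifindex of the named interface, or 0 when no such interface exists.
 */
static unsigned int resolve_ifindex(const char *name)
{
    return if_nametoindex(name);
}

/* A valid ifindex is always positive, so 0 is the "unresolved" marker. */
static bool device_resolved(const char *name)
{
    return resolve_ifindex(name) != 0;
}
```

A caller that forwards the result without checking device_resolved() first ends up in the situation described next.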

With index == 0, the message still carries IFLA_IFNAME = target_ifname
(the jail_device name). The kernel's __rtnl_newlink()
selects the target device by name when ifi_index is 0
(net/core/rtnetlink.c: rtnl_dev_get() -> __dev_get_by_name()), so
the move operates on whatever host device happens to be named like the
jail_device. With a jail_device of eth0 this moves the host's real
eth0 into the container's netns. That is exactly the failure mode in
openwrt/procd#38: the container is created, the host loses its uplink,
and ssh to the host dies while the box itself stays up.
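
The selection rule can be illustrated with a toy model. This is a sketch of the behaviour as described above, not kernel code: struct toy_dev, toy_select() and the device table are all hypothetical names made up for the illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a net device; not a real kernel type. */
struct toy_dev {
    int ifindex;
    const char *name;
};

/*
 * Simplified model of the rule described above: a positive index selects
 * the device; with index 0 the lookup falls back to the name attribute.
 */
static const struct toy_dev *toy_select(const struct toy_dev *devs, size_t n,
                                        int ifi_index, const char *ifla_ifname)
{
    for (size_t i = 0; i < n; i++) {
        if (ifi_index > 0) {
            if (devs[i].ifindex == ifi_index)
                return &devs[i];
        } else if (ifla_ifname && strcmp(devs[i].name, ifla_ifname) == 0) {
            return &devs[i];
        }
    }
    return NULL;
}

/* Host devices in the toy scenario: the jailed veth does not exist yet. */
static const struct toy_dev toy_host_devs[] = {
    { 1, "lo" },
    { 2, "eth0" },
    { 3, "br-lan" },
};
```

In this model, toy_select(toy_host_devs, 3, 0, "eth0") picks the host's eth0 even though the caller meant "rename the jailed device to eth0", which mirrors the hijack.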

How it is fixed

Two commits:

  1. system-linux: refuse netns move of an unresolved device.
    Bail out of system_link_netns_move() when the source device cannot
    be resolved to a real ifindex, so a request is never sent with
    ifi_index == 0 and a name-based selector. This removes the
    host-interface-hijack entirely: a missing jailed device can no longer
    cause an unrelated host device to be moved.

  2. interface: defer the jailed device move until the device exists.
    When the move cannot be performed yet, keep a duplicated reference to
    the jail netns on the interface (netns_fd + jail_pending) and
    complete the move from interface_main_dev_cb() once the device shows
    up (DEV_EVENT_ADD). interface_stop_jail() and interface_free()
    drop any still-pending reference. The pending state survives a
    network reload because interface_update() -> interface_change_config()
    keeps the existing interface object. This turns the previously-required
    manual workaround (ubus call network.interface.<jail-iface> up before
    container creation, mentioned in the ticket) into automatic behaviour:
    as soon as the device is created the container receives it.
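
The two commits together can be sketched as a small state machine. This is a simplified model under stated assumptions, not the patch itself: only the field names netns_fd and jail_pending come from the description above, while struct toy_iface, the toy_* functions and the moved flag are hypothetical. Closing the duplicated fd right after a completed move is also an assumption of the sketch.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical, trimmed-down interface state for illustration only. */
struct toy_iface {
    int ifindex;       /* 0 while the device does not exist */
    int netns_fd;      /* duplicated jail netns fd, or -1 */
    bool jail_pending; /* move deferred until the device appears */
    bool moved;        /* set once the (simulated) move happened */
};

/* Commit 1 in miniature: refuse to act on an unresolved (zero) ifindex. */
static int toy_netns_move(struct toy_iface *ifc, int netns_fd)
{
    (void)netns_fd; /* a real move would pass this fd to the kernel */
    if (ifc->ifindex <= 0)
        return -1;
    ifc->moved = true;
    return 0;
}

/* Drop a still-pending reference (jail stop or interface teardown). */
static void toy_drop_pending(struct toy_iface *ifc)
{
    if (ifc->netns_fd >= 0)
        close(ifc->netns_fd);
    ifc->netns_fd = -1;
    ifc->jail_pending = false;
}

/* Jail start: move now if possible, otherwise keep our own fd for later. */
static void toy_start_jail(struct toy_iface *ifc, int netns_fd)
{
    if (toy_netns_move(ifc, netns_fd) == 0)
        return;
    ifc->netns_fd = dup(netns_fd);
    ifc->jail_pending = ifc->netns_fd >= 0;
}

/* Device-add event: complete a deferred move, then release the fd. */
static void toy_device_added(struct toy_iface *ifc, int new_ifindex)
{
    ifc->ifindex = new_ifindex;
    if (!ifc->jail_pending)
        return;
    toy_netns_move(ifc, ifc->netns_fd);
    toy_drop_pending(ifc);
}
```

Duplicating the fd means the pending state does not depend on the caller keeping its own descriptor open, which is why the stop and free paths have to release it explicitly.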

Testing

Reproduced openwrt/procd#38 on an x86_64 VM with a uxc/netifd container
(org.openwrt.procd.netifd annotation), a veth pair, and a jailed
interface with jail_device 'eth0' whose veth was deliberately not
pre-created:

  • Before: uxc create moved the host's eth0 into the container; the
    host lost networking.
  • After commit 1: the host's eth0 is untouched (verified by MAC across
    the create); the container's netns has only lo.
  • After commit 2: once the veth is created (host end claimed by br-lan),
    the deferred move fires automatically, the veth end lands in the
    container as eth0, the in-jail netifd completes DHCP, and the host's
    eth0 remains in place.

A correctly-configured setup (veth host end as a br-lan port so the
pair is created at boot) works on unpatched netifd too; these changes
make the misconfigured / device-not-yet-present cases fail safe instead
of hijacking a host interface.

dangowrt added 2 commits June 1, 2026 14:47

system-linux: refuse netns move of an unresolved device

system_link_netns_move() built its RTM_NEWLINK message with whatever
system_if_resolve() returned for the source device. When the device does
not exist (e.g. a jailed interface whose veth has not been created yet),
system_if_resolve() returns 0, and the message goes out with
ifi_index = 0 while IFLA_IFNAME is set to target_ifname (the jail_device
name).

The kernel's __rtnl_newlink() then selects the target device by name when
ifi_index is 0 (net/core/rtnetlink.c: rtnl_dev_get -> __dev_get_by_name),
so the move operates on whatever host device happens to be named like the
jail_device. With a jail_device of "eth0" this moves the host's real eth0
into the container's network namespace, knocking the host off the
network.

Bail out when the source device cannot be resolved to a real ifindex so
the request is never sent with a zero index.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>

interface: defer the jailed device move until the device exists

interface_start_jail() moved each jailed interface's main device into the
container netns at the moment the jail's netns was registered. If the
device did not exist yet (e.g. a veth whose creation had not been
triggered), the move was a no-op and the container came up with no
network device; the in-jail netifd then failed DHCP with
"udhcpc: SIOCGIFINDEX: No such device" until the operator manually
brought the host interface up.

Keep a duplicated reference to the jail netns on the interface when the
move cannot be performed yet, and complete it from interface_main_dev_cb
once the device appears (DEV_EVENT_ADD). interface_stop_jail and
interface_free drop any still-pending reference.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
@dangowrt

Closing in favor of #85

@dangowrt dangowrt closed this Jun 11, 2026