feat(python): Handler wrapper on the C ABI (RFC #43) by dzerik · Pull Request #46 · multikernel/sandlock

dzerik · 2026-05-15T13:21:46Z

Summary

Python wrapper for the Handler C ABI merged in #44. Python users can now write seccomp-notif handlers as Handler subclasses without touching ctypes directly.

Per RFC #43 phasing, this is the second of three PRs:

1. ~~C ABI surface~~: merged in #44 (feat(ffi): C ABI for the Handler trait (RFC #43)).
2. Python wrapper layer (this PR): Handler base class, registration via Sandbox.run_with_handlers, audit smoke test.
3. Ergonomic layer: path-helper convenience methods, fixtures, full docs page. Deferred.

What's in this PR

C ABI extension — one new exception-policy discriminant:

SANDLOCK_EXCEPTION_DENY_EIO = 3. Your reply on Q5 of RFC #43 (RFC: Python parity for the Handler trait (Follow-up B)) suggested letting audit-only handlers opt down to Errno(EIO); #44 (feat(ffi): C ABI for the Handler trait (RFC #43)) shipped only DENY_EPERM. EIO is the idiomatic choice for Python audit handlers (it propagates as OSError rather than PermissionError). The discriminant is appended after CONTINUE=2, so the change is ABI-stable.
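The OSError versus PermissionError distinction comes from CPython's errno-to-subclass mapping rather than from anything sandlock does. A small standard-library illustration (no sandlock involved):

```python
import errno

# CPython picks an OSError subclass from the errno at construction time.
eperm = OSError(errno.EPERM, "Operation not permitted")
eio = OSError(errno.EIO, "Input/output error")

# EPERM maps to PermissionError; EIO has no dedicated subclass.
print(type(eperm).__name__)
print(type(eio).__name__)
```

Code in the sandboxed child that catches PermissionError specifically would therefore not swallow a failure caused by a crashing audit handler under DENY_EIO.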

Python module sandlock.handler:

NotifAction — frozen dataclass mirroring sandlock_action_out_t; factory classmethods (continue_, errno, return_value_, hold, kill, inject_fd_send).
Handler — base class; subclass and override handle(ctx) -> NotifAction. Class attribute on_exception defaults to KILL (fail-closed, per Q5 = D).
HandlerCtx — read-only notification snapshot + read_cstr / read / write child-memory accessors.
ExceptionPolicy — IntEnum mirroring sandlock_exception_policy_t.
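To make the shape of these four types concrete, here is a minimal self-contained sketch of the pattern. It is not the sandlock source: the `kind`/`value` fields and the `errno(code)` signature are assumptions for illustration, and only the enum discriminants are taken from this PR's commit messages.

```python
from dataclasses import dataclass
from enum import IntEnum


class ExceptionPolicy(IntEnum):
    # Discriminant values as stated in this PR.
    KILL = 0
    DENY_EPERM = 1
    CONTINUE = 2
    DENY_EIO = 3


@dataclass(frozen=True)
class NotifAction:
    kind: str       # hypothetical tag; the real class mirrors C discriminants
    value: int = 0  # hypothetical payload (e.g. an errno code)

    @classmethod
    def continue_(cls) -> "NotifAction":
        return cls("continue")

    @classmethod
    def errno(cls, code: int) -> "NotifAction":
        return cls("errno", code)


class Handler:
    on_exception = ExceptionPolicy.KILL  # fail-closed default

    def handle(self, ctx) -> NotifAction:
        raise NotImplementedError


class AuditOpens(Handler):
    # Audit-only handler: opt down from KILL so a handler bug
    # surfaces as EIO in the child instead of killing it.
    on_exception = ExceptionPolicy.DENY_EIO

    def __init__(self) -> None:
        self.count = 0

    def handle(self, ctx) -> NotifAction:
        self.count += 1
        return NotifAction.continue_()
```

Freezing the dataclass means an action cannot be mutated after the handler returns it, which keeps the translation into C setter calls a pure read.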

ctypes glue (_handler_ffi.py, _sdk.py):

A module-scope C-callable trampoline bridges the synchronous C callback to Handler.handle. Dispatch is via an integer-id registry lookup — no raw PyObject* crosses the FFI boundary.
The trampoline checks Py_IsInitialized() before touching Python state, catches handler exceptions (routing to the configured on_exception policy), rejects non-NotifAction return values, and translates the action into setter calls on sandlock_action_out_t.
The child-memory handle is wrapped in a liveness cell that the trampoline invalidates once the callback returns. A HandlerCtx that escapes its handle() call fails safe — read/read_cstr/write return None/False — rather than dereferencing the supervisor stack-local it pointed at.

Sandbox.run_with_handlers(cmd, handlers, name=None) — registers (syscall_nr, Handler) pairs and runs, mirroring the existing Sandbox.run mechanics.

Cross-cutting concerns (RFC #43)

The RFC listed four cross-cutting items. How each is handled:

GIL contention — each handle() dispatch holds the GIL; documented as a known limitation. Handlers should be fast and protect mutable state themselves.
Interpreter finalization — the trampoline's Py_IsInitialized() check returns an error if Py_FinalizeEx ran mid-dispatch, routing the notification through on_exception.
Native crashes inside handle() — not recoverable; documented as user responsibility.
Tokio runtime reentrancy — sandlock_run_with_handlers builds its own runtime; documented that it must not be called from within an existing Tokio runtime.

All four are written up in docs/extension-handlers.md under a new "Python wrapper" section, plus ownership rules for Handler instances and injected file descriptors.

Ownership

Handler instances are held by the Sandbox for the run's duration; the C container's ud_drop releases the Python reference on completion.
After sandlock_run_with_handlers returns, all handler-container pointers are owned by the supervisor — the Python side does not free them. The mid-loop allocation-failure path frees only the containers created before the failure (registry-entry-removed-exactly-once invariant, test-pinned).
The process-global handler registry is swept in a finally after sandlock_run_with_handlers returns: on the normal path the supervisor's ud_drop calls have already emptied the slots, but the run entry point is extern "C-unwind", so a panic (e.g. invoked from within an existing Tokio runtime) propagates as a Python exception — the unconditional, idempotent sweep keeps a panic from orphaning entries.

Test plan

Out of scope (PR 3)

Path-helper convenience methods (ctx.read_path()).
Preset/fixture handlers.
A dedicated docs page (this PR adds one section to extension-handlers.md).
Async handler wrappers — handlers stay synchronous per RFC Q1 = A.

Notes

One pre-existing test, test_sandbox.py::TestCpuThrottle::test_throttle_slows_execution, is flaky on loaded CI hosts (a CPU-time-ratio assertion). It fails identically on main without this PR — unrelated to the Handler wrapper.
Happy to split the DenyEio C-ABI discriminant into its own commit/PR if you'd prefer the Python wrapper PR to touch zero Rust.

congwang-mk

Nice work! At a high level this looks pretty good; just some minor issues below.

congwang-mk · 2026-05-18T03:08:34Z

+sb = sandlock.Sandbox(fs_readable=["/usr", "/etc", "/lib", "/lib64", "/bin"])
+sb.run_with_handlers(
+    cmd=["/usr/bin/cat", "/etc/hostname"],
+    handlers=[(257, AuditOpens())],   # 257 = x86_64 SYS_openat


Could we avoid using raw syscall numbers? They are hard to look up for different architectures. Maybe use a string, like "openat"?

Done. run_with_handlers now takes a syscall name (str) — handlers=[("openat", AuditOpens())] — resolved for the host arch, or a raw int for syscalls the resolver doesn't cover (e.g. getpid). Added sandlock_syscall_nr(const char *name) to the C ABI, wrapping the core's syscall_name_to_nr, so C callers get the same. Docs example updated. 9908444.

congwang-mk · 2026-05-18T03:09:30Z

+    global _NEXT_ID
+    with _REGISTRY_LOCK:
+        hid = _NEXT_ID
+        _NEXT_ID += 1


Is it an issue to always increase _NEXT_ID here?

Not an issue in practice. _NEXT_ID is a Python int (no fixed width — never overflows), and the registry is swept empty after every run, so only the counter advances, not memory. The one hard ceiling is the C ABI's ud slot — a pointer-width c_void_p (2**64 on 64-bit hosts) — far beyond any realistic process lifetime. A fresh id per registration is also the simplest guarantee that concurrently-live handlers have distinct uds; recycling would add concurrency-sensitive bookkeeping for a non-issue. Added a comment spelling this out in 9908444.

dzerik · 2026-05-18T14:04:05Z

Both review comments addressed in 9908444.

While testing I also fixed a pre-existing race in a1_ffi_handler_drains_inject_fd_on_panic (6991ab0) — it did a single non-blocking read for EOF, which races peer tests that fork() children holding a transient inherited duplicate of the pipe write end (O_CLOEXEC closes it only at exec). It now polls for POLLHUP with a deadline, so the assertion is about the drain rather than test scheduling. Unrelated to the Handler wrapper; happy to split it into its own PR if you'd prefer.
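The fix itself lives in a Rust test; the same technique can be illustrated in Python with the standard library. This is a sketch of "poll for POLLHUP with a deadline" on Linux, with `wait_for_hangup` as a hypothetical helper name.

```python
import os
import select
import time


def wait_for_hangup(fd: int, timeout_s: float) -> bool:
    """Return True once every write end of the pipe behind fd is closed."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)  # POLLHUP is reported regardless
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        for _, events in poller.poll(max(1, int(remaining * 1000))):
            if events & select.POLLHUP:
                return True
            if events & select.POLLIN:
                os.read(fd, 4096)  # drain pending data and keep waiting


# A lingering duplicate of the write end delays the hangup.
r, w = os.pipe()
dup = os.dup(w)       # plays the transient inherited duplicate
os.close(w)
early = wait_for_hangup(r, 0.05)   # duplicate still open
os.close(dup)
late = wait_for_hangup(r, 1.0)     # all writers gone
os.close(r)
```

A single non-blocking read taken at the `early` point would see "no EOF yet" and fail, even though the drain itself is correct.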

Local: 48 Rust tests + 35 Python handler tests pass.

Per the maintainer's RFC multikernel#43 response on Q5 (let audit-only handlers opt down to Errno(EIO) or Continue), add a fourth exception policy discriminant for EIO. Python audit handlers idiomatically prefer EIO because it propagates as OSError rather than PermissionError, which is closer to what callers expect from a failed syscall. Reserves discriminant 3 (after KILL=0, DENY_EPERM=1, CONTINUE=2) to preserve ABI stability with the merged Handler C ABI.

Discriminated dataclass mirroring sandlock_action_out_t. Constructed via classmethod factories. Discriminant values match the C-side SANDLOCK_ACTION_* constants 1:1 so the trampoline (a later task) can translate directly.

Default exception policy is KILL (fail-closed) per RFC multikernel#43 Q5 = D. Subclasses can override via class attribute. ExceptionPolicy enum discriminants are stable across the C ABI (KILL=0, DENY_EPERM=1, CONTINUE=2, DENY_EIO=3).

Read-only snapshot of the seccomp notification plus an opaque mem handle for child-memory access. The read_cstr/read/write methods short-circuit to a falsy result when no mem handle is present (test contexts); the real accessors are deferred to the ctypes glue module.

ctypes bindings for the merged Handler C ABI. The trampoline dispatches via an integer-id registry lookup, relies on ctypes' implicit GIL acquisition for CFUNCTYPE callbacks, checks Py_IsInitialized() defensively, catches handler exceptions (routing to the configured on_exception policy), and translates NotifAction into setter calls. Adds Sandbox.run_with_handlers and exports Handler/NotifAction/HandlerCtx/ExceptionPolicy from the package root. Includes the RFC multikernel#43 audit smoke test counting SYS_openat interceptions.

Documents the four cross-cutting concerns from RFC multikernel#43: GIL contention, interpreter finalization, native crashes, and Tokio runtime reentrancy. Plus ownership rules for Handler instances and injected file descriptors, and a minimal copy-pasteable example.

The trampoline handed the raw sandlock_mem_handle_t* to HandlerCtx. That C struct is a stack local in the supervisor's spawn_blocking closure — it is freed the instant the callback returns. A handler that stored its HandlerCtx and called read/read_cstr/write afterwards dereferenced a dangling pointer (use-after-free). Wrap the handle in a mutable _MemHandle cell that the trampoline invalidates in a finally block. A retained HandlerCtx now fails safe (accessors return None/False) instead of dereferencing freed memory.
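The liveness-cell idea can be shown without any FFI. In this sketch the "raw handle" is just a Python callable standing in for the C accessor, and `dispatch` is a hypothetical stand-in for the trampoline body.

```python
class _MemHandle:
    """Mutable cell around a raw handle; raw becomes None once invalidated."""

    def __init__(self, raw):
        self.raw = raw

    def invalidate(self):
        self.raw = None


class HandlerCtx:
    def __init__(self, mem=None):
        self._mem = mem

    def read(self, addr, length):
        if self._mem is None or self._mem.raw is None:
            return None  # absent or dead handle: fail safe
        return self._mem.raw(addr, length)  # stand-in for the C accessor


def dispatch(handle_fn, raw_reader):
    """Run one callback; the cell dies when the callback returns or raises."""
    cell = _MemHandle(raw_reader)
    try:
        return handle_fn(HandlerCtx(cell))
    finally:
        cell.invalidate()


escaped = []


def leaky_handler(ctx):
    escaped.append(ctx)        # the ctx outlives handle()
    return ctx.read(0, 4)      # valid while the callback is running


inside = dispatch(leaky_handler, lambda addr, n: b"\x00" * n)
outside = escaped[0].read(0, 4)  # after the callback returned
```

Here `inside` holds four bytes and `outside` is None: the retained context still exists, but the cell it points at no longer leads anywhere.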

T1: run_with_handlers' registration-loop rollback freed only the containers already stored in the regs array. A container created by sandlock_handler_new but not yet stored (e.g. int(syscall_nr) raised in between) leaked. Track the pending container in a local and free it in the rollback path. Also: document that handle() may run concurrently for the same instance and must not block; reject a negative srcfd in NotifAction.inject_fd_send before it reaches the C setter; bind the run name bytes to a local for lifetime clarity; pin Py_IsInitialized's restype explicitly.

The deep review found the trampoline's NotifAction kind-dispatch (errno / return_value / kill / inject_fd) had zero end-to-end test coverage: a trampoline reduced to "always Continue" passed the whole suite. Add real-child integration tests:
- errno action: child observes EPERM from a denied openat
- return_value action: child's getpid returns the synthetic 777
- kill action returned directly: child terminated (on_exception set to Continue to prove the kill came from the action, not a policy fallback)
- mem read_cstr: handler decodes the real openat path from child memory and denies a specific file
- run_with_handlers argument validation: empty list runs cleanly, a non-Handler entry raises

All four behavioural tests were destructively verified: neutering the trampoline's kind-dispatch makes each one fail.

A mutation-based audit of the handler smoke suite found six tests that passed even when the production code they claimed to verify was broken:
- exception KILL/CONTINUE policy tests asserted only the run's pass/fail, not that the exception path was exercised or that the child was killed AT the intercepted syscall. Now the child prints before/after markers and the raising handler records that it ran.
- the two enum/C-header "match" tests asserted hardcoded numbers against hardcoded numbers. Now they parse the discriminants out of sandlock.h so an ABI drift is caught.
- the RFC audit smoke test counted interpreter-startup openat noise. Now it opens a unique probe file and counts only that path.
- the HandlerCtx field-exposure test exercised _for_test, not the trampoline's notification unpacking. Now a real run inspects the fields the trampoline built.

Each rewrite was destructively verified: the corresponding production mutation makes the test fail.

run_with_handlers inserts each Handler into a process-global registry and relies on the C ABI's ud_drop to remove the entry on completion. sandlock_run_with_handlers is extern "C-unwind", so a panic (e.g. invoked from within an existing Tokio runtime) propagates as a Python exception before the supervisor fires ud_drop, orphaning every entry. Wrap the call in try/finally and sweep the registered ids unconditionally; _unregister_handler is idempotent, so the sweep is a no-op once ud_drop has already run.

Also from a self-review pass:
- mem_read now fails (returns None) on a null handle regardless of length, mirroring mem_write: a dead context yields no child-memory access. A zero-length read on a live handle stays the trivial b"".
- Drop the test-only HandlerCtx._for_test classmethod; _mem_handle already defaults to None, so tests construct HandlerCtx directly.
- Add two registry-hygiene tests: a completed run leaves the registry empty, and a mid-loop registration failure rolls back with no orphaned entry. The rollback test is destructively verified.

The handler smoke tests hardcoded x86_64 syscall numbers (openat=257, getpid=39). On aarch64 those are wrong: the handler registers on an unrelated syscall and never fires, and 39 lands in the aarch64 deny list, which fails the run outright. All nine kernel-dependent tests failed on the ubuntu-24.04-arm CI runner while passing on x86_64. The Rust side resolves these via libc::SYS_*; Python has no such table, so add a small arch->number map (x86_64, aarch64) and a _syscall_nr helper that skips on an unmapped arch rather than silently registering on the wrong syscall.

Review feedback on the handler-registration API: raw syscall numbers are architecture-specific and hard to look up (openat is 257 on x86_64 but 56 on aarch64), a real footgun for the Python-facing API.

Add `sandlock_syscall_nr(const char *name) -> int64_t` to the C ABI, a thin wrapper over the core's `syscall_name_to_nr`. It returns -1 for a NULL/invalid/unknown name. The resolvable set is the syscalls sandlock filters or supervises; syscalls outside it (e.g. getpid) return -1 and must be registered by raw number.

`Sandbox.run_with_handlers` now accepts each handler key as a `str` syscall name (resolved via the new function for the host arch) or an `int` raw number. Resolution happens up front, before any native policy is built or handler container allocated, so an unknown name fails loudly and names the offending syscall. `_resolve_syscall` rejects `bool` keys explicitly: `bool` subclasses `int`, so True/False would otherwise resolve to syscalls 1/0 (write/read), a silent wrong registration.

Also documents why the trampoline's `_NEXT_ID` registration counter is monotonic-forever (no practical ceiling; the registry is swept empty after every run).

Tests: 3 Rust tests for `sandlock_syscall_nr` (known/unknown/NULL); the Python handler tests register by name ("openat"); 6 new tests cover int passthrough, name resolution, and unknown-name / bad-type / bool rejection. The mid-loop rollback test now triggers via a non-Handler entry, since syscall resolution moved ahead of the registration loop.

dzerik · 2026-05-18T15:33:34Z

Rebased onto current main. Dropped my a1_ffi_handler_drains_inject_fd_on_panic race fix — your b1a30cd landed the same poll(2)-for-EOF fix, so it is redundant. The PR is now just the syscall-name resolution and the Python wrapper. 13 commits, CI green.

congwang-mk reviewed May 18, 2026


dzerik added 13 commits May 18, 2026 18:05

python: introduce NotifAction value-object

0f39713

Discriminated dataclass mirroring sandlock_action_out_t. Constructed via classmethod factories. Discriminant values match the C-side SANDLOCK_ACTION_* constants 1:1 so the trampoline (a later task) can translate directly.

dzerik force-pushed the follow-up-b-python-wrapper branch from 6991ab0 to 11d45df · May 18, 2026 15:12

congwang-mk merged commit 2b1b9c7 into multikernel:main May 20, 2026
8 checks passed

This was referenced May 21, 2026

feat(python): ergonomic layer — read_path + 4 presets + dedicated docs page #54

Open

Support for landlock ABI v5? #17

Open

congwang-mk merged 13 commits into multikernel:main from dzerik:follow-up-b-python-wrapper
