feat(python): Handler wrapper on the C ABI (RFC #43)#46
Conversation
congwang-mk
left a comment
There was a problem hiding this comment.
Nice work! The high-level looks pretty good, just some minor issues below.
| sb = sandlock.Sandbox(fs_readable=["/usr", "/etc", "/lib", "/lib64", "/bin"]) | ||
| sb.run_with_handlers( | ||
| cmd=["/usr/bin/cat", "/etc/hostname"], | ||
| handlers=[(257, AuditOpens())], # 257 = x86_64 SYS_openat |
There was a problem hiding this comment.
Could we avoid using raw syscall numbers? It is hard to find them out for different architecture. Maybe use string, like "openat" ?
There was a problem hiding this comment.
Done. run_with_handlers now takes a syscall name (str) — handlers=[("openat", AuditOpens())] — resolved for the host arch, or a raw int for syscalls the resolver doesn't cover (e.g. getpid). Added sandlock_syscall_nr(const char *name) to the C ABI, wrapping the core's syscall_name_to_nr, so C callers get the same. Docs example updated. 9908444.
| global _NEXT_ID | ||
| with _REGISTRY_LOCK: | ||
| hid = _NEXT_ID | ||
| _NEXT_ID += 1 |
There was a problem hiding this comment.
Is it an issue to always increase _NEXT_ID here?
There was a problem hiding this comment.
Not an issue in practice. _NEXT_ID is a Python int (no fixed width — never overflows), and the registry is swept empty after every run, so only the counter advances, not memory. The one hard ceiling is the C ABI's ud slot — a pointer-width c_void_p (2**64 on 64-bit hosts) — far beyond any realistic process lifetime. A fresh id per registration is also the simplest guarantee that concurrently-live handlers have distinct uds; recycling would add concurrency-sensitive bookkeeping for a non-issue. Added a comment spelling this out in 9908444.
|
Both review comments addressed in While testing I also fixed a pre-existing race in Local: 48 Rust tests + 35 Python handler tests pass. |
Per the maintainer's RFC multikernel#43 response on Q5 (let audit-only handlers opt down to Errno(EIO) or Continue), add a fourth exception policy discriminant for EIO. Python audit handlers idiomatically prefer EIO because it propagates as OSError rather than PermissionError, which is closer to what callers expect from a failed syscall. Reserves discriminant 3 (after KILL=0, DENY_EPERM=1, CONTINUE=2 to preserve ABI stability with the merged Handler C ABI).
Discriminated dataclass mirroring sandlock_action_out_t. Constructed via classmethod factories. Discriminant values match the C-side SANDLOCK_ACTION_* constants 1:1 so the trampoline (a later task) can translate directly.
Default exception policy is KILL (fail-closed) per RFC multikernel#43 Q5 = D. Subclasses can override via class attribute. ExceptionPolicy enum discriminants are stable across the C ABI (KILL=0, DENY_EPERM=1, CONTINUE=2, DENY_EIO=3).
Read-only snapshot of the seccomp notification plus an opaque mem handle for child-memory access. The read_cstr/read/write methods short-circuit to a falsy result when no mem handle is present (test contexts); the real accessors are deferred to the ctypes glue module.
ctypes bindings for the merged Handler C ABI. The trampoline dispatches via an integer-id registry lookup, relies on ctypes' implicit GIL acquisition for CFUNCTYPE callbacks, checks Py_IsInitialized() defensively, catches handler exceptions (routing to the configured on_exception policy), and translates NotifAction into setter calls. Adds Sandbox.run_with_handlers and exports Handler/NotifAction/ HandlerCtx/ExceptionPolicy from the package root. Includes the RFC multikernel#43 audit smoke test counting SYS_openat interceptions.
Documents the four cross-cutting concerns from RFC multikernel#43: GIL contention, interpreter finalization, native crashes, and Tokio runtime reentrancy. Plus ownership rules for Handler instances and injected file descriptors, and a minimal copy-pasteable example.
The trampoline handed the raw sandlock_mem_handle_t* to HandlerCtx. That C struct is a stack local in the supervisor's spawn_blocking closure — it is freed the instant the callback returns. A handler that stored its HandlerCtx and called read/read_cstr/write afterwards dereferenced a dangling pointer (use-after-free). Wrap the handle in a mutable _MemHandle cell that the trampoline invalidates in a finally block. A retained HandlerCtx now fails safe (accessors return None/False) instead of dereferencing freed memory.
T1: run_with_handlers' registration-loop rollback freed only the containers already stored in the regs array. A container created by sandlock_handler_new but not yet stored (e.g. int(syscall_nr) raised in between) leaked. Track the pending container in a local and free it in the rollback path. Also: document that handle() may run concurrently for the same instance and must not block; reject a negative srcfd in NotifAction.inject_fd_send before it reaches the C setter; bind the run name bytes to a local for lifetime clarity; pin Py_IsInitialized's restype explicitly.
The deep review found the trampoline's NotifAction kind-dispatch (errno / return_value / kill / inject_fd) had zero end-to-end test coverage — a trampoline reduced to "always Continue" passed the whole suite. Add real-child integration tests: - errno action: child observes EPERM from a denied openat - return_value action: child's getpid returns the synthetic 777 - kill action returned directly: child terminated (on_exception set to Continue to prove the kill came from the action, not a policy fallback) - mem read_cstr: handler decodes the real openat path from child memory and denies a specific file - run_with_handlers argument validation: empty list runs cleanly, a non-Handler entry raises All four behavioural tests were destructively verified — neutering the trampoline's kind-dispatch makes each one fail.
A mutation-based audit of the handler smoke suite found six tests that passed even when the production code they claimed to verify was broken: - exception KILL/CONTINUE policy tests asserted only the run's pass/fail, not that the exception path was exercised or that the child was killed AT the intercepted syscall. Now the child prints before/after markers and the raising handler records that it ran. - the two enum/C-header "match" tests asserted hardcoded numbers against hardcoded numbers. Now they parse the discriminants out of sandlock.h so an ABI drift is caught. - the RFC audit smoke test counted interpreter-startup openat noise. Now it opens a unique probe file and counts only that path. - the HandlerCtx field-exposure test exercised _for_test, not the trampoline's notification unpacking. Now a real run inspects the fields the trampoline built. Each rewrite was destructively verified — the corresponding production mutation makes the test fail.
run_with_handlers inserts each Handler into a process-global registry and relies on the C ABI's ud_drop to remove the entry on completion. sandlock_run_with_handlers is extern "C-unwind", so a panic (e.g. invoked from within an existing Tokio runtime) propagates as a Python exception before the supervisor fires ud_drop — orphaning every entry. Wrap the call in try/finally and sweep the registered ids unconditionally; _unregister_handler is idempotent, so the sweep is a no-op once ud_drop has already run. Also from a self-review pass: - mem_read now fails (returns None) on a null handle regardless of length, mirroring mem_write — a dead context yields no child-memory access. A zero-length read on a live handle stays the trivial b"". - Drop the test-only HandlerCtx._for_test classmethod; _mem_handle already defaults to None, so tests construct HandlerCtx directly. - Add two registry-hygiene tests: a completed run leaves the registry empty, and a mid-loop registration failure rolls back with no orphaned entry. The rollback test is destructively verified.
The handler smoke tests hardcoded x86_64 syscall numbers (openat=257, getpid=39). On aarch64 those are wrong: the handler registers on an unrelated syscall and never fires, and 39 lands in the aarch64 deny list, which fails the run outright. All nine kernel-dependent tests failed on the ubuntu-24.04-arm CI runner while passing on x86_64. The Rust side resolves these via libc::SYS_*; Python has no such table, so add a small arch->number map (x86_64, aarch64) and a _syscall_nr helper that skips on an unmapped arch rather than silently registering on the wrong syscall.
Review feedback on the handler-registration API: raw syscall numbers
are architecture-specific and hard to look up (openat is 257 on x86_64
but 56 on aarch64), a real footgun for the Python-facing API.
Add `sandlock_syscall_nr(const char *name) -> int64_t` to the C ABI, a
thin wrapper over the core's `syscall_name_to_nr`. It returns -1 for a
NULL/invalid/unknown name. The resolvable set is the syscalls sandlock
filters or supervises; syscalls outside it (e.g. getpid) return -1 and
must be registered by raw number.
`Sandbox.run_with_handlers` now accepts each handler key as a `str`
syscall name (resolved via the new function for the host arch) or an
`int` raw number. Resolution happens up front, before any native
policy is built or handler container allocated, so an unknown name
fails loudly and names the offending syscall. `_resolve_syscall`
rejects `bool` keys explicitly: `bool` subclasses `int`, so True/False
would otherwise resolve to syscalls 1/0 (write/read) — a silent wrong
registration.
Also documents why the trampoline's `_NEXT_ID` registration counter is
monotonic-forever (no practical ceiling; the registry is swept empty
after every run).
Tests: 3 Rust tests for `sandlock_syscall_nr` (known/unknown/NULL); the
Python handler tests register by name ("openat"); 6 new tests cover
int passthrough, name resolution, and unknown-name / bad-type / bool
rejection. The mid-loop rollback test now triggers via a non-Handler
entry, since syscall resolution moved ahead of the registration loop.
6991ab0 to
11d45df
Compare
|
Rebased onto current |
Summary
Python wrapper for the
HandlerC ABI merged in #44. Python users can now write seccomp-notif handlers asHandlersubclasses without touching ctypes directly.Per RFC #43 phasing, this is the second of three PRs:
C ABI surface— merged in feat(ffi): C ABI for the Handler trait (RFC #43) #44.Handlerbase class, registration viaSandbox.run_with_handlers, audit smoke test.What's in this PR
C ABI extension — one new exception-policy discriminant:
SANDLOCK_EXCEPTION_DENY_EIO = 3. Your RFC RFC: Python parity for the Handler trait (Follow-up B) #43 reply on Q5 suggested letting audit-only handlers opt down toErrno(EIO); feat(ffi): C ABI for the Handler trait (RFC #43) #44 shipped onlyDENY_EPERM. EIO is the idiomatic choice for Python audit handlers (propagates asOSErrorrather thanPermissionError). Discriminant appended afterCONTINUE=2— ABI-stable.Python module
sandlock.handler:NotifAction— frozen dataclass mirroringsandlock_action_out_t; factory classmethods (continue_,errno,return_value_,hold,kill,inject_fd_send).Handler— base class; subclass and overridehandle(ctx) -> NotifAction. Class attributeon_exceptiondefaults toKILL(fail-closed, per Q5 = D).HandlerCtx— read-only notification snapshot +read_cstr/read/writechild-memory accessors.ExceptionPolicy— IntEnum mirroringsandlock_exception_policy_t.ctypes glue (
_handler_ffi.py,_sdk.py):Handler.handle. Dispatch is via an integer-id registry lookup — no rawPyObject*crosses the FFI boundary.Py_IsInitialized()before touching Python state, catches handler exceptions (routing to the configuredon_exceptionpolicy), rejects non-NotifActionreturn values, and translates the action into setter calls onsandlock_action_out_t.HandlerCtxthat escapes itshandle()call fails safe —read/read_cstr/writereturnNone/False— rather than dereferencing the supervisor stack-local it pointed at.Sandbox.run_with_handlers(cmd, handlers, name=None)— registers(syscall_nr, Handler)pairs and runs, mirroring the existingSandbox.runmechanics.Cross-cutting concerns (RFC #43)
The RFC listed four cross-cutting items. How each is handled:
handle()dispatch holds the GIL; documented as a known limitation. Handlers should be fast and protect mutable state themselves.Py_IsInitialized()check returns an error ifPy_FinalizeExran mid-dispatch, routing the notification throughon_exception.handle()— not recoverable; documented as user responsibility.sandlock_run_with_handlersbuilds its own runtime; documented that it must not be called from within an existing Tokio runtime.All four are written up in
docs/extension-handlers.mdunder a new "Python wrapper" section, plus ownership rules forHandlerinstances and injected file descriptors.Ownership
Handlerinstances are held by the Sandbox for the run's duration; the C container'sud_dropreleases the Python reference on completion.sandlock_run_with_handlersreturns, all handler-container pointers are owned by the supervisor — the Python side does not free them. The mid-loop allocation-failure path frees only the containers created before the failure (registry-entry-removed-exactly-once invariant, test-pinned).finallyaftersandlock_run_with_handlersreturns: on the normal path the supervisor'sud_dropcalls have already emptied the slots, but the run entry point isextern "C-unwind", so a panic (e.g. invoked from within an existing Tokio runtime) propagates as a Python exception — the unconditional, idempotent sweep keeps a panic from orphaning entries.Test plan
cargo test -p sandlock-ffi— 45 tests pass (44 + 1 newDenyEiopolicy test), 0 ignored.pytest python/tests/test_handler_smoke.py— 29 tests pass:NotifActionfactories + frozen-ness,ExceptionPolicy/Handlerdefaults & override,HandlerCtxfield exposure + frozen-ness, ABI-discriminant pins, handler-registry hygiene, and the end-to-end integration tests below.HandlercountingSYS_openatinterceptions on a real child process.errno(child observesEPERMfrom a deniedopenat),return_value(child'sgetpidreturns the synthetic value),killreturned directly by a handler (child terminated;on_exceptionset to Continue so the termination provably comes from the action, not a policy fallback). Each was destructively verified — neutering the trampoline's action dispatch makes each test fail.on_exception=CONTINUElets the child complete; withon_exception=KILLit terminates the child.HandlerCtxmemory accessor — a handler decodes the realopenatpath from child memory viaread_cstrand denies a specific file.HandlerCtxretained past itshandle()call has its memory accessors fail safe; verified the liveness cell is invalidated on the trampoline'sfinallypath.run_with_handlersargument validation — empty handler list runs cleanly; a non-Handlerentry raises.syscall_nron the second entry) rolls back with no orphaned registry entry. The rollback test is destructively verified — neutering the rollbackexceptblock leaves an orphaned entry and the test fails.pytest python/tests/test_sandbox.py— no regression from the_sdk.pyadditions.DenyEiodiscriminant.Out of scope (PR 3)
ctx.read_path()).extension-handlers.md).Notes
test_sandbox.py::TestCpuThrottle::test_throttle_slows_execution, is flaky on loaded CI hosts (a CPU-time-ratio assertion). It fails identically onmainwithout this PR — unrelated to the Handler wrapper.DenyEioC-ABI discriminant into its own commit/PR if you'd prefer the Python wrapper PR to touch zero Rust.