sandlock_spawn fails with ENOSYS (clone3) when called from a multi-threaded Python process (uvicorn/asyncio + Kubernetes RuntimeDefault seccomp)

## Summary

Calling `Sandbox(policy).run(...)` from a uvicorn server process returns `exit_code=-1, error="sandlock_spawn failed"` every time. The identical call succeeds from a fresh single-threaded Python process in the same container.

## Context

I was setting up sandlock as the execution backend for an MCP tool server — following the recommendation in [lobehub/lobehub#12472](https://github.com/lobehub/lobehub/issues/12472) to use sandlock as a self-hosted alternative to LobeHub's cloud sandbox. Because LobeHub requires Streamable HTTP MCP transport (not SSE), I wrote a thin FastMCP wrapper around `Sandbox.run()`.

The server runs as a sidecar container in a Kubernetes k3s pod.

## Environment

- Python 3.12, sandlock 0.7.0 (pip)
- uvicorn + FastMCP (Streamable HTTP transport)
- Kubernetes k3s, kernel 6.18.18, Landlock ABI v7
- Pod seccomp: `RuntimeDefault` (Kubernetes PSS `restricted`)
- Container: UID 1000, `readOnlyRootFilesystem: true`, `allowPrivilegeEscalation: false`, `capabilities: drop ALL`

## Reproduction

Any FastMCP/uvicorn server that calls `Sandbox(policy).run()` from its request handler:

```python
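import asyncio
import pathlib
from sandlock import Policy, Sandbox

@mcp.tool()  # `mcp` is the server's FastMCP instance
async def execute_python(code: str) -> str:
    ws = pathlib.Path("/tmp/sessions/default")
    policy = Policy(fs_readable=["/usr", "/lib", "/etc"], fs_writable=[str(ws)], ...)
    # Sandbox.run() blocks, so it is handed to the default thread-pool executor.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, lambda: Sandbox(policy).run(["python3", "-c", code]))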
```

Result: `Result(success=False, exit_code=-1, error='sandlock_spawn failed')`

## Diagnosis

I am not a kernel developer or Python internals expert — I figured this out in collaboration with Claude Sonnet 4.6, so please correct any mistakes in the analysis.

A diagnostic endpoint injected into the running server process revealed:

```json
{
  "pid": 1,
  "active_threads": 2,
  "fork": "ok",
  "clone3": "ret=-1 errno=38 (Function not implemented)",
  "new_thread": "ok",
  "minimal_policy": {"ok": false, "error": "sandlock_spawn failed"}
}
```

Key observations:
- `fork()` works fine from the server process
- `clone3` returns `ENOSYS` — it is blocked by Kubernetes' `RuntimeDefault` seccomp profile
- Python's `threading.Thread` still works because glibc falls back from `clone3` to `clone`
- `sandlock_spawn` fails even with the most minimal policy
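
For anyone who wants to check the same things in their own pod, here is a rough sketch of the kind of probe that produces the `clone3` and `new_thread` fields above. It is not the exact endpoint I used. The function names are made up for illustration, and the syscall number 435 is an assumption that holds for x86_64 and aarch64. Calling `clone3` with a NULL pointer and size 0 cannot create a process: a kernel that lets the call through rejects it with `EINVAL`, while a seccomp profile that blocks it typically yields `ENOSYS`.

```python
import ctypes
import errno
import os
import sys
import threading

SYS_CLONE3 = 435  # assumption: clone3 syscall number on x86_64 and aarch64


def probe_clone3() -> str:
    """Issue clone3(NULL, 0) and classify errno; this call can never create a process."""
    if not sys.platform.startswith("linux"):
        return "skipped (not Linux)"
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ret = libc.syscall(
        ctypes.c_long(SYS_CLONE3), ctypes.c_void_p(None), ctypes.c_size_t(0)
    )
    err = ctypes.get_errno()
    if ret == -1 and err == errno.ENOSYS:
        return "blocked or unsupported (ENOSYS)"
    if ret == -1 and err == errno.EINVAL:
        return "reachable (kernel rejected the empty arguments with EINVAL)"
    return f"ret={ret} errno={err} ({os.strerror(err)})"


def probe_new_thread() -> str:
    """Start and join a no-op thread to confirm thread creation still works."""
    try:
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
        return "ok"
    except RuntimeError as exc:  # CPython raises RuntimeError if the thread cannot start
        return f"failed: {exc}"


def diagnose() -> dict:
    """Collect the probes into one JSON-serialisable report."""
    return {
        "pid": os.getpid(),
        "active_threads": threading.active_count(),
        "clone3": probe_clone3(),
        "new_thread": probe_new_thread(),
    }


if __name__ == "__main__":
    import json

    print(json.dumps(diagnose(), indent=2))
```

Running `diagnose()` both from a request handler and from a fresh `python3` process in the same container should show whether `clone3` behaves differently between the two contexts.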

Reading `crates/sandlock-ffi/src/lib.rs`:

```rust
let rt = match tokio::runtime::Runtime::new() {   // = new_multi_thread()
    Ok(rt) => rt,
    Err(_) => return ptr::null_mut(),             // → "sandlock_spawn failed"
};
```

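`Runtime::new()` builds a multi-thread runtime (the equivalent of `Builder::new_multi_thread().enable_all().build()`), which spawns OS worker threads. Because the `Err(_)` arm discards the underlying error, the real cause never reaches Python, so what follows is a hypothesis rather than a confirmed root cause: when called from a **multi-threaded parent process** (uvicorn has 2 threads: event loop + thread pool), Tokio's worker thread spawning fails. Either `clone3` is blocked and the fallback to `clone` doesn't work reliably in a multi-threaded context, or glibc's `pthread_atfork` handlers deadlock in the forked child. Python itself warns: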

```
DeprecationWarning: This process (pid=1) is multi-threaded,
use of fork() may lead to deadlocks in the child.
```

The same `Runtime::new()` call appears in the current source at `lib.rs` lines ~694, ~744, ~890, ~1042, ~1224, ~1330, ~1628, ~1679, ~1710 and in `handler/run.rs`.

## Workaround

Spawn a fresh single-threaded Python subprocess per sandlock call. The subprocess has no active event loop or thread pool, so Tokio's runtime creation succeeds:

```python
import json
import subprocess
import sys

def _run_sandboxed_sync(cmd, ws, timeout):
    helper = r"""
import sys, json, pathlib
from sandlock import Sandbox, Policy
req = json.loads(sys.stdin.read())
# build policy, call Sandbox(policy).run(), return JSON
"""
    proc = subprocess.run(
        [sys.executable, "-c", helper],
        input=json.dumps({"cmd": cmd, "ws": str(ws), "timeout": timeout}),
        capture_output=True, text=True, timeout=timeout + 5,
    )
    return json.loads(proc.stdout)["output"]
```

This works, but adds ~50ms overhead (Python startup time) and an extra unconfined intermediary process.
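
If blocking a thread-pool worker for the duration of each call is a concern, the same idea can be expressed with asyncio's own subprocess API. This is only a sketch under assumptions: `run_helper` is a hypothetical name, the helper source is passed in as a string, and the helper is assumed to read one JSON request on stdin and write one JSON object on stdout, as in the workaround above. The usage example uses a trivial echo helper as a stand-in, not real sandlock code.

```python
import asyncio
import json
import sys


async def run_helper(helper_source: str, request: dict, timeout: float) -> dict:
    """Run a helper script in a fresh Python process and parse its JSON reply."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", helper_source,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, err = await asyncio.wait_for(
            proc.communicate(json.dumps(request).encode()), timeout
        )
    except asyncio.TimeoutError:
        # Do not leave the helper running if it exceeds the deadline.
        proc.kill()
        await proc.wait()
        raise
    if proc.returncode != 0:
        raise RuntimeError(
            f"helper exited with {proc.returncode}: {err.decode(errors='replace')}"
        )
    return json.loads(out)


# Stand-in helper for illustration only: echoes the command back as "output".
ECHO_HELPER = (
    "import sys, json\n"
    "req = json.loads(sys.stdin.read())\n"
    "print(json.dumps({'output': req['cmd']}))\n"
)

if __name__ == "__main__":
    print(asyncio.run(run_helper(ECHO_HELPER, {"cmd": ["python3", "-c", "pass"]}, 10)))
```

This keeps the extra process and the Python startup cost, so it only changes how the server waits for the helper, not the overhead described above.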

## Suggested fix

Replace `Runtime::new()` with a current-thread runtime at every call site in the FFI layer:

```rust
// Before
let rt = match tokio::runtime::Runtime::new() {

// After
let rt = match tokio::runtime::Builder::new_current_thread()
    .enable_all()
    .build() {
```

A current-thread runtime runs entirely on the calling thread — no worker thread spawning, no `clone3`, no fork-safety issues. The async operations sandlock performs (waiting for child process I/O) are I/O-bound, not CPU-parallel, so there is no functional regression from dropping the multi-thread scheduler.

---

Happy to provide any additional diagnostic information or test a patched build.
