Add multi-core serving via SO_REUSEPORT fork-based workers #31
Implement App.serve-with-workers that forks n child processes, each running its own event loop bound to the same host:port. The kernel distributes incoming connections across workers via SO_REUSEPORT (already set on TcpListener). Falls back to single-process serve when n <= 1. Add (workers N) form to defserver for declarative configuration. Addresses issue #7.
Build & Tests
CI: PASS — macos-latest passes (tests + doc generation). The existing test suite (test/web.carp, test/websocket.carp) passes.
The Carp compiler is not available on the review machine (armhf Pi), so the code could not be built or tested locally. Review is based on code reading, CI output, and verification against the Carp stdlib and sockets library source.
No tests exist specifically for serve-with-workers — the function compiles (verified via CI building the test imports), but fork-based behavior isn't exercised. This is understandable given the difficulty of testing fork/signal behavior in Carp's test framework.
Findings
1. Design is sound
Fork-based isolation is the right call for Carp — no shared mutable state, no closure-capture issues, no threading complexity. Each worker is an independent process with its own ConnState, poll set, and event loop.
Verified that SO_REUSEPORT is set in sockets/src/tcp_listener.h (guarded by #ifdef SO_REUSEPORT), so multiple processes binding to the same port is supported at the socket level. The kernel handles load balancing of accept() calls across workers.
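The socket-level behavior described above can be sketched outside Carp. The following is a minimal illustration in Python, not the sockets library's code: the helper name listen_reuseport is hypothetical, and the hasattr check only stands in for the #ifdef guard. It assumes a platform where SO_REUSEPORT is available (for example Linux 3.9+ or macOS).

```python
import socket


def listen_reuseport(host, port):
    """Return a listening TCP socket with SO_REUSEPORT set before bind.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Stand-in for the #ifdef guard: the constant is absent on some platforms.
    if hasattr(socket, "SO_REUSEPORT"):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock


# Port 0 asks the kernel for a free port; the second listener reuses it.
first = listen_reuseport("127.0.0.1", 0)
port = first.getsockname()[1]
second = listen_reuseport("127.0.0.1", port)
same_port = second.getsockname()[1] == port
first.close()
second.close()
```

Without the option set on both sockets, the second bind would normally fail with an "address already in use" error.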
2. API usage is correct
Verified against Carp stdlib source (core/System.carp):
- System.fork returns Int: the code correctly checks < 0 (error), = 0 (child), else (parent)
- System.wait takes (Ptr Int): (Pointer.address &status) is the correct pattern (matches stdlib's own Test.carp usage at core/Test.carp:106)
- System.signal takes Int and (Fn [Int] ()): handler signatures match
- System.exit takes Int: (System.exit 0) after serve returns ensures clean child exit
3. Signal handling is correct with a minor note
The parent ignores SIGINT/SIGTERM after all forks complete. Children inherit the pre-fork signal state (default), then serve installs its own handler ((set! App.running false)). When the user presses Ctrl-C, children exit cleanly via the event loop, and the parent unblocks from System.wait.
Minor note: there's a small race window between fork completion and the parent's (System.signal ...) call. If SIGINT arrives in that window, the parent would exit before ignoring signals. In practice this is negligible (the window is a few instructions wide), but worth knowing about.
4. Edge cases handled
- n <= 1: falls back to single-process serve. Correct.
- Partial fork failure: prints error, continues. spawned only counts successful forks, so System.wait loops the right number of times. No zombie processes.
- All forks fail: spawned stays 0, prints "No workers spawned; exiting". Correct.
- (workers 0) or (workers -1) in defserver: worker-count would be 0 or -1, condition (> worker-count 0) is false, falls back to App.serve. Then serve-with-workers also checks (<= n 1). Double-safe.
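The fork, signal, and wait flow reviewed in findings 2 to 4 can be summarized with a rough sketch. This is Python, not the PR's Carp code, and it differs in known ways: Python's os.fork raises OSError on failure instead of returning a negative value, the sketch waits on recorded pids rather than calling a plain wait in a loop, and it restores the parent's signal handlers at the end purely so the function can be called repeatedly. The name serve_with_workers and the serve callback are stand-ins.

```python
import os
import signal
import sys


def serve_with_workers(n, serve):
    """Fork n workers that each call serve(); return how many were spawned."""
    if n <= 1:
        serve()  # single-process fallback
        return 0

    pids = []
    for _ in range(n):
        try:
            pid = os.fork()
        except OSError as err:
            # Partial failure: report it and keep going with the next fork.
            print(f"fork failed: {err}", file=sys.stderr)
            continue
        if pid == 0:
            # Child: runs its own loop, then exits without returning.
            try:
                serve()
            finally:
                os._exit(0)
        pids.append(pid)  # parent: record only successful forks

    if not pids:
        print("No workers spawned; exiting", file=sys.stderr)
        return 0

    # Parent: ignore SIGINT/SIGTERM only after all forks have completed.
    watched = (signal.SIGINT, signal.SIGTERM)
    previous = {s: signal.signal(s, signal.SIG_IGN) for s in watched}
    try:
        for pid in pids:
            os.waitpid(pid, 0)  # one wait per successful fork, no zombies
    finally:
        # Sketch-only convenience: put the old handlers back.
        for s, handler in previous.items():
            if handler is not None:
                signal.signal(s, handler)
    return len(pids)
```

Because the signals are ignored only after the fork loop, the sketch has the same small race window mentioned in finding 3.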
5. defserver macro changes are clean
The worker-count extraction correctly:
- Filters (workers N) forms from the body
- Extracts N from the first one (or defaults to 0)
- Removes worker forms before route/setup separation (so workers doesn't leak into setup expressions)
- Generates either App.serve-with-workers or App.serve based on worker-count
The existing route-form? predicate doesn't include workers, and it doesn't need to — worker forms are filtered out upstream.
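The extraction logic listed above can be modeled in a few lines. This is a language-neutral illustration in Python, not the macro's Carp source: forms are represented as tuples, the route tuple in the usage example is made up, and the function names split_worker_forms and choose_entry_point are hypothetical.

```python
def is_worker_form(form):
    """True for a two-element form whose head is the symbol 'workers'."""
    return isinstance(form, tuple) and len(form) == 2 and form[0] == "workers"


def split_worker_forms(body):
    """Return (worker_count, remaining_forms) for a defserver-style body."""
    worker_forms = [f for f in body if is_worker_form(f)]
    # N comes from the first (workers N) form; default to 0 when absent.
    worker_count = worker_forms[0][1] if worker_forms else 0
    # Worker forms are dropped before routes and setup are separated.
    remaining = [f for f in body if not is_worker_form(f)]
    return worker_count, remaining


def choose_entry_point(worker_count):
    """Pick the generated call based on the (> worker-count 0) condition."""
    return "App.serve-with-workers" if worker_count > 0 else "App.serve"
```

For a body such as `[("workers", 4), ("GET", "/", "index")]`, the split yields a count of 4 and leaves only the route form, so the multi-worker entry point is chosen; a count of 0 or -1 selects App.serve.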
6. CHANGELOG updated
The CHANGELOG entry under "Unreleased → Added" accurately describes the feature.
7. Documentation updated
The serve doc string now references serve-with-workers instead of suggesting a TCP load balancer. The defserver doc includes the (workers N) form. Both are accurate.
Verdict: merge
Well-designed feature that fits naturally into the existing architecture. Correct use of system APIs (verified against stdlib source). Edge cases handled. CHANGELOG and docs updated. CI passes. No issues found.
Summary
Addresses #7.
- Add App.serve-with-workers that forks n child processes, each running its own event loop on the same host:port
- The kernel distributes incoming connections across workers via SO_REUSEPORT (already set on TcpListener in the sockets library)
- Falls back to single-process serve when n <= 1
- The parent ignores SIGINT/SIGTERM (children handle their own signals), then waits for all children to exit
- Add (workers N) form to defserver for declarative configuration
Design
Fork-based isolation avoids all shared-mutable-state concerns: each worker is an independent process with its own ConnState, poll set, and event loop. No threads, no locks, no closure-capture issues. The SO_REUSEPORT socket option (Linux 3.9+, most BSDs) allows multiple processes to bind to the same port; the kernel load-balances accept() calls across them.
Test plan
- defserver with (workers 4) compiles and spawns 4 workers
- defserver without (workers N) still calls serve (single-process)
- serve-with-workers with n=1 falls back to serve