feat: introduce pond (peer discovery + bridge + ssh-mesh) #129
77ae15b to
3c11ada
Compare
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: be10a84cef
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review
Codex Review: Didn't find any major issues. Already looking forward to the next diff.
a897962 to
2ba9742
Compare
Adds peer discovery for crew members on delegated providers (E2B, Modal, Cloudflare, Railway, Tensorlake, Islo) by exposing each provider's native per-sandbox URL via a small adapter per provider. Each delegated lease's bridge URL is surfaced as a BridgePeerTarget in `crabbox crew peers --crew <name> --json`.
The crew foundation in PR openclaw#129 already provides the label and selector; this PR is the network plane for delegated providers, mirroring what the Tailscale plane does for managed Linux.
Implementation:
- `core.BridgeProvider` interface in internal/cli/crew_bridge.go; resolveCrewPeers fans out across every provider represented in the crew when `--provider` is omitted, and uses single-provider semantics when it is explicit.
- Islo adapter publishes per-port HTTPS URLs through the native islo `share` API, reusing existing shares so calls are idempotent.
- E2B adapter synthesises the canonical `https://<port>-<sandboxID>.<domain>` preview URL from the existing sandbox + config: no new lease fields, no extra round-trip.
- Railway adapter surfaces the existing `railwayDeployment.URL` field; one URL per service (no per-port routing).
- Modal, Cloudflare, Tensorlake register explicit "unsupported" adapters that return core.ErrBridgeNotImplemented. The resolver tags those peers with `BridgeState=unsupported` so callers see the gap honestly instead of mistaking it for "no shares yet".
- Blacksmith is intentionally not part of the bridge plane (owns its own connectivity).
Honest scope: HTTP-only peer dial via provider-native ingress URLs (one target per port, no DNS aliasing, no relay component). Non-HTTP protocols stay on the Tailscale plane.
Live-validated against two real islo.dev sandboxes: one serving HTTP, the other dialing it via `crabbox crew peers`.
Stacked on PR openclaw#129; merge after the foundation lands.
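For readers skimming the E2B bullet above, here is a minimal sketch of what "synthesising" the preview URL could look like. Only the `https://<port>-<sandboxID>.<domain>` shape comes from the commit message; the function name `previewURL`, its validation rules, and the example domain are assumptions made for illustration, not the adapter's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// previewURL builds a URL of the form https://<port>-<sandboxID>.<domain>.
// Hypothetical helper: the name and the validation rules are assumptions.
func previewURL(port int, sandboxID, domain string) (string, error) {
	if port < 1 || port > 65535 {
		return "", fmt.Errorf("port %d out of range", port)
	}
	sandboxID = strings.TrimSpace(sandboxID)
	domain = strings.Trim(strings.TrimSpace(domain), ".")
	if sandboxID == "" || domain == "" {
		return "", fmt.Errorf("sandbox ID and domain are required")
	}
	// Pure string construction: no API call and no new lease fields.
	return fmt.Sprintf("https://%d-%s.%s", port, sandboxID, domain), nil
}

func main() {
	// "sbx123" and "sandbox.example" are placeholder values.
	u, err := previewURL(8080, "sbx123", "sandbox.example")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

Because the URL is derived purely from values the lease already carries, resolving a peer this way needs no extra round-trip, which matches the intent stated in the bullet.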
Adds peer discovery across the full crew, regardless of provider:
- Managed-Linux peers (Tailscale plane): endpoint=tailnet IP
- SSH-lease peers: endpoint=ssh://host:port
- Delegated-with-URL peers (E2B, Modal, Cloudflare, Railway, Islo, Tensorlake): endpoint=per-sandbox public URL
- Blacksmith / no-adapter providers: surfaced as transport=none so doctor reports honestly
`crabbox crew peers --crew <name> --json` returns the unified listing. `crabbox doctor --crew <name>` includes the reachability matrix per transport pair so users see the asymmetry: tailnet->url works one-way, url->tailnet doesn't, ssh-pairs need operator-side bridging (see the SSH-mesh DRAFT PR).
Stacked on openclaw#129; merge after the foundation lands.
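The asymmetry described above can be sketched as a small lookup over transport pairs. This is an illustration only: the type and constant names are hypothetical, "ssh-pairs" is read here as ssh-to-ssh, and every pair the commit message does not mention is reported as unknown rather than guessed.

```go
package main

import "fmt"

// Transport and Reach are hypothetical types for this sketch.
type Transport string

const (
	Tailnet Transport = "tailnet"
	URL     Transport = "url"
	SSH     Transport = "ssh"
)

type Reach string

const (
	ReachYes     Reach = "yes"
	ReachNo      Reach = "no"
	ReachBridge  Reach = "needs-bridge"
	ReachUnknown Reach = "unknown"
)

// reach encodes only the three pairs stated in the commit message.
func reach(from, to Transport) Reach {
	switch {
	case from == Tailnet && to == URL:
		return ReachYes // tailnet->url works one-way
	case from == URL && to == Tailnet:
		return ReachNo // url->tailnet doesn't
	case from == SSH && to == SSH:
		return ReachBridge // assumption: "ssh-pairs" means ssh->ssh
	default:
		return ReachUnknown // not stated in the source
	}
}

func main() {
	planes := []Transport{Tailnet, URL, SSH}
	for _, from := range planes {
		for _, to := range planes {
			fmt.Printf("%-7s -> %-7s : %s\n", from, to, reach(from, to))
		}
	}
}
```

Keeping the direction explicit (from, to) is what lets a doctor-style report show that a pair can be reachable one way and not the other.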
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2ba974204f
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2ba974204f
Thanks for the PR! I'm not fully sold yet, mostly because I don't have that use case. What is your use case for crew?
Adds the crew primitive: --crew flag, reserved crew= provider label, Tailscale ACL tag with self-bootstrap (GET-merge-PUT-with-ETag), and a cloud-init systemd timer that rewrites /etc/hosts.cbx from tailscale status --json. Doctor sub-check, list filter, label propagation through the lease path. (Squashed from fork/feat/crew-labels for consolidation into PR openclaw#129.)
Adds BridgeProvider interface, three live URL adapters (Islo idempotent share POST, E2B synthesized preview, Railway deployment URL), and stub adapters returning BridgeState=unsupported for Modal/CF/Tensorlake. Introduces `crabbox crew peers --crew <name>` returning a per-peer transport hint (tailnet/url/ssh/pending/unsupported/none) and an honest endpoint, plus `crabbox doctor --crew` cross-plane reachability matrix. (Squashed from fork/feat/crew-bridge-plane for consolidation into PR openclaw#129.)
@codex review
@clawsweeper re-review
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 90de6c658f
@codex review
@clawsweeper re-review
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 62e5098c1c
Add the pond grouping primitive with --pond, reserved pond labels, Tailscale ACL bootstrap, bridge-provider peer discovery, SSH-mesh forwarding, pond release, and doctor/list/status integration. Persist pond and exposed-port metadata through local claims, coordinator requests, Worker lease records, and provider labels. Add bridge adapters for URL-capable providers and capability-based transport selection. Include focused docs and regression coverage for ACL parsing, typed list filtering, claim persistence, bridge discovery, reachability matrix rendering, and SSH-mesh tunnel setup.
@codex review
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
@codex review
@clawsweeper re-review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 092ed73ad2
collectPondMembersAcrossProviders used leaseOptionsFromConfig which carries Class (beast by default) and other config filters, silently excluding pond members created with different classes/profiles. Use pond-only LeaseOptions so every claim-matched member is visible regardless of its creation config. (Codex P1)
@codex review
@clawsweeper re-review
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f157fcfc35
…h, partial peers
- doctor_pond.go: rewrite normalizeHuJSON comment-skip to match hujsonStandardize pattern: advance i past comment and continue without break, handling EOF correctly (no data loss when // is the last line). Remove stale duplicate comment.
- pool.go: fix extractLabelMapFromStruct fallthrough when Labels exists but is empty: return nil instead of falling through to Pond field fallback. Pond field is only a fallback when no Labels field exists at all.
- pond_bridge.go: don't abort all pond peers when one provider fails. Collect partial results and return first error only when ALL providers fail. Cross-provider ponds should be resilient to single-provider misconfiguration.
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
- pond_bridge.go: set peer.Transport=TransportNone when bridge returns ErrBridgeNotImplemented (both PublishPeer and ListPeerTargets branches). Previously Transport stayed "url" while BridgeState was "unsupported"; the contradictory pair confused consumers.
- docs/commands/pond.md: correct transport table: AWS/Proxmox/Static SSH are SSH-only, not tailnet. Only Hetzner/Azure/GCP have Tailscale. Split provider table accordingly. Add pond connect documentation (flags, --export, hosts file, eval usage).
Co-authored-by: Cursor <cursoragent@cursor.com>
Align pond docs and metadata handling around explicit transport planes so crabbox.sh, local claims, and CLI surfaces do not overpromise reachability or infer missing state. Co-authored-by: Cursor <cursoragent@cursor.com>
I believe the use case is less about “mesh networking” and more about a “temporary multi-machine sandbox group.”
For instance, consider testing a RAG/vector DB system. The embedder requires a GPU box on Modal, while the vector DB and producers run on Linux, and clients need to be tested from both Linux and Windows. In this scenario, pond rag-pr-123 gives all those leases a shared name, pond peers gives the test runner the reachable endpoints and capability hints, and pond release cleans up the entire environment.
The promise is heterogeneous, ephemeral test environments: each role gets the right machine, and the machines can be grouped, discovered, and cleaned up through crabbox. The transport mechanisms are just best-effort implementation details.
Another concrete use case is at Incredibuild, where we see value in accelerating processes across a few helper machines over the network. A temporary helper-machine pool via crabbox.sh would be beneficial.
Summary
- `--pond NAME` groups leases across providers with reserved `pond` metadata.
- `crabbox pond peers`, `crabbox pond connect [--export]`, `crabbox pond release`, and `doctor --pond`.
Verification
- `go test ./...`
- `go vet ./...`
- `(cd worker && npm test)`
- `(cd worker && npm run check)`
- `--export`.
Notes
- `pond` is preview surface for v0.x.
Closes #136.
Closes #137.