
feat: introduce pond (peer discovery + bridge + ssh-mesh)#129

Open
zozo123 wants to merge 6 commits into openclaw:main from zozo123:feat/crew-labels

Conversation

@zozo123
Contributor

@zozo123 zozo123 commented May 18, 2026

Summary

  • Introduces ponds: --pond NAME groups leases across providers with reserved pond metadata.
  • Adds crabbox pond peers, crabbox pond connect [--export], crabbox pond release, and doctor --pond.
  • Supports three transport planes from provider capabilities: Tailscale, URL bridge, and SSH-mesh.
  • Persists pond and exposed-port metadata through local claims, coordinator requests, Worker lease records, and provider labels.
  • Adds bridge adapters for Islo, E2B, Railway, Modal, Cloudflare, and Tensorlake, with unsupported states where provider ingress is not available.
  • Keeps existing single-box flows unchanged.
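To make the capability-based selection concrete, here is a minimal Go sketch of how a lease might be mapped to one of the three planes. The `Capabilities` struct, the `selectTransport` helper, and the priority order (Tailscale, then URL bridge, then SSH) are assumptions for illustration, not the code in this PR.

```go
package main

import "fmt"

// Transport names follow the transport hints mentioned in this PR.
type Transport string

const (
	TransportTailnet Transport = "tailnet"
	TransportURL     Transport = "url"
	TransportSSH     Transport = "ssh"
	TransportNone    Transport = "none"
)

// Capabilities is a hypothetical summary of what a provider can offer.
type Capabilities struct {
	Tailscale bool // the lease can join the tailnet
	URLBridge bool // the provider exposes a per-sandbox ingress URL
	SSH       bool // the lease is reachable over SSH
}

// selectTransport picks one plane per lease.
// The priority order is an assumption made for this sketch.
func selectTransport(c Capabilities) Transport {
	switch {
	case c.Tailscale:
		return TransportTailnet
	case c.URLBridge:
		return TransportURL
	case c.SSH:
		return TransportSSH
	default:
		return TransportNone
	}
}

func main() {
	leases := []struct {
		name string
		caps Capabilities
	}{
		{"tailscale-capable lease", Capabilities{Tailscale: true, SSH: true}},
		{"url-bridge lease", Capabilities{URLBridge: true}},
		{"ssh-only lease", Capabilities{SSH: true}},
		{"no-adapter lease", Capabilities{}},
	}
	for _, l := range leases {
		fmt.Printf("%-24s -> %s\n", l.name, selectTransport(l.caps))
	}
}
```

Keeping the decision in one pure function means a caller can report the chosen plane without touching the network.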

Verification

  • go test ./...
  • go vet ./...
  • (cd worker && npm test)
  • (cd worker && npm run check)
  • Live proof captured during the PR: Islo bridge peer discovery, Tailscale ACL bootstrap/doctor, and local-container SSH-mesh --export.

Notes

  • pond is a preview surface for v0.x.
  • Tailscale ACL bootstrap only runs client-side with operator credentials; the broker does not receive Tailscale API credentials.
  • The default pond ACL allows members of the same pond to talk to each other. Finer per-member isolation is out of scope for this PR.

Closes #136.
Closes #137.

@zozo123 zozo123 force-pushed the feat/crew-labels branch from 39fc0e6 to 622edbf May 18, 2026 15:15
@zozo123 zozo123 changed the title feat: introduce crew — co-located leases discover each other (RFC / draft) feat: introduce crew — co-located leases discover each other May 18, 2026
@zozo123 zozo123 marked this pull request as ready for review May 18, 2026 15:16
@zozo123 zozo123 force-pushed the feat/crew-labels branch 5 times, most recently from 77ae15b to 3c11ada May 18, 2026 15:53
@zozo123 zozo123 changed the title feat: introduce crew — co-located leases discover each other feat: introduce crew peer discovery May 19, 2026
@zozo123
Contributor Author

zozo123 commented May 20, 2026

@codex


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be10a84cef

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/cli/pool.go Outdated
Comment thread internal/cli/pool.go Outdated
@zozo123
Contributor Author

zozo123 commented May 21, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Already looking forward to the next diff.


@zozo123 zozo123 force-pushed the feat/crew-labels branch 4 times, most recently from a897962 to 2ba9742 May 21, 2026 09:28
zozo123 added a commit to zozo123/crabbox that referenced this pull request May 21, 2026
Adds peer discovery for crew members on delegated providers (E2B,
Modal, Cloudflare, Railway, Tensorlake, Islo) by exposing each
provider's native per-sandbox URL via a small adapter per provider.

Each delegated lease's bridge URL is surfaced as a BridgePeerTarget in
`crabbox crew peers --crew <name> --json`. The crew foundation in PR
openclaw#129 already provides the label and selector; this PR is the network
plane for delegated providers, mirroring what the Tailscale plane does
for managed Linux.

Implementation:

- `core.BridgeProvider` interface in internal/cli/crew_bridge.go;
  resolveCrewPeers fans out across every provider represented in the
  crew when `--provider` is omitted, single-provider semantics when
  it is explicit.
- Islo adapter publishes per-port HTTPS URLs through the native islo
  `share` API, reusing existing shares so calls are idempotent.
- E2B adapter synthesises the canonical
  `https://<port>-<sandboxID>.<domain>` preview URL from the
  existing sandbox + config — no new lease fields, no extra round-trip.
- Railway adapter surfaces the existing `railwayDeployment.URL`
  field; one URL per service (no per-port routing).
- Modal, Cloudflare, Tensorlake register explicit "unsupported"
  adapters that return core.ErrBridgeNotImplemented. The resolver
  tags those peers with `BridgeState=unsupported` so callers see
  the gap honestly instead of mistaking it for "no shares yet".
- Blacksmith is intentionally not part of the bridge plane (owns its
  own connectivity).

Honest scope: HTTP-only peer dial via provider-native ingress URLs —
one target per port, no DNS aliasing, no relay component. Non-HTTP
protocols stay on the Tailscale plane.

Live-validated against two real islo.dev sandboxes — one serving
HTTP, the other dialing it via `crabbox crew peers`.

Stacked on PR openclaw#129; merge after the foundation lands.
zozo123 added a commit to zozo123/crabbox that referenced this pull request May 21, 2026
Adds peer discovery across the full crew, regardless of provider:

- Managed-Linux peers (Tailscale plane): endpoint=tailnet IP
- SSH-lease peers: endpoint=ssh://host:port
- Delegated-with-URL peers (E2B, Modal, Cloudflare, Railway, Islo,
  Tensorlake): endpoint=per-sandbox public URL
- Blacksmith / no-adapter providers: surfaced as transport=none
  so doctor reports honestly

`crabbox crew peers --crew <name> --json` returns the unified
listing. `crabbox doctor --crew <name>` includes the reachability
matrix per transport pair so users see the asymmetry: tailnet->url
works one-way, url->tailnet doesn't, ssh-pairs need operator-side
bridging (see the SSH-mesh DRAFT PR).

Stacked on openclaw#129; merge after the foundation lands.
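
The asymmetry described above can be sketched as a small lookup over transport pairs. This Go example is only an approximation: the `reach` function, the result strings, and the handling of pairs the commit message does not mention (for example url->url, or an SSH peer dialing a URL) are assumptions.

```go
package main

import "fmt"

type Transport string

const (
	TransportTailnet Transport = "tailnet"
	TransportURL     Transport = "url"
	TransportSSH     Transport = "ssh"
	TransportNone    Transport = "none"
)

// reach reports whether a peer on plane `from` can dial a peer on plane `to`.
// "yes" and "no" are direct results; "bridge" means operator-side bridging
// is needed. Pairs not covered by the commit message are assumptions.
func reach(from, to Transport) string {
	switch {
	case from == TransportNone || to == TransportNone:
		return "no"
	case to == TransportURL:
		// Assumes any peer with outbound HTTP can dial a public ingress URL.
		return "yes"
	case from == TransportTailnet && to == TransportTailnet:
		return "yes"
	case from == TransportSSH || to == TransportSSH:
		return "bridge"
	default:
		// Remaining case: url -> tailnet, which has no route into the tailnet.
		return "no"
	}
}

func main() {
	planes := []Transport{TransportTailnet, TransportURL, TransportSSH, TransportNone}
	for _, from := range planes {
		for _, to := range planes {
			fmt.Printf("%-8s -> %-8s %s\n", from, to, reach(from, to))
		}
	}
}
```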
@zozo123
Contributor Author

zozo123 commented May 21, 2026

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2ba974204f


Comment thread internal/cli/doctor_crew.go Outdated
Comment thread internal/cli/pool.go Outdated
@zozo123
Contributor Author

zozo123 commented May 21, 2026

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2ba974204f


Comment thread internal/cli/doctor_crew.go Outdated
Comment thread internal/cli/lease_flags.go Outdated
@steipete
Contributor

Thanks for the PR! I'm not fully sold yet, mostly because I don't have that use case. What is your use case for crew?

zozo123 added a commit to zozo123/crabbox that referenced this pull request May 22, 2026
Adds the crew primitive: --crew flag, reserved crew= provider label,
Tailscale ACL tag with self-bootstrap (GET-merge-PUT-with-ETag), and a
cloud-init systemd timer that rewrites /etc/hosts.cbx from
tailscale status --json. Doctor sub-check, list filter, label
propagation through the lease path.

(Squashed from fork/feat/crew-labels for consolidation into PR openclaw#129.)
zozo123 added a commit to zozo123/crabbox that referenced this pull request May 22, 2026
Adds BridgeProvider interface, three live URL adapters (Islo idempotent
share POST, E2B synthesized preview, Railway deployment URL), and stub
adapters returning BridgeState=unsupported for Modal/CF/Tensorlake.

Introduces `crabbox crew peers --crew <name>` returning a per-peer
transport hint (tailnet/url/ssh/pending/unsupported/none) and an honest
endpoint, plus `crabbox doctor --crew` cross-plane reachability matrix.

(Squashed from fork/feat/crew-bridge-plane for consolidation into PR openclaw#129.)
@zozo123
Contributor Author

zozo123 commented May 22, 2026

@codex review

@zozo123
Contributor Author

zozo123 commented May 22, 2026

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 22, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 90de6c658f


Comment thread internal/cli/pond_mesh.go Outdated
@zozo123
Contributor Author

zozo123 commented May 23, 2026

@codex review

@zozo123
Contributor Author

zozo123 commented May 23, 2026

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 23, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 62e5098c1c


Comment thread internal/cli/pond_mesh.go Outdated
@clawsweeper clawsweeper Bot added the label "rating: 🧂 unranked krab" (Not merge-ready due to missing proof or serious correctness/safety concerns.) and removed the label "rating: 🦐 gold shrimp" (Decent PR readiness signal, but merge confidence is limited.) May 23, 2026
@zozo123 zozo123 force-pushed the feat/crew-labels branch from 62e5098 to eb4fba7 May 23, 2026 05:52
Add the pond grouping primitive with --pond, reserved pond labels, Tailscale ACL bootstrap, bridge-provider peer discovery, SSH-mesh forwarding, pond release, and doctor/list/status integration.

Persist pond and exposed-port metadata through local claims, coordinator requests, Worker lease records, and provider labels. Add bridge adapters for URL-capable providers and capability-based transport selection.

Include focused docs and regression coverage for ACL parsing, typed list filtering, claim persistence, bridge discovery, reachability matrix rendering, and SSH-mesh tunnel setup.
@zozo123 zozo123 force-pushed the feat/crew-labels branch from eb4fba7 to 092ed73 May 23, 2026 05:55
@zozo123
Contributor Author

zozo123 commented May 23, 2026

@codex review
@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 23, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@zozo123
Contributor Author

zozo123 commented May 23, 2026

@codex review

@zozo123
Contributor Author

zozo123 commented May 23, 2026

@clawsweeper re-review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 092ed73ad2


Comment thread internal/cli/pond_mesh.go Outdated
collectPondMembersAcrossProviders used leaseOptionsFromConfig which
carries Class (beast by default) and other config filters, silently
excluding pond members created with different classes/profiles.
Use pond-only LeaseOptions so every claim-matched member is visible
regardless of its creation config. (Codex P1)
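
The bug described in this commit can be reproduced in miniature. In the Go sketch below, the `Lease` and `LeaseOptions` types, the `filterLeases` helper, and the `standard` class name are hypothetical; the `beast` default class comes from the message above.

```go
package main

import "fmt"

// Lease and LeaseOptions are simplified, hypothetical versions of the real types.
type Lease struct{ ID, Pond, Class string }

type LeaseOptions struct {
	Pond  string // empty means "any pond"
	Class string // empty means "any class"
}

// filterLeases keeps leases matching every non-empty option field.
func filterLeases(leases []Lease, o LeaseOptions) []Lease {
	var out []Lease
	for _, l := range leases {
		if o.Pond != "" && l.Pond != o.Pond {
			continue
		}
		if o.Class != "" && l.Class != o.Class {
			continue
		}
		out = append(out, l)
	}
	return out
}

func main() {
	leases := []Lease{
		{ID: "a", Pond: "rag", Class: "beast"},
		{ID: "b", Pond: "rag", Class: "standard"},
		{ID: "c", Pond: "other", Class: "beast"},
	}

	// Options derived from config carry the default class and hide lease "b".
	fromConfig := LeaseOptions{Pond: "rag", Class: "beast"}
	fmt.Println("config-derived:", len(filterLeases(leases, fromConfig)))

	// Pond-only options match every member of the pond.
	pondOnly := LeaseOptions{Pond: "rag"}
	fmt.Println("pond-only:", len(filterLeases(leases, pondOnly)))
}
```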
@zozo123
Contributor Author

zozo123 commented May 23, 2026

@codex review

@zozo123
Contributor Author

zozo123 commented May 23, 2026

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 23, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f157fcfc35


Comment thread internal/cli/pond_bridge.go
Comment thread internal/cli/coordinator.go
…h, partial peers

- doctor_pond.go: rewrite normalizeHuJSON comment-skip to match
  hujsonStandardize pattern: advance i past comment and continue
  without break, handling EOF correctly (no data loss when // is the
  last line). Remove stale duplicate comment.

- pool.go: fix extractLabelMapFromStruct fallthrough when Labels
  exists but is empty — return nil instead of falling through to
  Pond field fallback. Pond field is only a fallback when no Labels
  field exists at all.

- pond_bridge.go: don't abort all pond peers when one provider fails.
  Collect partial results and return first error only when ALL
  providers fail. Cross-provider ponds should be resilient to
  single-provider misconfiguration.
@clawsweeper

clawsweeper Bot commented May 23, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

- pond_bridge.go: set peer.Transport=TransportNone when bridge returns
  ErrBridgeNotImplemented (both PublishPeer and ListPeerTargets
  branches). Previously Transport stayed "url" while BridgeState
  was "unsupported" — the contradictory pair confused consumers.

- docs/commands/pond.md: correct transport table — AWS/Proxmox/Static
  SSH are SSH-only, not tailnet. Only Hetzner/Azure/GCP have Tailscale.
  Split provider table accordingly. Add pond connect documentation
  (flags, --export, hosts file, eval usage).
@zozo123 zozo123 force-pushed the feat/crew-labels branch from 2f91c97 to bcfd09b May 23, 2026 08:00
zozo123 and others added 2 commits May 24, 2026 11:35
Co-authored-by: Cursor <cursoragent@cursor.com>
Align pond docs and metadata handling around explicit transport planes so crabbox.sh, local claims, and CLI surfaces do not overpromise reachability or infer missing state.

Co-authored-by: Cursor <cursoragent@cursor.com>
@zozo123
Contributor Author

zozo123 commented May 25, 2026

I believe its use case is less about “mesh networking” and more about a “temporary multi-machine sandbox group.”

For instance, consider testing a RAG/vector DB system. The embedder requires a GPU box on Modal, while the vector DB and producers run on Linux, and clients need to be tested from both Linux and Windows. In this scenario, `--pond rag-pr-123` gives all those leases a shared name, `crabbox pond peers` gives the test runner the reachable endpoints and capability hints, and `crabbox pond release` cleans up the entire environment.

The promise here is heterogeneous, ephemeral test environments: each role gets the right machine, and the machines can be grouped, discovered, and cleaned up through crabbox. The transport mechanisms are just best-effort implementation details. Another concrete use case is at Incredibuild, where we see value in accelerating processes across a few helper machines over the network. A temporary helper-machine pool via crabbox.sh would be beneficial there.


Labels

  • merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades.
  • merge-risk: 🚨 security-boundary 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data.
  • P2: Normal priority bug or improvement with limited blast radius.
  • proof: sufficient. Contributor real behavior proof is sufficient.
  • rating: 🧂 unranked krab. Not merge-ready due to missing proof or serious correctness/safety concerns.
  • status: ⏳ waiting on author. ClawSweeper has contributor-facing work open and is waiting for author action.


2 participants