Skip to content

[codex] Add local and remote crawler execution modes#5

Open
mbelinky wants to merge 6 commits into
mainfrom
codex/remote-wacli-status-fixes
Open

[codex] Add local and remote crawler execution modes#5
mbelinky wants to merge 6 commits into
mainfrom
codex/remote-wacli-status-fixes

Conversation

@mbelinky
Copy link
Copy Markdown
Contributor

@mbelinky mbelinky commented May 23, 2026

Summary

  • add a manifest execution switch so connectors can run either locally or through local ssh
  • normalize built-in connector names: CrawlBar keeps stable IDs (gogcli, wacli, birdclaw) while using the real executables behind them
  • make Google, WhatsApp, and X available with Local/Remote settings, plus status/search/doctor commands where the underlying CLI supports them
  • keep wacli status green when the crawler reports state: current, while still exposing stale freshness metadata separately
  • harden remote run_as execution so commands enter the target user home before exec
  • add an X access-path switch: default bird, with birdclaw/xurl as an alternate authenticated path

Why

Some account archives and browser-authenticated tools live on a server, while others live on the Mac running CrawlBar. The connector should expose that as setup, not require separate one-off manifests or helper scripts.

Validation

  • swift run crawlbar-selftest
  • swift build
  • git diff --check
  • swift run crawlbarctl metadata --diagnostics
  • live smoke: crawlbarctl status --app discrawl --json reports state: current
  • live smoke: crawlbarctl status --app gitcrawl --json reports state: current
  • live smoke: external remote WhatsApp personal manifest reports state: current, message counts, and a remote store path redacted here
  • live smoke: external remote WhatsApp business manifest reports state: current, message counts, and a remote store path redacted here
  • live smoke: gog --no-input auth status --json returns parseable auth/config JSON and maps missing local auth to setup-needed status
  • live smoke: local bird check --plain maps to current X status through browser cookies
  • live smoke: local crawlbarctl query --app birdclaw -- from:belimad returns JSON search results through bird
  • live smoke: remote birdclaw auth status --json works for an authenticated xurl setup

Notes

  • Existing execution.kind = "ssh" manifests still work. New manifests can use execution.kind = "local" plus kind_config_id with local/remote choices.
  • The account-specific WhatsApp manifests used for local testing were also updated locally to use the new Local/Remote config shape, but those user-specific manifests are not part of this public PR.
  • The X access_path override is intentionally small handoff scaffolding. A cleaner follow-up is to move command/executable variants into the crawlkit manifest/control contract so CommandRunner does not need X-specific knowledge.

@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 23, 2026

Codex review: needs real behavior proof before merge. Reviewed May 25, 2026, 4:13 PM ET / 20:13 UTC.

Summary
The PR adds local/SSH execution settings to crawler manifests, enables gog/wacli/birdclaw built-ins with commands and status mapping, adds self-test coverage, and documents a remote wacli manifest example.

Reproducibility: yes. from source inspection: set execution_mode to remote while binary_path points at a crawler executable, and the registry/runner still resolve that override before ssh. I did not run tests because this read-only review cannot run checks that create artifacts.

Review metrics: 2 noteworthy metrics.

  • Changed surface: 11 files changed, +946/-40. The PR touches core models, execution, registry, status mapping, UI copy, docs, examples, and self-tests, so compatibility review matters before merge.
  • Manifest API added: 1 execution object with 6 codable fields. The new manifest fields become a plugin/config contract that maintainers should approve before release.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🧂 unranked krab
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Fix or explicitly document the remote-mode binary override semantics and add regression coverage.
  • Add redacted terminal output, logs, screenshot, recording, or a linked artifact showing the real remote SSH crawler path after the fix.

Proof guidance:
Needs real behavior proof before merge: The PR body has validation claims but no inspectable redacted terminal output, logs, screenshot, recording, or linked artifact for the real remote SSH behavior; after adding proof, updating the PR body should trigger a fresh ClawSweeper review, or a maintainer can comment @clawsweeper re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • Remote mode can execute a saved crawler binary_path override instead of local ssh, so upgraded configs with a custom gog/wacli/birdclaw path can fail at runtime.
  • The new execution and kind_config_id fields become manifest/plugin API surface, so maintainers should settle executable override semantics before shipping them in built-ins and docs.
  • The PR body contains live-smoke claims, but there is still no inspectable redacted proof for the real SSH crawler path.

Maintainer options:

  1. Preserve crawler binary overrides (recommended)
    Keep binary_path as the crawler executable override and resolve local ssh through a separate path or default when remote mode is selected.
  2. Approve binary_path as the SSH override
    If maintainers want binary_path to mean the SSH executable in remote mode, update settings copy, docs, and tests so users do not mistake it for the crawler path.
  3. Pause the built-in rollout
    Leave the execution API out of newly available built-ins until upgrade behavior and real remote proof are settled.

Next step before merge
Needs contributor proof plus a maintainer decision on binary_path versus SSH executable semantics before automated repair or merge is safe.

Security
Cleared: No concrete security or supply-chain regression found; the diff adds SSH execution code but does not change dependencies, workflows, package resolution, downloaded artifacts, or secret permissions.

Review findings

  • [P1] Resolve remote mode through ssh despite binary overrides — Sources/CrawlBarCore/Registry.swift:43
Review details

Best possible solution:

Land after remote mode has stable executable-resolution semantics, focused upgrade coverage for binary overrides, maintainer-approved manifest API docs, and redacted real SSH behavior proof.

Do we have a high-confidence way to reproduce the issue?

Yes, from source inspection: set execution_mode to remote while binary_path points at a crawler executable, and the registry/runner still resolve that override before ssh. I did not run tests because this read-only review cannot run checks that create artifacts.

Is this the best way to solve the issue?

No. The direction is useful, but the PR should first settle whether binary_path is the crawler override or the SSH override in remote mode, then encode that contract in implementation, docs, and regression coverage.

Full review comments:

  • [P1] Resolve remote mode through ssh despite binary overrides — Sources/CrawlBarCore/Registry.swift:43
    In remote mode binary_path still wins over the ssh default, but the existing settings copy treats that field as the crawler CLI override. Switching a config with a saved wacli/gog/birdclaw path to remote would run that crawler executable with user@host '<remote command>' instead of local ssh, so upgraded configs fail.
    Confidence: 0.88

Overall correctness: patch is incorrect
Overall confidence: 0.88

AGENTS.md: not found in the target repository.

Codex review notes: model gpt-5.5, reasoning high; reviewed against da30a2a103b8.

Label changes

Label justifications:

  • P2: This is a normal-priority feature PR with a blocking compatibility issue but no shipped production regression yet.
  • merge-risk: 🚨 compatibility: The diff changes config and executable semantics in a way that can break users carrying an existing crawler binary override into remote mode.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🧂 unranked krab.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body has validation claims but no inspectable redacted terminal output, logs, screenshot, recording, or linked artifact for the real remote SSH behavior; after adding proof, updating the PR body should trigger a fresh ClawSweeper review, or a maintainer can comment @clawsweeper re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

What I checked:

  • Current main lacks central feature: Current main still marks gogcli, wacli, and birdclaw as coming-soon built-ins with empty command/capability sets, so the local/remote built-in activation is not already implemented on main. (Sources/CrawlBarCore/BuiltInCrawlApps.swift:187, da30a2a103b8)
  • PR adds new manifest execution API: The PR head adds ExecutionKind and Execution with local/ssh, target, run-as, env-file, and remote-binary fields on CrawlAppManifest, making this a new manifest/config contract. (Sources/CrawlBarCore/Models.swift:40, 8a10d3d27f3b)
  • PR docs promise ssh resolution: The added docs say remote mode resolves local ssh, then runs remote_binary on the remote host. (docs/control-protocol.md:165, 8a10d3d27f3b)
  • Registry still lets binary_path override ssh: On the PR head, remote mode computes an ssh default, but nativeAppConfig.binaryPath still wins before that default is used. (Sources/CrawlBarCore/Registry.swift:43, 8a10d3d27f3b)
  • Runner still executes binaryPath in remote mode: On the PR head, CrawlCommandRunner uses installation.binaryPath ?? "ssh" when execution is ssh, so a saved crawler path becomes the executable for SSH-shaped arguments. (Sources/CrawlBarCore/CommandRunner.swift:131, 8a10d3d27f3b)
  • Existing UI defines binary_path as crawler CLI override: Current main labels the setting Binary path override and says to leave it empty to resolve the CLI from PATH, so existing users can reasonably have crawler executable overrides saved before switching to remote mode. (Sources/CrawlBar/SettingsWindow.swift:1463, da30a2a103b8)

Likely related people:

  • mbelinky: GitHub commit metadata maps the current-main v0.2.2 PATH-resolution commit to this handle, and that commit owns the current binary resolution and settings behavior touched by this PR. (role: recent area contributor; confidence: high; commits: b98fb082bc72; files: Sources/CrawlBarCore/CommandRunner.swift, Sources/CrawlBarCore/Registry.swift, Sources/CrawlBar/SettingsWindow.swift)
  • vincentkoc: Repository history shows repeated prior work across CrawlBar command execution, registry/config, status mapping, and settings surfaces adjacent to the PR changes. (role: adjacent owner; confidence: medium; commits: be38b29946b8, 2583bde63860, b214c625575d; files: Sources/CrawlBarCore/CommandRunner.swift, Sources/CrawlBarCore/Registry.swift, Sources/CrawlBarCore/StatusMapper.swift)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. labels May 23, 2026
@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 23, 2026

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@mbelinky mbelinky changed the title [codex] Add remote wacli support and trust crawler status [codex] Add local and remote crawler execution modes May 23, 2026
@mbelinky mbelinky marked this pull request as ready for review May 23, 2026 18:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. P2 Normal priority bug or improvement with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant