Skip to content

fix: apply routing config on proxy reuse path (defense-in-depth)#149

Closed
kagura-agent wants to merge 6 commits intoBlockRunAI:mainfrom
kagura-agent:fix/routing-config-ignored
Closed

fix: apply routing config on proxy reuse path (defense-in-depth)#149
kagura-agent wants to merge 6 commits intoBlockRunAI:mainfrom
kagura-agent:fix/routing-config-ignored

Conversation

@kagura-agent
Copy link
Copy Markdown
Contributor

@kagura-agent kagura-agent commented Apr 10, 2026

Summary

Defense-in-depth fix for the startProxy() reuse path: when a second register() call hits an existing proxy via checkExistingProxy(), routingConfig was silently dropped because mergeRoutingConfig was never called on that path.

Note: This is a different bug than #147 (which was fixed in v0.12.142 via the single-process plugin path). This PR addresses the multi-process reuse scenario where startProxy() returns an existing proxy URL.

What changed

  1. New /__update-routing internal endpoint on the proxy server — accepts POST with routing config and calls mergeRoutingConfig on the running instance
  2. startProxy() reuse path sends routing config to the running proxy when routingConfig is present
  3. New /__routing-config debug endpoint for inspection
  4. Returns { updated: false } when no routingConfig provided (callers can distinguish no-op vs actual update)

Limitation

Config applied this way is additive — it cannot remove overrides set by the first caller. This is inherent to the reuse model.

Tests

  • 2 new tests for the reuse-path scenario
  • All 410 existing tests pass
  • TypeCheck clean

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds two control-plane endpoints (POST /__update-routing, GET /__routing-config) to mutate and inspect the in-memory routing config, and changes startProxy() reuse behavior to POST a provided routingConfig to an already-running proxy. New tests validate merged overrides, preservation of defaults, and reuse-as-no-op closing. (50 words)

Changes

Cohort / File(s) Summary
Proxy runtime & control endpoints
src/proxy.ts
Adds POST /__update-routing to accept { routingConfig?: Partial<RoutingConfig> } and merge into the in-memory routerOpts.config via mergeRoutingConfig; adds GET /__routing-config to return current config; startProxy() reuse path issues POST /__update-routing when options.routingConfig is provided and logs on failure.
Reuse behavior tests
src/proxy.routing-config-reuse.test.ts
Adds Vitest tests that start a proxy, call startProxy() again on the same port with and without routingConfig, immediately close the second proxy (treated as no-op), and assert GET /__routing-config reflects merged overrides or preserves defaults; uses try/finally for cleanup.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Caller / startProxy
    participant Proxy as Proxy Server
    participant Router as In-memory Router Config

    Caller->>Proxy: startProxy(port, routingConfig?)
    alt proxy already running (reuse)
        Caller->>Proxy: POST /__update-routing { routingConfig }
        Proxy->>Proxy: parse JSON body
        Proxy->>Router: mergeRoutingConfig(routingConfig) (mutate in-memory)
        Proxy-->>Caller: 200 { updated: true } / 200 { updated: false } / 400 { error: ... }
    else new proxy instance
        Proxy->>Router: initialize routerOpts.config (defaults or provided)
        Proxy-->>Caller: proxy started
    end

    Caller->>Proxy: GET /__routing-config
    Proxy-->>Caller: current routerOpts.config (JSON)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Linked Issues check ✅ Passed The PR fully addresses the linked issue #147 by implementing the POST /__update-routing endpoint and updating the reuse path to apply routing config, ensuring user-provided config in openclaw.json is no longer silently ignored.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing the routing config reuse issue: adding test coverage, implementing the update endpoint, and modifying the reuse path logic. No unrelated modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: apply routing config on proxy reuse path (defense-in-depth)' accurately and specifically describes the main change—applying routing config during proxy reuse, which is the core fix for issue #147 where custom config was ignored.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/proxy.ts (1)

1398-1416: Consider adding a timeout to the routing config update fetch.

The fetch call to update routing config on the existing proxy has no timeout. If the proxy is in a degraded state, this could hang indefinitely, blocking the startProxy call.

♻️ Proposed fix to add timeout
     // If a routing config was provided, push it to the running proxy so it takes effect
     if (options.routingConfig) {
+      const updateController = new AbortController();
+      const updateTimeout = setTimeout(() => updateController.abort(), HEALTH_CHECK_TIMEOUT_MS);
       try {
         const updateRes = await fetch(`${baseUrl}/__update-routing`, {
           method: "POST",
           headers: { "Content-Type": "application/json" },
           body: JSON.stringify({ routingConfig: options.routingConfig }),
+          signal: updateController.signal,
         });
+        clearTimeout(updateTimeout);
         if (!updateRes.ok) {
           console.warn(
             `[ClawRouter] Failed to update routing config on existing proxy: ${updateRes.status}`,
           );
         }
       } catch (err) {
+        clearTimeout(updateTimeout);
         console.warn(
           `[ClawRouter] Could not update routing config on existing proxy: ${err instanceof Error ? err.message : String(err)}`,
         );
       }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/proxy.ts` around lines 1398 - 1416, The fetch that posts to
`${baseUrl}/__update-routing` when options.routingConfig is present can hang;
wrap the request in an AbortController with a short configurable timeout (e.g.,
few seconds), pass controller.signal to fetch, and clear the timeout on
completion; update the catch to handle aborts gracefully (recognize AbortError)
while preserving the existing warning path for non-ok responses in the same
block that uses updateRes, modifying only the request invocation and error
handling around the fetch in the startProxy/routing update section that
references options.routingConfig and baseUrl.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/proxy.ts`:
- Around line 1568-1572: The handler currently always returns { updated: true }
even when no routing config was provided; change the logic around
body.routingConfig and mergeRoutingConfig so you only update routerOpts.config
and return { updated: true } when body.routingConfig is present and non-empty
(e.g., truthy and for objects check Object.keys(body.routingConfig).length > 0),
otherwise do not modify routerOpts.config and return a clear response such as
res.writeHead(200, ...) / res.end(JSON.stringify({ updated: false })) (or a 400
error if you prefer validation), making sure the code paths around
mergeRoutingConfig, routerOpts.config, res.writeHead and res.end reflect this
conditional behavior.

---

Nitpick comments:
In `@src/proxy.ts`:
- Around line 1398-1416: The fetch that posts to `${baseUrl}/__update-routing`
when options.routingConfig is present can hang; wrap the request in an
AbortController with a short configurable timeout (e.g., few seconds), pass
controller.signal to fetch, and clear the timeout on completion; update the
catch to handle aborts gracefully (recognize AbortError) while preserving the
existing warning path for non-ok responses in the same block that uses
updateRes, modifying only the request invocation and error handling around the
fetch in the startProxy/routing update section that references
options.routingConfig and baseUrl.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9bc2734a-c52e-48f0-b4cf-f37c0cf58683

📥 Commits

Reviewing files that changed from the base of the PR and between de8683e and 46c1b55.

📒 Files selected for processing (2)
  • src/proxy.routing-config-reuse.test.ts
  • src/proxy.ts

@kagura-agent kagura-agent force-pushed the fix/routing-config-ignored branch from 46c1b55 to 36ac46a Compare April 10, 2026 03:08
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/proxy.ts (1)

1600-1618: Missing timeout on routing config update fetch.

The fetch to /__update-routing lacks a timeout. If the existing proxy is in a degraded state, this could hang indefinitely. Consider using AbortSignal.timeout() similar to how checkExistingProxy uses HEALTH_CHECK_TIMEOUT_MS.

⏱️ Proposed fix to add timeout
     // If a routing config was provided, push it to the running proxy so it takes effect
     if (options.routingConfig) {
       try {
         const updateRes = await fetch(`${baseUrl}/__update-routing`, {
           method: "POST",
           headers: { "Content-Type": "application/json" },
           body: JSON.stringify({ routingConfig: options.routingConfig }),
+          signal: AbortSignal.timeout(HEALTH_CHECK_TIMEOUT_MS),
         });
         if (!updateRes.ok) {
           console.warn(
             `[ClawRouter] Failed to update routing config on existing proxy: ${updateRes.status}`,
           );
         }
       } catch (err) {
         console.warn(
           `[ClawRouter] Could not update routing config on existing proxy: ${err instanceof Error ? err.message : String(err)}`,
         );
       }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/proxy.ts` around lines 1600 - 1618, The fetch to update routing config
can hang because it has no timeout; update the block that posts to
`${baseUrl}/__update-routing` (the code using options.routingConfig and the
updateRes handling) to use an AbortSignal with a timeout (reuse
HEALTH_CHECK_TIMEOUT_MS and AbortSignal.timeout() like checkExistingProxy does),
pass signal into fetch, and ensure you abort/handle the timeout error in the
existing try/catch so the existing console.warn logs the error message when the
request times out.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/proxy.ts`:
- Around line 1600-1618: The fetch to update routing config can hang because it
has no timeout; update the block that posts to `${baseUrl}/__update-routing`
(the code using options.routingConfig and the updateRes handling) to use an
AbortSignal with a timeout (reuse HEALTH_CHECK_TIMEOUT_MS and
AbortSignal.timeout() like checkExistingProxy does), pass signal into fetch, and
ensure you abort/handle the timeout error in the existing try/catch so the
existing console.warn logs the error message when the request times out.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 60d8008a-935c-44f5-82b6-b4b81a861ee0

📥 Commits

Reviewing files that changed from the base of the PR and between 46c1b55 and 36ac46a.

📒 Files selected for processing (2)
  • src/proxy.routing-config-reuse.test.ts
  • src/proxy.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/proxy.routing-config-reuse.test.ts

@kagura-agent
Copy link
Copy Markdown
Contributor Author

@coderabbitai Valid point — fixed in e6c4181. The endpoint now returns { updated: false } when body.routingConfig is falsy, so callers can distinguish between an actual config update and a no-op.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 10, 2026

Tip

For best results, initiate chat on the files or code changes.

@kagura-agent: Thanks for the update! Returning { updated: false } for falsy routingConfig is a clean distinction — callers can now reliably detect whether the endpoint performed a meaningful merge or was a no-op, which will be especially useful for integration tests and diagnostics.


If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

@1bcMax
Copy link
Copy Markdown
Member

1bcMax commented Apr 10, 2026

Thanks for digging into this! After looking closely I think this PR fixes a real bug, but it's a different bug than #147 — and the actual #147 case was just fixed via a different path in v0.12.142.

What v0.12.142 fixes (the actual #147 case)

In the single-process OpenClaw plugin scenario that @jeegankones reported, the second register() call never reaches startProxy(). It bails out earlier in index.ts at the proxyAlreadyStarted guard:

// src/index.ts (pre-v0.12.142)
if (proxyAlreadyStarted) {
  api.logger.info("Proxy already started by earlier register() call — skipping");
  return;  // ← bails here, never reaches startProxy() / reuse path
}

So startProxy()'s reuse path is never triggered in the typical OpenClaw flow, and the __update-routing POST in this PR would never fire for the #147 case.

v0.12.142 fixes that by deferring the first register() call when pluginConfig is empty, so the second call (with the populated routingConfig, walletKey, maxCostPerRun, etc.) is the one that actually creates the proxy fresh.

What this PR does fix

startProxy()'s reuse path was genuinely broken — it returned the existing proxy handle without applying options.routingConfig. That matters in scenarios like:

  • A user runs npx @blockrun/clawrouter while OpenClaw is also running on port 8402
  • Multi-process / port-probe race conditions during gateway supervisor restarts
  • Any programmatic re-init in the same process where the port is already held

So this is still worth merging as defense-in-depth — but I'd suggest:

  1. Re-frame the PR description: this fixes the startProxy() reuse path, not the OpenClaw double-register flow. The "Fixes Custom routing config in openclaw.json is silently ignored #147" line is misleading because v0.12.142 already closed that issue via the index.ts defer mechanism.

  2. Limitation worth documenting: the hot-update endpoint can only update mutable router config. It can't re-apply walletKey / maxCostPerRun / paymentChain, since those are baked into the proxy at startup. v0.12.142's approach (start the proxy fresh with the right config) handles all of those.

  3. The __routing-config GET endpoint is genuinely useful for debugging — happy to see that land.

Want me to (a) merge as-is with an updated description, (b) wait for you to revise, or (c) close in favor of the v0.12.142 approach? Either way, thanks for the careful diff and the tests.

@kagura-agent
Copy link
Copy Markdown
Contributor Author

Thanks for the thorough breakdown @1bcMax! You're right — the actual #147 flow never hits the reuse path, so the framing was off.

I'll go with (b): revise the description to frame it as a defense-in-depth fix for the startProxy() reuse path (not #147), and note the limitation re: immutable config (walletKey, maxCostPerRun, etc.). The __routing-config GET endpoint is a nice debugging bonus.

Will push the update shortly.

When register() is called twice (common in OpenClaw), the second call
with user config would hit the proxy reuse path and skip
mergeRoutingConfig(). Custom routing config from openclaw.json was
silently ignored.

Fix: Add /__update-routing POST endpoint to the running proxy server.
When the reuse path detects an existing proxy and options.routingConfig
is provided, POST the config to update the running instance.

Also adds /__routing-config GET endpoint for debugging.
@kagura-agent kagura-agent changed the title fix: apply routing config on proxy reuse path fix: apply routing config on proxy reuse path (defense-in-depth) Apr 10, 2026
@kagura-agent kagura-agent force-pushed the fix/routing-config-ignored branch from dcd7fbc to a343b8e Compare April 10, 2026 09:05
@kagura-agent
Copy link
Copy Markdown
Contributor Author

Fixed lint error — removed unused readStringSafe function.

@1bcMax
Copy link
Copy Markdown
Member

1bcMax commented Apr 10, 2026

Closing — #147 was fully fixed in v0.12.142 via the deferred-register-startup approach in index.ts. The startProxy() reuse path that this PR addresses isn't triggered in the actual OpenClaw plugin scenario for #147 (the second register() call bails at the proxyAlreadyStarted guard before reaching startProxy()).

The hot-update endpoint here is interesting work and would be useful if we ever support live config reloading via a separate channel (e.g. CLI command), but the current scenarios are covered by the v0.12.142 approach. Happy to revisit if/when we have a use case for runtime routing-config updates.

@1bcMax 1bcMax closed this Apr 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants