Skip to content

fix(proxy): route builtin responses through chat hot path#207

Open
awsl233777 wants to merge 11 commits into
mainfrom
fix/builtin-proxy-sub2api-hot-path
Open

fix(proxy): route builtin responses through chat hot path#207
awsl233777 wants to merge 11 commits into
mainfrom
fix/builtin-proxy-sub2api-hot-path

Conversation

@awsl233777

@awsl233777 awsl233777 commented Jun 27, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • route builtin /v1/responses requests directly through the sub2api-style /v1/chat/completions hot path
  • align streaming Chat→Responses SSE lifecycle with sub2api (response.created on first chat chunk, no synthetic response.in_progress event, reasoning/content/tool item lifecycles closed correctly)
  • keep accepted downstream SSE transport alive with : keepalive while an upstream chat stream is idle before the first chunk, without faking Responses events
  • add accident-focused stability coverage for slow first chunk, idle-before-first-chat-chunk, client aborts, full completion, CPU delta, and upstream stream leak checks

Validation

  • node --test tests/unit/builtin-proxy-responses-shim.test.mjs — 24/24 passed, TARGET_UNIT_EXIT=0
  • npm run test:e2eE2E_EXIT=0; stability run logged elapsed_ms=808 cpu_user_us=10912 cpu_system_us=3683 metrics={"opened":14,"closed":14,"active":0,"maxActive":9}
  • npm run test:unit — All 715 tests passed, UNIT_EXIT=0
  • npm run lint — lint passed for 229 files, LINT_EXIT=0
  • git diff --checkDIFF_CHECK_EXIT=0

Notes

  • The e2e upstream intentionally returns an error from /responses; the test asserts no requests hit that endpoint and all conversion traffic goes to /chat/completions.
  • The idle-stream keepalive opens only the SSE transport. It deliberately does not emit response.created until the first upstream chat chunk arrives, preserving the sub2api event ordering.

Summary by CodeRabbit

  • New Features

    • Consistent /v1/responses handling via chat-completions conversion, including tool/function-call mapping and DSML tool-call extraction for streaming SSE.
  • Bug Fixes

    • Removed upstream /v1/responses probing; improved streaming transport hardening (keepalive/teardown/header safety) and retry eligibility for transient network/timeout cases.
    • Refined SSE event sequencing and finalization to ensure correct response.created/reasoning/tool/output deltas; improved handling of DSML/text buffering and error completion behavior.
  • Tests

    • Added E2E coverage for Codex/Sub2API conversion (modes, timing, concurrency, and passthrough errors).
    • Expanded unit tests for routing, retries, DSML tool-call conversion, and stricter SSE expectations.

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@awsl233777, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 6 minutes and 48 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fc6f13d1-ddc0-43df-9807-af98554af105

📥 Commits

Reviewing files that changed from the base of the PR and between 54d4b81 and ec07f51.

📒 Files selected for processing (2)
  • cli/openai-bridge.js
  • tests/unit/openai-bridge-upstream-responses.test.mjs
📝 Walkthrough

Walkthrough

The proxy now converts /v1/responses through /v1/chat/completions for streaming and non-streaming requests. The PR also adds E2E and unit coverage for routing, timing, retry, and DSML tool-call conversion.

Changes

Responses to chat conversion

Layer / File(s) Summary
Routing and retry helpers
cli/builtin-proxy.js, cli/openai-bridge.js
The /v1/responses route stops probing upstream Responses, retry classification expands, and streaming completion/failure handling changes.
Streaming SSE conversion
cli/builtin-proxy.js, cli/openai-bridge.js
The streaming path adds DSML parsing, reasoning and tool lifecycle events, buffered text emission, transport setup, keepalive timing changes, abort handling, and final SSE completion updates.
E2E runner and mock upstream
tests/e2e/run.js, tests/e2e/test-builtin-proxy-codex-conversion.js
The E2E runner loads the new conversion test, and the test starts a mock upstream, exercises non-stream, tool, stream, timing, keepalive, concurrency, and error cases, and rejects /nested/api/responses calls.
Unit proxy assertions
tests/unit/builtin-proxy-responses-shim.test.mjs
The builtin-proxy unit tests assert direct chat-completions routing, Codex identity headers, streaming timing, retry behavior, reasoning mapping, and DSML conversion output.
Unit bridge assertions
tests/unit/openai-bridge-upstream-responses.test.mjs
The openai-bridge unit tests check streaming completion output, retry behavior for chat fallback, and DSML tool-call conversion through the Responses SSE path.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • ymkiux

Poem

I hopped from responses, soft and bright,
Through chat-completion streams of light.
: keepalive winked, [DONE] came near,
Tiny DSML paths are rabbit-clear. 🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the main change: routing builtin /v1/responses traffic through the chat-completions hot path.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/builtin-proxy-sub2api-hot-path

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cli/builtin-proxy.js (1)

143-151: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Retry the proxy’s own timeout errors.

Line 223 and Line 1632 emit error: 'timeout', but this classifier does not match that value, so retryTransientRequest stops after the first timed-out /chat/completions attempt.

Suggested fix
 function isTransientNetworkError(error) {
     const text = String(error || '').trim();
     if (!text) return false;
+    if (/timeout|timed out/i.test(text)) return true;
     if (/socket hang up/i.test(text)) return true;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/builtin-proxy.js` around lines 143 - 151, The transient error classifier
in isTransientNetworkError does not recognize the proxy’s own timeout marker, so
retryTransientRequest treats timed-out /chat/completions failures as
non-retryable. Update the timeout detection logic in isTransientNetworkError in
builtin-proxy.js to return true when the error text is the proxy’s emitted
timeout value (error: 'timeout'), alongside the existing socket/TLS/network
checks, so timeout failures are retried consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@cli/builtin-proxy.js`:
- Around line 143-151: The transient error classifier in isTransientNetworkError
does not recognize the proxy’s own timeout marker, so retryTransientRequest
treats timed-out /chat/completions failures as non-retryable. Update the timeout
detection logic in isTransientNetworkError in builtin-proxy.js to return true
when the error text is the proxy’s emitted timeout value (error: 'timeout'),
alongside the existing socket/TLS/network checks, so timeout failures are
retried consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 50a6be28-37d9-4f28-aeb1-ff1504bc3e36

📥 Commits

Reviewing files that changed from the base of the PR and between 3d4e522 and d8799a7.

📒 Files selected for processing (4)
  • cli/builtin-proxy.js
  • tests/e2e/run.js
  • tests/e2e/test-builtin-proxy-codex-conversion.js
  • tests/unit/builtin-proxy-responses-shim.test.mjs
📜 Review details
🧰 Additional context used
🪛 ast-grep (0.44.0)
tests/e2e/run.js

[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: require('child_process')
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').

(detect-child-process)


[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: require('child_process')
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').

(detect-child-process)

tests/e2e/test-builtin-proxy-codex-conversion.js

[warning] 37-130: Use https protocol over http
Context: http.createServer((req, res) => {
const requestPath = String(req.url || '').split('?')[0];
let body = '';
req.setEncoding('utf-8');
req.on('data', chunk => body += chunk);
req.on('end', () => {
let parsedBody = null;
if (body.trim()) {
try { parsedBody = JSON.parse(body); } catch (_) { parsedBody = null; }
}
requests.push({
method: req.method,
path: requestPath,
headers: req.headers,
body: parsedBody
});

            if (req.method === 'GET' && requestPath === '/nested/api/models') {
                const payload = JSON.stringify({ data: [{ id: 'gpt-sub2api-e2e' }] });
                res.writeHead(200, {
                    'Content-Type': 'application/json; charset=utf-8',
                    'Content-Length': Buffer.byteLength(payload, 'utf-8')
                });
                res.end(payload, 'utf-8');
                return;
            }

            if (req.method === 'POST' && requestPath === '/nested/api/responses') {
                const payload = JSON.stringify({ error: { message: 'responses endpoint must not be used' } });
                res.writeHead(500, {
                    'Content-Type': 'application/json; charset=utf-8',
                    'Content-Length': Buffer.byteLength(payload, 'utf-8')
                });
                res.end(payload, 'utf-8');
                return;
            }

            if (req.method === 'POST' && requestPath === '/nested/api/chat/completions') {
                if (parsedBody && parsedBody.model === 'error-model') {
                    const payload = JSON.stringify({ error: { message: 'upstream rate limited' } });
                    res.writeHead(429, {
                        'Content-Type': 'application/json; charset=utf-8',
                        'Content-Length': Buffer.byteLength(payload, 'utf-8')
                    });
                    res.end(payload, 'utf-8');
                    return;
                }

                if (parsedBody && parsedBody.stream === true) {
                    res.writeHead(200, { 'Content-Type': 'text/event-stream; charset=utf-8' });
                    res.write('data: {"id":"chatcmpl_sub2api_stream","model":"gpt-sub2api-e2e","choices":[{"delta":{"role":"assistant"}}]}\n\n');
                    res.write('data: {"id":"chatcmpl_sub2api_stream","model":"gpt-sub2api-e2e","choices":[{"delta":{"content":"stream-"}}]}\n\n');
                    res.write('data: {"id":"chatcmpl_sub2api_stream","model":"gpt-sub2api-e2e","choices":[{"delta":{"content":"ok"}}]}\n\n');
                    res.end('data: [DONE]\n\n');
                    return;
                }

                const hasTools = parsedBody && Array.isArray(parsedBody.tools) && parsedBody.tools.length > 0;
                const payload = JSON.stringify({
                    id: hasTools ? 'chatcmpl_sub2api_tool' : 'chatcmpl_sub2api_text',
                    model: parsedBody && parsedBody.model ? parsedBody.model : 'unknown-model',
                    choices: [{
                        finish_reason: hasTools ? 'tool_calls' : 'stop',
                        message: hasTools ? {
                            role: 'assistant',
                            content: null,
                            tool_calls: [{
                                id: 'call_lookup_e2e',
                                type: 'function',
                                function: { name: 'lookup', arguments: '{"q":"codexmate"}' }
                            }]
                        } : {
                            role: 'assistant',
                            content: 'direct-chat-ok'
                        }
                    }],
                    usage: { prompt_tokens: 5, completion_tokens: 7, total_tokens: 12 }
                });
                res.writeHead(200, {
                    'Content-Type': 'application/json; charset=utf-8',
                    'Content-Length': Buffer.byteLength(payload, 'utf-8')
                });
                res.end(payload, 'utf-8');
                return;
            }

            const notFound = JSON.stringify({ error: { message: 'not found' } });
            res.writeHead(404, {
                'Content-Type': 'application/json; charset=utf-8',
                'Content-Length': Buffer.byteLength(notFound, 'utf-8')
            });
            res.end(notFound, 'utf-8');
        });
    })

Note: [CWE-319] Cleartext Transmission of Sensitive Information. Security best practice.

(https-protocol-missing)

🔇 Additional comments (4)
tests/unit/builtin-proxy-responses-shim.test.mjs (1)

131-148: LGTM!

Also applies to: 168-169, 496-507, 542-551, 1390-1410, 1436-1440

tests/e2e/run.js (1)

19-19: LGTM!

Also applies to: 155-155

tests/e2e/test-builtin-proxy-codex-conversion.js (1)

178-179: 🎯 Functional Correctness

No change needed for the non-stream lookup The non-stream body already sets stream: false, so req.body.stream === false is safe.

cli/builtin-proxy.js (1)

957-959: LGTM!

Also applies to: 1642-1644, 2008-2012, 2037-2059, 2061-2070

@awsl233777

Copy link
Copy Markdown
Collaborator Author

Updated after CodeRabbit feedback:

  • Classified plain timeout errors as transient network failures so the proxy retries its own timeout results from upstream chat requests.
  • Added a regression test that simulates the first /v1/chat/completions attempt hanging until proxy timeout and verifies the second attempt recovers.

Validation after the update:

  • node --test tests/unit/builtin-proxy-responses-shim.test.mjs — 22/22 passed
  • npm run test:unit — All 713 tests passed, UNIT_EXIT=0
  • npm run test:e2eE2E_EXIT=0
  • npm run lint — lint passed for 229 files, LINT_EXIT=0
  • git diff --check — no whitespace errors

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

@awsl233777 Thanks for the update and for sharing the validation results. I’ll re-review the latest changes now.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@awsl233777

Copy link
Copy Markdown
Collaborator Author

Updated the builtin Responses streaming path for the stability issue reported in chat:

  • Open downstream Responses SSE immediately before waiting for upstream /v1/chat/completions headers.
  • Emit early response.created with status: "in_progress" / output: [], followed by response.in_progress, matching the stable Responses stream shape expected by Codex-style clients.
  • Keep the downstream stream alive every 5s while waiting for upstream output.
  • Track client disconnects and abort upstream requests even while the proxy is still waiting for upstream headers.
  • Preserve retry behavior for pre-output upstream aborts/timeouts after the downstream SSE has already opened.
  • Added unit and e2e regressions where upstream intentionally delays streaming headers; the proxy must still emit response.created / response.in_progress within 500ms and must not call upstream /responses.

Validation after this update:

  • node --test tests/unit/builtin-proxy-responses-shim.test.mjs — 23/23 passed
  • npm run test:e2eE2E_EXIT=0
  • npm run test:unit — All 714 tests passed, UNIT_EXIT=0
  • npm run lint — lint passed for 229 files, LINT_EXIT=0
  • git diff --check — no whitespace errors

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

@awsl233777 Thanks for the detailed update and validation results. I’ll review the latest changes to the Responses streaming path now, including the early SSE open, keepalive behavior, disconnect/abort handling, and the new regression coverage.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@awsl233777

Copy link
Copy Markdown
Collaborator Author

Updated the streaming bridge to align with sub2api's Chat Completions → Responses lifecycle instead of synthesizing a custom stream shape.

What changed:

  • Removed the custom response.in_progress event from chat fallback streaming.
  • Emit response.created from the first upstream chat stream chunk, including role-only chunks, matching sub2api's ensureChatToResponsesCreated behavior.
  • Added reasoning summary part lifecycle: response.reasoning_summary_part.added before reasoning deltas and response.reasoning_summary_part.done before closing the reasoning item.
  • Close reasoning before opening message/tool output, matching sub2api item ordering.
  • Emit message output_item.added + content_part.added before response.output_text.delta.
  • Emit tool-call output_item.added and streamed response.function_call_arguments.delta as argument chunks arrive, then aggregate in response.function_call_arguments.done and output_item.done at finalization.
  • Preserve retry behavior before any output item is started; JSON fallback now still emits a single response.created after retry recovery.
  • Updated the slow-stream regression to model sub2api behavior: first upstream role chunk arrives quickly, content is delayed, and the proxy emits response.created before delayed content without inventing response.in_progress.

Validation after this update:

  • node --test tests/unit/builtin-proxy-responses-shim.test.mjs — 23/23 passed
  • npm run test:e2eE2E_EXIT=0
  • npm run test:unit — All 714 tests passed, UNIT_EXIT=0
  • npm run lint — lint passed for 229 files, LINT_EXIT=0
  • git diff --check — no whitespace errors

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

@awsl233777 Thanks for the detailed update and validation results. I’ll review the latest streaming bridge changes now, with focus on the sub2api-aligned Responses event lifecycle, early response.created behavior, reasoning/tool ordering, and retry semantics before output starts.


✏️ Learnings added
Learnt from: awsl233777
URL: https://github.com/SakuraByteCore/codexmate/pull/207

Timestamp: 2026-06-27T08:40:32.845Z
Learning: In the builtin proxy Responses streaming bridge for `cli/builtin-proxy.js`, the expected fallback behavior is to align with sub2api's Chat Completions → Responses lifecycle: do not synthesize a custom `response.in_progress` event, emit `response.created` from the first upstream chat stream chunk (including role-only chunks), and preserve sub2api-style ordering by finishing reasoning items before opening message or tool output items.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/e2e/test-builtin-proxy-codex-conversion.js (1)

212-212: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Avoid picking the proxy port from a fixed random range.

20000 + Math.floor(Math.random() * 1000) can collide with another process in CI, so this test can fail before it exercises the proxy at all. Prefer an OS-assigned free port or reserve one explicitly first.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e/test-builtin-proxy-codex-conversion.js` at line 212, The proxy port
selection in the e2e test is using a fixed random range, which can still collide
in CI. Update the test setup around the proxyPort assignment to obtain an
OS-assigned free port or reserve a port explicitly before starting the proxy,
and keep the change localized to the proxy test flow so the rest of the test
continues to use the chosen port.
🧹 Nitpick comments (2)
cli/builtin-proxy.js (1)

2244-2254: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Unreachable dead code: this if (wantsStream) block can never execute.

wantsStream is const (Line 2185) and the streaming case already returns inside the earlier if (wantsStream) { ... return; } block (Lines 2193-2215). By the time execution reaches Line 2244, wantsStream is always false, so this branch (and its sendResponsesSse/event-stream write) is dead. Drop it and keep only the JSON response.

♻️ Proposed cleanup
                     const responsesPayload = buildResponsesPayloadFromChatCompletion(chatJson.value, model, { toolTypesByName });
-
-                    if (wantsStream) {
-                        res.writeHead(200, {
-                            'Content-Type': 'text/event-stream; charset=utf-8',
-                            'Cache-Control': 'no-cache',
-                            'Connection': 'keep-alive',
-                            'X-Accel-Buffering': 'no'
-                        });
-                        sendResponsesSse(res, responsesPayload);
-                        res.end();
-                        return;
-                    }
                     res.writeHead(200, { 'Content-Type': 'application/json; charset=utf-8' });
                     res.end(JSON.stringify(responsesPayload));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/builtin-proxy.js` around lines 2244 - 2254, Remove the unreachable
streaming branch in builtin-proxy.js: the later if (wantsStream) block around
the response handling is dead because wantsStream is a const and the earlier
streaming path already returns after sendResponsesSse. Update the response flow
so only the JSON response path remains there, and keep the logic centered around
the existing request handling code that computes wantsStream and
sendResponsesSse.
tests/e2e/test-builtin-proxy-codex-conversion.js (1)

276-285: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

The slow-stream regression stops before it verifies delayed output.

requestUntilText(..., 'event: response.created', ...) resolves on the first marker and requestUntilText destroys the request immediately afterward, so a proxy that emits response.created and then stalls forever would still pass here. Consider following this with a full-body read (or a helper that waits for both the early marker and [DONE]) so the delayed-content path is actually covered.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e/test-builtin-proxy-codex-conversion.js` around lines 276 - 285, The
slow-stream test in requestUntilText only proves the early response.created
marker arrives and then cancels the request, so it never validates delayed
output. Update the test around earlyStreamResponse to keep reading after the
first marker or use a helper that waits for both the early SSE event and the
eventual [DONE] terminator, so the slow-content path is actually exercised and
verified.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/e2e/test-builtin-proxy-codex-conversion.js`:
- Line 49: The timed-out probe path in the E2E helper leaves the probe request
open, which can leak the SSE socket and flake later steps. Update the timeout
branch in the probe setup (the setTimeout that calls finish in
test-builtin-proxy-codex-conversion.js) to explicitly cancel/close the in-flight
req before resolving the timeout result, and make sure the cleanup happens for
the timeout path as well as normal completion.

In `@tests/unit/builtin-proxy-responses-shim.test.mjs`:
- Around line 1697-1699: The SSE assertions in the builtin proxy responses shim
test are too specific about how function-call arguments are chunked, so update
the test around the response.function_call_arguments.delta checks to verify the
full reconstructed arguments content instead of a particular delta split
pattern. Keep the event-name assertion in the relevant test case, but replace
the two delta-partition regex matches with assertions that tolerate any valid
re-chunking while still confirming the final JSON arguments for the function
call are correct.
- Around line 1222-1251: Loosen the timeout gate in the slow-stream test so it
matches the intended contract instead of failing too early. In the test around
the earlySse/proxyPort request flow, increase the setTimeout used to finish the
probe and keep the assertions focused on “response.created arrives before
delayed content” by checking the first upstream chat chunk behavior in
builtin-proxy-responses-shim.test.mjs. Update the elapsed-time expectation to
allow normal CI jitter while still ensuring the SSE is emitted before the
delayed content from the proxy path.

---

Outside diff comments:
In `@tests/e2e/test-builtin-proxy-codex-conversion.js`:
- Line 212: The proxy port selection in the e2e test is using a fixed random
range, which can still collide in CI. Update the test setup around the proxyPort
assignment to obtain an OS-assigned free port or reserve a port explicitly
before starting the proxy, and keep the change localized to the proxy test flow
so the rest of the test continues to use the chosen port.

---

Nitpick comments:
In `@cli/builtin-proxy.js`:
- Around line 2244-2254: Remove the unreachable streaming branch in
builtin-proxy.js: the later if (wantsStream) block around the response handling
is dead because wantsStream is a const and the earlier streaming path already
returns after sendResponsesSse. Update the response flow so only the JSON
response path remains there, and keep the logic centered around the existing
request handling code that computes wantsStream and sendResponsesSse.

In `@tests/e2e/test-builtin-proxy-codex-conversion.js`:
- Around line 276-285: The slow-stream test in requestUntilText only proves the
early response.created marker arrives and then cancels the request, so it never
validates delayed output. Update the test around earlyStreamResponse to keep
reading after the first marker or use a helper that waits for both the early SSE
event and the eventual [DONE] terminator, so the slow-content path is actually
exercised and verified.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ac286d6d-4360-408b-b256-a34f15b5c8d0

📥 Commits

Reviewing files that changed from the base of the PR and between 109c01b and bf47435.

📒 Files selected for processing (3)
  • cli/builtin-proxy.js
  • tests/e2e/test-builtin-proxy-codex-conversion.js
  • tests/unit/builtin-proxy-responses-shim.test.mjs
📜 Review details
🧰 Additional context used
🪛 ast-grep (0.44.0)
tests/e2e/test-builtin-proxy-codex-conversion.js

[warning] 48-48: Avoid using the initial state variable in setState
Context: setTimeout(() => finish({ statusCode: 0, body: text, firstDataElapsedMs, timedOut: true }), timeoutMs)
Note: [CWE-710] Improper Adherence to Coding Standards. Security best practice.

(setstate-same-var)

🔇 Additional comments (3)
cli/builtin-proxy.js (3)

143-153: LGTM!


1342-1419: LGTM!


1487-1533: LGTM!

Comment thread tests/e2e/test-builtin-proxy-codex-conversion.js Outdated
Comment thread tests/unit/builtin-proxy-responses-shim.test.mjs Outdated
Comment thread tests/unit/builtin-proxy-responses-shim.test.mjs Outdated
@awsl233777

Copy link
Copy Markdown
Collaborator Author

Stability follow-up pushed in 331db16.

What changed:

  • Opens the downstream SSE transport as soon as an upstream chat SSE stream is accepted, and sends a legal : keepalive while waiting for the first upstream chat chunk.
  • Preserves the sub2api event lifecycle: no synthetic response.in_progress event, and no response.created until the first upstream chat chunk arrives.
  • Adds accident-focused coverage for idle-before-first-chat-chunk, client aborts, full delayed completion, CPU delta, and upstream stream leak checks.

Validation actually run:

  • node --test tests/unit/builtin-proxy-responses-shim.test.mjs — 24/24 passed, TARGET_UNIT_EXIT=0
  • npm run test:e2eE2E_EXIT=0
    • stability log: elapsed_ms=808 cpu_user_us=10912 cpu_system_us=3683 metrics={"opened":14,"closed":14,"active":0,"maxActive":9}
  • npm run test:unit — All 715 tests passed, UNIT_EXIT=0
  • npm run lint — lint passed for 229 files, LINT_EXIT=0
  • git diff --checkDIFF_CHECK_EXIT=0

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

@awsl233777 Thanks for the detailed follow-up and the validation results. I’ll re-review the latest changes in 331db16, with focus on the early SSE transport open/keepalive behavior, preserving the sub2api event lifecycle, and the new abort/leak/stability coverage.


✏️ Learnings added
Learnt from: awsl233777
URL: https://github.com/SakuraByteCore/codexmate/pull/207

Timestamp: 2026-06-27T10:07:29.883Z
Learning: In `cli/builtin-proxy.js`, the builtin Responses streaming bridge is expected to open the downstream SSE transport as soon as the upstream chat SSE stream is accepted and may send legal `: keepalive` comments while idle, but it must still preserve the sub2api lifecycle by not emitting synthetic `response.in_progress` and not emitting `response.created` until the first upstream chat stream chunk arrives.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/unit/builtin-proxy-responses-shim.test.mjs (1)

1311-1315: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Assert this idle-stream path never probes /v1/responses.

The handler returns 404 but does not record/assert usage, so this test could still pass if the proxy probes /v1/responses and then falls back to chat.

Proposed test tightening
+    let responsesHit = false;
     const upstream = http.createServer((req, res) => {
         if (req.url === '/v1/responses' && req.method === 'POST') {
+            responsesHit = true;
             res.writeHead(404, { 'Content-Type': 'application/json' });
             res.end(JSON.stringify({ error: 'responses endpoint unavailable' }));
             return;
@@
         assert.match(early.text, /: keepalive/);
         assert.doesNotMatch(early.text, /event: response\.created/);
         assert.doesNotMatch(early.text, /event: response\.in_progress/);
+        assert.equal(responsesHit, false, 'proxy should not probe upstream /v1/responses before chat streaming');

Also applies to: 1376-1381

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/builtin-proxy-responses-shim.test.mjs` around lines 1311 - 1315,
The idle-stream fallback test is not verifying that the proxy avoids probing
/v1/responses, so it can pass even if the proxy still makes that request before
falling back. Update the upstream test server in
builtin-proxy-responses-shim.test.mjs to explicitly record or assert any
/v1/responses POST attempts in the relevant idle-stream cases, and then fail the
test if that path is hit; use the existing upstream request handler around the
idle-stream scenarios so the assertion is tied to the proxy behavior being
tested.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/builtin-proxy-responses-shim.test.mjs`:
- Around line 1350-1351: The early probe timeout path in the keepalive test
leaves the HTTP request open, so update the timeout handling around the
timer/http.request probe to explicitly destroy or abort the request before
calling finish when the 500ms timeout fires. Make the fix in the probe setup
logic so the request lifecycle is cleaned up on timeout, and ensure the existing
finish callback still reports the timedOut result only after the request has
been torn down.

---

Nitpick comments:
In `@tests/unit/builtin-proxy-responses-shim.test.mjs`:
- Around line 1311-1315: The idle-stream fallback test is not verifying that the
proxy avoids probing /v1/responses, so it can pass even if the proxy still makes
that request before falling back. Update the upstream test server in
builtin-proxy-responses-shim.test.mjs to explicitly record or assert any
/v1/responses POST attempts in the relevant idle-stream cases, and then fail the
test if that path is hit; use the existing upstream request handler around the
idle-stream scenarios so the assertion is tied to the proxy behavior being
tested.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d4acf102-9f71-40ae-a612-a442249b8c62

📥 Commits

Reviewing files that changed from the base of the PR and between bf47435 and 331db16.

📒 Files selected for processing (3)
  • cli/builtin-proxy.js
  • tests/e2e/test-builtin-proxy-codex-conversion.js
  • tests/unit/builtin-proxy-responses-shim.test.mjs
🚧 Files skipped from review as they are similar to previous changes (1)
  • cli/builtin-proxy.js
📜 Review details
🧰 Additional context used
🪛 ast-grep (0.44.0)
tests/e2e/test-builtin-proxy-codex-conversion.js

[warning] 83-83: Avoid using the initial state variable in setState
Context: setTimeout(resolve, ms)
Note: [CWE-710] Improper Adherence to Coding Standards. Security best practice.

(setstate-same-var)

🔇 Additional comments (2)
tests/e2e/test-builtin-proxy-codex-conversion.js (1)

83-117: LGTM!

Also applies to: 170-191, 245-245, 333-390

tests/unit/builtin-proxy-responses-shim.test.mjs (1)

1310-1310: LGTM!

Also applies to: 1316-1349, 1352-1388

Comment thread tests/unit/builtin-proxy-responses-shim.test.mjs Outdated
@awsl233777

Copy link
Copy Markdown
Collaborator Author

Addressed the latest CodeRabbit test-stability feedback in 3d70475.

What changed:

  • Destroy timeout probes in both unit and e2e helpers so missed markers do not leave HTTP/SSE requests open.
  • Use an OS-assigned free port for the e2e builtin proxy instead of a fixed random range.
  • Assert the idle-before-first-chat-chunk unit path does not probe upstream /v1/responses.
  • Reconstruct streamed function-call argument deltas instead of asserting one specific chunk split.
  • Loosen the slow-stream wall-clock guard to the actual contract: early SSE before the 1200ms delayed content.

Validation actually run:

  • node --test tests/unit/builtin-proxy-responses-shim.test.mjs — 24/24 passed, TARGET_UNIT_EXIT=0
  • npm run test:e2eE2E_EXIT=0
    • stability log: elapsed_ms=812 cpu_user_us=11920 cpu_system_us=2588 metrics={"opened":14,"closed":14,"active":0,"maxActive":4}
  • npm run test:unit — All 715 tests passed, UNIT_EXIT=0
  • npm run lint — lint passed for 229 files, LINT_EXIT=0
  • git diff --checkDIFF_CHECK_EXIT=0

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

@awsl233777 Thanks for the follow-up and the detailed validation results. I’ll re-review 3d70475 with focus on the test-stability fixes: timeout-probe cleanup, OS-assigned port usage, the no-/v1/responses idle-stream assertion, function-call delta reconstruction, and the relaxed slow-stream timing contract.

(•ᴗ•)🐇


✏️ Learnings added
Learnt from: awsl233777
URL: https://github.com/SakuraByteCore/codexmate/pull/207

Timestamp: 2026-06-27T10:16:47.876Z
Learning: In `tests/unit/builtin-proxy-responses-shim.test.mjs` for the builtin proxy Responses streaming bridge, streamed `response.function_call_arguments.delta` chunks should be treated as arbitrarily split; tests should reconstruct the full arguments payload instead of asserting a specific chunk boundary pattern.

Learnt from: awsl233777
URL: https://github.com/SakuraByteCore/codexmate/pull/207

Timestamp: 2026-06-27T10:16:47.876Z
Learning: In the builtin proxy end-to-end tests for this repository, the e2e proxy listener should use an OS-assigned free port instead of choosing from a fixed random range to reduce test flakiness and port-collision risk.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
tests/e2e/test-builtin-proxy-codex-conversion.js (2)

372-407: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Assert the CPU guard instead of only logging it.

This computes process.cpuUsage() but never fails on a busy-loop regression, so the claimed CPU stability coverage is observational only. Add a generous ceiling or remove the metric from the test contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e/test-builtin-proxy-codex-conversion.js` around lines 372 - 407, The
stability test in the idle stream flow currently records CPU usage but only logs
it, so it won’t fail on a busy-loop regression. In the test block around the
existing process.cpuUsage() measurement and the final console.log, add an
assertion on the CPU delta (using the stabilityCpu values from the idle stream
run) with a generous ceiling, or otherwise drop CPU usage from the test contract
entirely. Keep the check aligned with the existing idle-stream assertions in
this test so the CPU guard is enforced by the same stability scenario.

86-99: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Avoid the port handoff race
getAvailableTcpPort() closes the socket before proxy-start binds it, leaving a TOCTOU window in both uses here and at 286-294. proxy-start also rejects port: 0, so the fix needs to move to the startup path if you want to remove the flake.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e/test-builtin-proxy-codex-conversion.js` around lines 86 - 99, The
race is in getAvailableTcpPort and the proxy-start setup: the helper closes the
temporary server before proxy-start can bind, creating a TOCTOU window, and
proxy-start currently rejects port 0. Update the startup path around proxy-start
so it acquires and binds a real available port directly instead of handing off a
released socket, and adjust the getAvailableTcpPort usage so the selected port
stays reserved until the server starts. Use the getAvailableTcpPort helper and
proxy-start port handling as the main symbols to locate the changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/e2e/test-builtin-proxy-codex-conversion.js`:
- Around line 372-407: The stability test in the idle stream flow currently
records CPU usage but only logs it, so it won’t fail on a busy-loop regression.
In the test block around the existing process.cpuUsage() measurement and the
final console.log, add an assertion on the CPU delta (using the stabilityCpu
values from the idle stream run) with a generous ceiling, or otherwise drop CPU
usage from the test contract entirely. Keep the check aligned with the existing
idle-stream assertions in this test so the CPU guard is enforced by the same
stability scenario.
- Around line 86-99: The race is in getAvailableTcpPort and the proxy-start
setup: the helper closes the temporary server before proxy-start can bind,
creating a TOCTOU window, and proxy-start currently rejects port 0. Update the
startup path around proxy-start so it acquires and binds a real available port
directly instead of handing off a released socket, and adjust the
getAvailableTcpPort usage so the selected port stays reserved until the server
starts. Use the getAvailableTcpPort helper and proxy-start port handling as the
main symbols to locate the changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ad0a9d66-9dac-4e57-b135-5900a23935fe

📥 Commits

Reviewing files that changed from the base of the PR and between 331db16 and 3d70475.

📒 Files selected for processing (2)
  • tests/e2e/test-builtin-proxy-codex-conversion.js
  • tests/unit/builtin-proxy-responses-shim.test.mjs
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/builtin-proxy-responses-shim.test.mjs
📜 Review details
🧰 Additional context used
🪛 ast-grep (0.44.0)
tests/e2e/test-builtin-proxy-codex-conversion.js

[warning] 87-87: Use https protocol over http
Context: http.createServer()
Note: [CWE-319] Cleartext Transmission of Sensitive Information. Security best practice.

(https-protocol-missing)

🔇 Additional comments (1)
tests/e2e/test-builtin-proxy-codex-conversion.js (1)

49-52: LGTM!

@awsl233777

Copy link
Copy Markdown
Collaborator Author

事故复现链路这轮又补了一刀,不按 review nitpick 走:

  • 修复 cli/builtin-proxy.js 里 Chat fallback streaming 的 DSML 工具调用泄漏:当上游把 <|DSML|tool_calls> 放进 delta.content 时,不再把内部工具调用标记作为 response.output_text.delta 透给 Codex/UI。
  • 新增内容 buffer + DSML parser:普通文本继续流式输出;完整 DSML tool-call block 会转换成 Responses function_call / response.function_call_arguments.* 事件。
  • 覆盖 SSE chunk 切分场景:新增单测模拟 <|DSML|tool_calls> 被切成两段 content chunk,断言最终 SSE 不含 DSML / invoke name,并正确生成 read 工具调用和参数。

已验证:

node --check cli/builtin-proxy.js
node --check tests/unit/builtin-proxy-responses-shim.test.mjs
node --test tests/unit/builtin-proxy-responses-shim.test.mjs   # 25/25, TARGET_UNIT_EXIT=0
npm run test:e2e                                             # E2E_EXIT=0
npm run test:unit                                            # 716 tests passed, UNIT_EXIT=0
npm run lint                                                 # LINT_EXIT=0
git diff --check                                             # DIFF_CHECK_EXIT=0

说明:这次修的是用户现场看到的 “DSML/tool_calls 内脏直接出现在正文 + Working 卡住” 这一类真实事故路径,不是只为了让 CI 变绿。

@awsl233777

Copy link
Copy Markdown
Collaborator Author

Update: fixed the real --bridge openai path as well.

Changes in this round:

  • Added DSML tool-call parsing/conversion to cli/openai-bridge.js chat fallback streaming.
  • Buffered delta.content so split DSML markers like <|DSM + L|tool_calls> do not leak as output text.
  • Converted parsed DSML invokes into Responses function-call events instead of response.output_text.delta.
  • Normalized wrapped path-like DSML parameter values, e.g. /data/data/.../home/\n codexmate.
  • Extended both builtin-proxy and openai-bridge unit coverage for:
    • multiple invokes in one DSML block
    • parameter attributes split across lines
    • path values split across lines
    • DSML marker split across SSE chunks
    • no DSML / invoke name text leaking into emitted SSE

Validation actually run:

  • node --check cli/openai-bridge.js
  • node --check tests/unit/openai-bridge-upstream-responses.test.mjs
  • node --test tests/unit/openai-bridge-upstream-responses.test.mjs — 27/27 passed
  • node --test tests/unit/builtin-proxy-responses-shim.test.mjs tests/unit/openai-bridge-upstream-responses.test.mjs — 52/52 passed
  • npm run test:e2e — passed
  • npm run test:unit — 716 tests passed
  • npm run lint — passed for 229 files
  • git diff --check — passed

Real provider re-test boundary:

  • Re-ran codex exec through local http://127.0.0.1:3737/bridge/openai/1/v1/responses after restarting the runtime with this code.
  • The second real run did not first reproduce the DSML leak, but it was stopped by upstream provider error: 403 Forbidden: Insufficient balance.
  • Therefore I am not claiming the full provider-1 complex Codex task completed end-to-end; it is currently blocked by upstream balance/quota.

@awsl233777

Copy link
Copy Markdown
Collaborator Author

Follow-up after real provider UX review: added bounded bridge-side retries for early streaming failures.

Why:

  • The real --bridge openai path can receive intermittent upstream failures such as 403 Forbidden: Insufficient balance before any usable SSE output starts.
  • Passing that raw HTTP status back to Codex makes the CLI show a slow Reconnecting... 1/10 loop, which is poor UX and obscures the actual provider error.

Changes in this round:

  • Added retry classification for early streaming failures in cli/openai-bridge.js:
    • transient network/timeout errors
    • 408 / 409 / 425 / 429
    • 5xx
    • 403 bodies that look like provider-pool quota/rate/temporary failures, including Insufficient balance
  • Retries are bounded by the existing retry delays before any downstream SSE output is committed.
  • If retries still fail, the bridge now returns a proper Responses SSE response.failed + [DONE] instead of a raw HTTP 403/502 to the Codex SSE client.
  • Once the upstream stream has been accepted/output has started, the bridge does not attempt blind replay, avoiding duplicated output/tool calls.

Regression coverage added/updated:

  • Hanging upstream /responses does not fall back to chat; it ends as SSE response.failed instead of raw 502.
  • Upstream /responses 403 Insufficient balance is retried three attempts, then returned as SSE response.failed.
  • Upstream /responses first 403 then success recovers and emits response.completed without response.failed.
  • Chat fallback 403 Insufficient balance is also retried before SSE failure.

Validation actually run:

  • node --check cli/openai-bridge.js
  • node --check tests/unit/openai-bridge-upstream-responses.test.mjs
  • node --test tests/unit/openai-bridge-upstream-responses.test.mjs — 30/30 passed
  • node --test tests/unit/builtin-proxy-responses-shim.test.mjs tests/unit/openai-bridge-upstream-responses.test.mjs — 55/55 passed
  • npm run test:e2e — passed
  • npm run test:unit — 716 tests passed
  • npm run lint — passed for 229 files
  • git diff --check — passed

Boundary:

  • This does not claim to fix the upstream provider's quota/balance instability.
  • It makes codexmate handle that instability more cleanly: retry briefly inside the bridge, then fail fast and explicitly if the provider still rejects.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
cli/builtin-proxy.js (3)

1123-1133: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Keep response.created in-progress and empty.

This emits the completed status and full output in response.created, before the later response.output_item.added events. That breaks the streamed lifecycle ordering this shim is trying to preserve; mirror beginChatStreamResponsesSse and only expose final output in response.completed.

Proposed fix
         if (!options.skipCreated) {
             writeSse(res, 'response.created', {
                 type: 'response.created',
                 response: {
                     id: responseId,
                     model,
                     created_at: response.created_at,
-                    status: response.status || 'in_progress',
-                    output: Array.isArray(response.output) ? response.output : []
+                    status: 'in_progress',
+                    output: []
                 }
             });
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/builtin-proxy.js` around lines 1123 - 1133, The `response.created` event
in the proxy stream is leaking completed state too early; adjust the `writeSse`
payload in `cli/builtin-proxy.js` so `response.created` stays in-progress and
empty, matching `beginChatStreamResponsesSse`. Keep `status` fixed to the
in-progress value and omit the full `output` here, then reserve the final status
and output population for `response.completed` only.

1684-1691: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Attach the client-close abort handler before waiting for upstream headers.

The close listener is only installed after upstream SSE is accepted. If the client disconnects during DNS/TLS/header wait, the upstream request remains alive until timeout.

Proposed fix
             });
             sharedState.upstreamReq = req;
+            if (res && typeof res.on === 'function' && !sharedState.closeListenerAttached) {
+                sharedState.closeListenerAttached = true;
+                res.on('close', () => {
+                    sharedState.clientClosed = true;
+                    stopChatStreamHeartbeat(sharedState);
+                    if (!sharedState.finished && sharedState.upstreamReq) {
+                        try { sharedState.upstreamReq.destroy(new Error('client aborted')); } catch (_) {}
+                    }
+                });
+            }
             req.setTimeout(timeoutMs, () => {
                 if (streamAccepted) return;
                 try { req.destroy(new Error('timeout')); } catch (_) {}
                 finish({ ok: false, error: 'timeout' });
             });
-            req.on('error', (err) => finish({ ok: false, error: err && err.message ? err.message : 'request failed' }));
+            req.on('error', (err) => {
+                if (sharedState.clientClosed || res.writableEnded || res.destroyed) {
+                    finish({ ok: true });
+                    return;
+                }
+                finish({ ok: false, error: err && err.message ? err.message : 'request failed' });
+            });

Also applies to: 1922-1929

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/builtin-proxy.js` around lines 1684 - 1691, The client-close abort
handler is attached too late in the streaming flow, so disconnects during
DNS/TLS/header waiting can leave the upstream request running. Move the
res.close listener setup in the relevant streaming path(s) in
cli/builtin-proxy.js so it is installed before awaiting upstream headers, and
ensure the existing state.clientClosed, stopChatStreamHeartbeat, and
state.upstreamReq.destroy('client aborted') logic still runs when the client
disconnects.

2335-2343: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Retry retryable upstream HTTP failures before surfacing the stream error.

The streaming route now relies on this chat-completions path, but status >= 400 results are returned with status, and retryTransientRequest stops on any positive status. That means 429/5xx and quota-style 403 failures before output are not retried.

Proposed fix
+    function isRetryableUpstreamStatus(status, bodyText = '') {
+        if (status === 408 || status === 409 || status === 425 || status === 429) return true;
+        if (status >= 500 && status <= 599) return true;
+        if (status === 403 && /quota|balance|credit|capacity|rate/i.test(String(bodyText || ''))) return true;
+        return false;
+    }
+
     async function retryTransientRequest(executor) {
         let lastResult = null;
         for (let attempt = 0; attempt <= TRANSIENT_RETRY_DELAYS_MS.length; attempt += 1) {
@@
             if (!result) return result;
             if (result.ok) return result;
             if (result.retry) return result;
-            if (result.status && result.status > 0) return result;
-            if (!result.retryTransient && !isTransientNetworkError(result.error)) return result;
+            if (result.retryTransient || isTransientNetworkError(result.error)) continue;
+            if (result.status && result.status > 0) return result;
         }
         return lastResult;
     }
                 if (status >= 400) {
                     upstreamRes.on('data', (chunk) => chunk && chunks.push(chunk));
-                    upstreamRes.on('end', () => finish({ ok: false, status, bodyText: chunks.length ? Buffer.concat(chunks).toString('utf-8') : '' }));
+                    upstreamRes.on('end', () => {
+                        const bodyText = chunks.length ? Buffer.concat(chunks).toString('utf-8') : '';
+                        finish({
+                            ok: false,
+                            status,
+                            bodyText,
+                            ...(isRetryableUpstreamStatus(status, bodyText) ? { retryTransient: true } : {})
+                        });
+                    });
                     return;
                 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/builtin-proxy.js` around lines 2335 - 2343, The streaming
chat-completions path is surfacing retryable upstream errors too early because
`streamChatCompletionsAsResponsesSseWithFallbackUrls` returns positive `status`
values that cause `retryTransientRequest` to stop immediately. Update the retry
handling around the `streamChatCompletionsAsResponsesSseWithFallbackUrls` call
in `builtin-proxy` so 429, 5xx, and quota-style 403 responses before any
streamed output are treated as transient and retried before failing the request.
Keep the fix localized to the streaming/chat-completions flow and ensure the
retry decision uses the upstream response status and body/error details
appropriately.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cli/openai-bridge.js`:
- Around line 1539-1574: The DSML buffering in
flushChatStreamContentBuffer/writeChatStreamContentSegment can grow without
bound when DSML_TOOL_CALLS_END never arrives, so add a maxBytes-based guard for
pending content. In flushChatStreamContentBuffer, track the buffered DSML block
size and trim or reject oversized incomplete blocks instead of keeping the full
buffer in state.contentBuffer; ensure the guard applies both when start/end
markers are missing and when buffering through writeChatStreamContentSegment.
Use the existing state fields and DSML marker handling to locate the change.

In `@tests/unit/openai-bridge-upstream-responses.test.mjs`:
- Around line 168-170: Wrap the shutdown logic in the
openai-bridge-upstream-responses test so it waits for server close completion
instead of just calling http.Server.close() directly. Update the cleanup around
bridge.close() and upstream.close() to use promise-based shutdown handling (for
example via a helper that resolves on the close callback) before removing
tmpDir. This should be done in the test cleanup path where bridge and upstream
are torn down.

---

Outside diff comments:
In `@cli/builtin-proxy.js`:
- Around line 1123-1133: The `response.created` event in the proxy stream is
leaking completed state too early; adjust the `writeSse` payload in
`cli/builtin-proxy.js` so `response.created` stays in-progress and empty,
matching `beginChatStreamResponsesSse`. Keep `status` fixed to the in-progress
value and omit the full `output` here, then reserve the final status and output
population for `response.completed` only.
- Around line 1684-1691: The client-close abort handler is attached too late in
the streaming flow, so disconnects during DNS/TLS/header waiting can leave the
upstream request running. Move the res.close listener setup in the relevant
streaming path(s) in cli/builtin-proxy.js so it is installed before awaiting
upstream headers, and ensure the existing state.clientClosed,
stopChatStreamHeartbeat, and state.upstreamReq.destroy('client aborted') logic
still runs when the client disconnects.
- Around line 2335-2343: The streaming chat-completions path is surfacing
retryable upstream errors too early because
`streamChatCompletionsAsResponsesSseWithFallbackUrls` returns positive `status`
values that cause `retryTransientRequest` to stop immediately. Update the retry
handling around the `streamChatCompletionsAsResponsesSseWithFallbackUrls` call
in `builtin-proxy` so 429, 5xx, and quota-style 403 responses before any
streamed output are treated as transient and retried before failing the request.
Keep the fix localized to the streaming/chat-completions flow and ensure the
retry decision uses the upstream response status and body/error details
appropriately.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6fac4490-dbfd-4522-a47c-1301015e22ed

📥 Commits

Reviewing files that changed from the base of the PR and between 3d70475 and 6b5dcc2.

📒 Files selected for processing (4)
  • cli/builtin-proxy.js
  • cli/openai-bridge.js
  • tests/unit/builtin-proxy-responses-shim.test.mjs
  • tests/unit/openai-bridge-upstream-responses.test.mjs
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/builtin-proxy-responses-shim.test.mjs
📜 Review details
🧰 Additional context used
🪛 OpenGrep (1.23.0)
cli/openai-bridge.js

[ERROR] 1451-1451: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.

(coderabbit.command-injection.exec-js)


[ERROR] 1476-1476: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.

(coderabbit.command-injection.exec-js)


[ERROR] 1484-1484: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.

(coderabbit.command-injection.exec-js)

cli/builtin-proxy.js

[ERROR] 1259-1259: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.

(coderabbit.command-injection.exec-js)


[ERROR] 1284-1284: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.

(coderabbit.command-injection.exec-js)


[ERROR] 1292-1292: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.

(coderabbit.command-injection.exec-js)

🔇 Additional comments (5)
tests/unit/openai-bridge-upstream-responses.test.mjs (1)

156-166: LGTM!

Also applies to: 173-227, 234-290, 297-351, 610-680

cli/openai-bridge.js (3)

1134-1151: LGTM!

Also applies to: 1251-1304


1413-1537: LGTM!


1618-1623: LGTM!

Also applies to: 1641-1641, 1865-1865, 2402-2402, 2413-2413, 2440-2442

cli/builtin-proxy.js (1)

143-153: LGTM!

Also applies to: 1057-1078, 1221-1239, 1242-1301, 1403-1497, 1519-1538, 1550-1563, 1568-1650, 1668-1717, 1805-1819, 1855-1872, 1880-1902, 2304-2308

Comment thread cli/openai-bridge.js
Comment on lines +1539 to +1574
function flushChatStreamContentBuffer(state, force = false) {
if (!state || typeof state.contentBuffer !== 'string' || !state.contentBuffer) return;
let buffer = state.contentBuffer;
state.contentBuffer = '';
while (buffer) {
const start = buffer.indexOf(DSML_TOOL_CALLS_START);
if (start < 0) {
const keep = force ? 0 : longestDsmlStartPrefixSuffix(buffer);
const text = keep > 0 ? buffer.slice(0, buffer.length - keep) : buffer;
emitChatStreamTextDelta(state, text);
state.contentBuffer = keep > 0 ? buffer.slice(buffer.length - keep) : '';
return;
}
if (start > 0) {
emitChatStreamTextDelta(state, buffer.slice(0, start));
buffer = buffer.slice(start);
}
const end = buffer.indexOf(DSML_TOOL_CALLS_END, DSML_TOOL_CALLS_START.length);
if (end < 0) {
if (force) {
state.contentBuffer = '';
} else {
state.contentBuffer = buffer;
}
return;
}
const blockText = buffer.slice(DSML_TOOL_CALLS_START.length, end);
appendDsmlToolCallsToChatStreamState(state, blockText);
buffer = buffer.slice(end + DSML_TOOL_CALLS_END.length);
}
}

function writeChatStreamContentSegment(state, segment) {
if (!state || typeof segment !== 'string' || !segment) return;
state.contentBuffer = (typeof state.contentBuffer === 'string' ? state.contentBuffer : '') + segment;
flushChatStreamContentBuffer(state, false);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Bound incomplete DSML buffers.

Line 1561 retains the entire pending DSML block until the end marker arrives. Since successful streaming chunks are not capped by maxBytes, a malformed upstream stream can grow state.contentBuffer indefinitely and stall output.

Proposed guard
 const DSML_TOOL_CALLS_START = '<|DSML|tool_calls>';
 const DSML_TOOL_CALLS_END = '</|DSML|tool_calls>';
+const DSML_CONTENT_BUFFER_MAX_BYTES = 1024 * 1024;
@@
         if (end < 0) {
             if (force) {
                 state.contentBuffer = '';
             } else {
+                if (Buffer.byteLength(buffer, 'utf8') > DSML_CONTENT_BUFFER_MAX_BYTES) {
+                    state.contentBuffer = '';
+                    failChatStreamResponsesSse(state, 'DSML tool-call block exceeded buffer limit');
+                    return;
+                }
                 state.contentBuffer = buffer;
             }
             return;
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function flushChatStreamContentBuffer(state, force = false) {
if (!state || typeof state.contentBuffer !== 'string' || !state.contentBuffer) return;
let buffer = state.contentBuffer;
state.contentBuffer = '';
while (buffer) {
const start = buffer.indexOf(DSML_TOOL_CALLS_START);
if (start < 0) {
const keep = force ? 0 : longestDsmlStartPrefixSuffix(buffer);
const text = keep > 0 ? buffer.slice(0, buffer.length - keep) : buffer;
emitChatStreamTextDelta(state, text);
state.contentBuffer = keep > 0 ? buffer.slice(buffer.length - keep) : '';
return;
}
if (start > 0) {
emitChatStreamTextDelta(state, buffer.slice(0, start));
buffer = buffer.slice(start);
}
const end = buffer.indexOf(DSML_TOOL_CALLS_END, DSML_TOOL_CALLS_START.length);
if (end < 0) {
if (force) {
state.contentBuffer = '';
} else {
state.contentBuffer = buffer;
}
return;
}
const blockText = buffer.slice(DSML_TOOL_CALLS_START.length, end);
appendDsmlToolCallsToChatStreamState(state, blockText);
buffer = buffer.slice(end + DSML_TOOL_CALLS_END.length);
}
}
function writeChatStreamContentSegment(state, segment) {
if (!state || typeof segment !== 'string' || !segment) return;
state.contentBuffer = (typeof state.contentBuffer === 'string' ? state.contentBuffer : '') + segment;
flushChatStreamContentBuffer(state, false);
const DSML_CONTENT_BUFFER_MAX_BYTES = 1024 * 1024;
function flushChatStreamContentBuffer(state, force = false) {
if (!state || typeof state.contentBuffer !== 'string' || !state.contentBuffer) return;
let buffer = state.contentBuffer;
state.contentBuffer = '';
while (buffer) {
const start = buffer.indexOf(DSML_TOOL_CALLS_START);
if (start < 0) {
const keep = force ? 0 : longestDsmlStartPrefixSuffix(buffer);
const text = keep > 0 ? buffer.slice(0, buffer.length - keep) : buffer;
emitChatStreamTextDelta(state, text);
state.contentBuffer = keep > 0 ? buffer.slice(buffer.length - keep) : '';
return;
}
if (start > 0) {
emitChatStreamTextDelta(state, buffer.slice(0, start));
buffer = buffer.slice(start);
}
const end = buffer.indexOf(DSML_TOOL_CALLS_END, DSML_TOOL_CALLS_START.length);
if (end < 0) {
if (force) {
state.contentBuffer = '';
} else {
if (Buffer.byteLength(buffer, 'utf8') > DSML_CONTENT_BUFFER_MAX_BYTES) {
state.contentBuffer = '';
failChatStreamResponsesSse(state, 'DSML tool-call block exceeded buffer limit');
return;
}
state.contentBuffer = buffer;
}
return;
}
const blockText = buffer.slice(DSML_TOOL_CALLS_START.length, end);
appendDsmlToolCallsToChatStreamState(state, blockText);
buffer = buffer.slice(end + DSML_TOOL_CALLS_END.length);
}
}
function writeChatStreamContentSegment(state, segment) {
if (!state || typeof segment !== 'string' || !segment) return;
state.contentBuffer = (typeof state.contentBuffer === 'string' ? state.contentBuffer : '') + segment;
flushChatStreamContentBuffer(state, false);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/openai-bridge.js` around lines 1539 - 1574, The DSML buffering in
flushChatStreamContentBuffer/writeChatStreamContentSegment can grow without
bound when DSML_TOOL_CALLS_END never arrives, so add a maxBytes-based guard for
pending content. In flushChatStreamContentBuffer, track the buffered DSML block
size and trim or reject oversized incomplete blocks instead of keeping the full
buffer in state.contentBuffer; ensure the guard applies both when start/end
markers are missing and when buffering through writeChatStreamContentSegment.
Use the existing state fields and DSML marker handling to locate the change.

Comment thread tests/unit/openai-bridge-upstream-responses.test.mjs Outdated
@awsl233777

Copy link
Copy Markdown
Collaborator Author

Updated the Codex streaming bridge to match dedicated Codex Responses→Chat proxy behavior instead of guessing via upstream /responses passthrough.

Changes:

  • Route streaming Codex /v1/responses requests directly through /chat/completions conversion.
  • Let the local bridge own the terminal SSE contract: always emit response.completed + [DONE] for converted chat streams.
  • Convert upstream retry exhaustion / stream errors into completed error text instead of response.failed, avoiding Codex CLI reconnect loops.
  • Updated regression tests so streaming Codex requests must not hit upstream /responses, and must not emit response.failed for 403, idle stream, or early-close cases.

Validation run locally:

  • node --test tests/unit/openai-bridge-upstream-responses.test.mjs — 30/30 pass
  • node --test tests/unit/openai-bridge-upstream-responses.test.mjs tests/unit/builtin-proxy-codex-responses.test.mjs — pass
  • npm run test:e2e — pass
  • npm run lint — pass
  • npm run test:unit — All 716 tests passed
  • git diff --check — pass
  • Real Codex CLI smoke test against this branch runtime: codex exec --full-auto --skip-git-repo-check "Reply with exactly OK" returned OK; log grep found no Reconnecting, no response.failed, and no stream closed before response.completed.

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

@awsl233777 Thanks for the detailed update and validation, including the real Codex CLI smoke test. I’ll re-review the latest bridge changes now, with focus on the direct /chat/completions conversion path, the terminal SSE contract, and the no-response.failed Codex streaming behavior.


✏️ Learnings added
Learnt from: awsl233777
URL: https://github.com/SakuraByteCore/codexmate/pull/207

Timestamp: 2026-06-27T12:31:02.827Z
Learning: In `cli/openai-bridge.js`, streaming Codex `POST /v1/responses` requests are expected to bypass upstream `/v1/responses` passthrough and use local `/chat/completions` → Responses conversion instead; for these converted chat streams, the bridge should own the terminal SSE contract by emitting `response.completed` plus `data: [DONE]`, and should surface early upstream failure/retry exhaustion as completed error text rather than `response.failed` to avoid Codex CLI reconnect loops.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cli/openai-bridge.js (1)

2395-2408: 🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift

Stop retrying once SSE output has started.

streamChatCompletionsAsResponsesSse writes response.created before the upstream stream can time out, but the timeout handler still returns { ok: false, error: 'timeout' }. That makes retryTransientRequest re-run the stream on the same res after the response has already been committed. Guard the timeout path with a started flag and finalize the SSE stream instead of retrying once output has begun.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/openai-bridge.js` around lines 2395 - 2408, The retry loop around
retryTransientRequest in the streamChatCompletionsAsResponsesSse call is
re-running after SSE output has already started, which can write to the same res
twice. Add a started/output-begun guard in streamChatCompletionsAsResponsesSse
so the timeout path returns a non-retryable finalization once response.created
or any SSE data has been sent, and only allow { ok: false, error: 'timeout' }
before streaming begins. Use the existing streamChatCompletionsAsResponsesSse
and retryTransientRequest flow to stop retries after the first committed chunk
and finalize the SSE stream instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@cli/openai-bridge.js`:
- Around line 2395-2408: The retry loop around retryTransientRequest in the
streamChatCompletionsAsResponsesSse call is re-running after SSE output has
already started, which can write to the same res twice. Add a
started/output-begun guard in streamChatCompletionsAsResponsesSse so the timeout
path returns a non-retryable finalization once response.created or any SSE data
has been sent, and only allow { ok: false, error: 'timeout' } before streaming
begins. Use the existing streamChatCompletionsAsResponsesSse and
retryTransientRequest flow to stop retries after the first committed chunk and
finalize the SSE stream instead.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 167615e1-e04e-470c-9643-c984a3f64146

📥 Commits

Reviewing files that changed from the base of the PR and between 6b5dcc2 and 54d4b81.

📒 Files selected for processing (2)
  • cli/openai-bridge.js
  • tests/unit/openai-bridge-upstream-responses.test.mjs
📜 Review details
🔇 Additional comments (6)
tests/unit/openai-bridge-upstream-responses.test.mjs (3)

177-179: await bridge.close() / await upstream.close() don't await socket drain (http.Server.close is callback-based, not a promise), so this teardown can leave open handles and make the suite flaky. This is the same issue previously raised, and it now recurs in the new tests (Lines 177-179, 308-311, 439-442).


182-367: LGTM!


376-753: LGTM!

cli/openai-bridge.js (3)

1727-1734: LGTM!


2382-2394: LGTM!


1134-1163: 🩺 Stability & Availability

No missing SSE terminator here. sendResponsesSse already emits data: [DONE] after response.completed, so this error path still closes the stream correctly.

			> Likely an incorrect or invalid review comment.

@awsl233777

Copy link
Copy Markdown
Collaborator Author

Addressed the latest CodeRabbit feedback in ec07f51.

Fixes:

  • Added a committed-output guard in streamChatCompletionsAsResponsesSse.
    • Before any SSE output is committed, timeout/error may still return ok: false so the existing bounded retry flow can retry safely.
    • After downstream SSE has started (response.created / headers committed), timeout/error now finalizes the current SSE stream as completed error text and returns ok: true, preventing replay on the same res.
  • Converted the openai-bridge-upstream-responses test server teardown to promise-based closeServer(...) so cleanup waits for http.Server.close callbacks.
  • Tightened the idle accepted-chat-SSE test to assert the upstream chat request is made exactly once, proving no retry occurs after committed SSE output.

Validation run locally:

  • node --check cli/openai-bridge.js
  • node --check tests/unit/openai-bridge-upstream-responses.test.mjs
  • node --test tests/unit/openai-bridge-upstream-responses.test.mjs — 30/30 pass
  • node --test tests/unit/openai-bridge-upstream-responses.test.mjs tests/unit/builtin-proxy-codex-responses.test.mjs — pass
  • npm run test:e2e — pass
  • npm run lint — pass
  • npm run test:unit — All 716 tests passed
  • git diff --check — pass
  • Real Codex CLI smoke test against this branch runtime: codex exec --full-auto --skip-git-repo-check "Reply with exactly OK" returned OK; log grep found no Reconnecting, no response.failed, and no stream closed before response.completed.

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

@awsl233777 Thanks for the update and the detailed validation. I’ll re-review ec07f51 with focus on the committed-output retry guard in streamChatCompletionsAsResponsesSse, the promise-based server teardown, and the “exactly once” idle SSE assertion.

(•ᴗ•)🐇


✏️ Learnings added
Learnt from: awsl233777
URL: https://github.com/SakuraByteCore/codexmate/pull/207

Timestamp: 2026-06-27T12:44:34.669Z
Learning: In `cli/openai-bridge.js`, `streamChatCompletionsAsResponsesSse` should only allow bounded retry before any downstream Responses SSE output is committed. After headers or `response.created` are committed, timeout/error paths should finalize the current SSE stream as completed error text and return success for that response flow, preventing a retry or replay on the same `res`.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant