feat: add codex executor for OpenAI GPT/Codex via bedrock-mantle #31
Open
sfreudenthaler wants to merge 7 commits into
Route openai.* model_ids (openai.gpt-5.5, openai.gpt-5.4) to a new codex-executor. These models are served only by the bedrock-mantle endpoint (OpenAI Responses API), not bedrock-runtime, so the generic Converse executor can't reach them.

- codex-executor.yml: SigV4-signed (service name "bedrock", no bearer token) STREAMING call to the Responses API; accumulates SSE response.output_text.delta. Remaps us-east-1 -> us-east-2 (mantle region). Reuses the /tmp sticky-comment helper + sticky_namespace. reasoning_effort input (default medium). botocore installed at runtime (AWS CLI v2 bundle doesn't expose it to system python).
- claude-orchestrator.yml: anchored ^([a-z]+\.)?openai\. route -> openai-mantle -> codex-mantle job (checked before the generic fallback). Optional reasoning_effort pass-through. Consumers change only model_id.
- CLAUDE.md / ARCHITECTURE.md: routing tables, executor docs, diagram.

IAM provisioned in dotCMS/infrastructure-as-code#7836. Streaming is mandatory (GPT-5.x reasons before emitting); max_output_tokens does not cap reasoning tokens. Auth posture confirmed by the spike on #7836.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
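The anchored route can be sketched in Python to show which model IDs it catches. This is only an illustration: the regex is taken from the commit message, while `pick_route` and the `"generic"` label are hypothetical names, and the orchestrator itself may well do this match in shell or workflow expressions instead.

```python
import re

# Regex copied from the commit message: an optional lowercase prefix
# segment (e.g. "us.") followed by the literal "openai." at the start.
OPENAI_ROUTE = re.compile(r"^([a-z]+\.)?openai\.")


def pick_route(model_id: str) -> str:
    """Return a route label for a model ID (hypothetical helper)."""
    # The openai check runs first, mirroring "checked before the generic fallback".
    if OPENAI_ROUTE.match(model_id):
        return "openai-mantle"
    return "generic"  # placeholder label for the fallback route
```

Because the pattern is anchored with `^`, an ID that merely contains `openai.` somewhere in the middle does not match and falls through to the fallback.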
…ationale

Verified live against the R&D account Models API:

- Endpoint path is /v1/responses, NOT /openai/v1/responses (the latter 404s). Fixed the endpoint URL + header/docs.
- The bedrock-mantle endpoint IS available in us-east-1, but GPT-5.5/5.4 are served only in us-east-2 (us-east-1 lists gpt-oss but no gpt-5*). The us-east-1 -> us-east-2 remap stays (routes to where the models live), but the rationale comments are corrected (it's model availability, not endpoint availability).
- Send store=false for zero data retention. The Responses API defaults store=true, which retains input+output 30 days in-region for previous_response_id chaining; single-shot review doesn't need state and shouldn't retain the diff.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
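A minimal sketch of the remap, endpoint path, and retention flag described above. The URL pattern and the us-east-1 -> us-east-2 remap come from this PR; the function names, the dictionary layout, and the `input` field carrying the prompt are assumptions for illustration, not the executor's actual code.

```python
# Remap taken from the PR: GPT-5.5/5.4 are served in us-east-2, not us-east-1.
MANTLE_REGION_REMAP = {"us-east-1": "us-east-2"}


def mantle_base_url(region: str) -> str:
    """Build the bedrock-mantle base URL (hypothetical helper name)."""
    region = MANTLE_REGION_REMAP.get(region, region)
    # The path is /v1, not /openai/v1 (the latter 404s per the commit message).
    return f"https://bedrock-mantle.{region}.api.aws/v1"


def review_request(model_id: str, prompt: str) -> dict:
    """Assemble request fields for a single-shot review (assumed shape)."""
    return {
        "model": model_id,
        "input": prompt,
        "store": False,  # opt out of the default 30-day retention
        "stream": True,  # streaming is mandatory for GPT-5.x
    }
```

Regions outside the remap table pass through unchanged, which matches the note elsewhere in the PR that GPT-5.4 also accepts an explicit us-west-2.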
Per the auth-path decision, move the executor from hand-rolled SigV4 + urllib to the OpenAI SDK. The SDK can't consume SigV4, so authenticate with a SHORT-TERM Bedrock bearer token minted in-process from the OIDC-assumed-role session via aws-bedrock-token-generator. The token is OIDC-derived (no long-lived secret), expires with the role session (<=1h), is not a stored resource (nothing to clean up), and is never written to env/disk/logs.

- Replace the SigV4 helper with client.responses.create(stream=True); accumulate response.output_text.delta, read usage off response.completed.
- Install openai + aws-bedrock-token-generator at runtime (was botocore).
- Keep store=false, reasoning_effort, region remap, sticky comment.
- Docs (CLAUDE.md/ARCHITECTURE.md) updated to the SDK + bearer path.

Requires bedrock-mantle:CallWithBearerToken (SHORT_TERM) in the IAM PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
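The accumulation step can be sketched independently of the SDK by treating the stream as an iterable of event objects. The event type names come from the commit message; `accumulate` is a hypothetical helper, and the stand-in events below only mimic the attributes the loop reads (`type`, `delta`, `response.usage`). The PR itself notes that the exact SSE event shapes are still pending an end-to-end run.

```python
from types import SimpleNamespace


def accumulate(events):
    """Join streamed text deltas and capture usage from the final event."""
    parts = []
    usage = None
    for event in events:
        if event.type == "response.output_text.delta":
            parts.append(event.delta)
        elif event.type == "response.completed":
            usage = event.response.usage
        # Any other event type (e.g. reasoning progress) is ignored here.
    return "".join(parts), usage


# Stand-in events for illustration only; a real run would iterate the
# object returned by client.responses.create(stream=True).
demo_events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="LG"),
    SimpleNamespace(type="response.output_text.delta", delta="TM"),
    SimpleNamespace(
        type="response.completed",
        response=SimpleNamespace(usage={"output_tokens": 2}),
    ),
]
text, usage = accumulate(demo_events)
```

Collecting parts in a list and joining once at the end avoids repeated string concatenation on long reviews.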
Run the mantle review script via `uv run` with dependencies declared as PEP 723 inline script metadata, instead of a system `pip install`. Adds an astral-sh/setup-uv step (with cache). uv provisions openai + aws-bedrock-token-generator (and Python) into an ephemeral cached env; no system-python pollution, faster and reproducible across runs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… file)

Document that inlining mantle_review.py to /tmp is deliberate: this is a cross-repo reusable workflow, so actions/checkout pulls the consumer's repo and a relative local action/script reference resolves against the consumer, not ai-workflows. Shipping it as a real file would need a fully-qualified composite action or a self-checkout at a pinned ref; that is avoided to keep the executor self-contained and version-locked.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sfreudenthaler (Member, Author) commented Jun 10, 2026
Sadly, all the inline scripts have to stay because of the way uses: and relative paths work in GitHub Actions reusable workflows. The tl;dr is that they resolve against the caller's local repo/checkout, not the ai-workflows one, so files outside the workflow wouldn't be reachable without some ugly kludges.
Created a new issue, #32, to explore moving to a full-blown GitHub App approach.
Adds a codex executor so openai.* model IDs (openai.gpt-5.5, openai.gpt-5.4) route to OpenAI's GPT/Codex models on AWS. Part of dotCMS/Infrastructure-as-code#7836.

These models are not on bedrock-runtime: there is no InvokeModel/Converse. They are served only by the separate bedrock-mantle endpoint (OpenAI Responses API, https://bedrock-mantle.{region}.api.aws/v1/responses), so the existing generic Converse executor can't reach them.

What changed

codex-executor.yml (new):
- Calls the Responses API through the OpenAI SDK, authenticated with a short-term Bedrock bearer token minted in-process (aws-bedrock-token-generator). The SDK can't consume SigV4, but a short-term key keeps the OIDC-only posture: derived from the STS session, no long-lived secret, not a stored resource (nothing to clean up), expires with the role session (≤1h), never written to env/disk/logs.
- Streams the response and accumulates response.output_text.delta. Streaming is mandatory: GPT-5.x reasons before emitting, so a non-streaming call buffers and looks like a 60–100s hang.
- Remaps the us-east-1 default → us-east-2, where GPT-5.5/5.4 are served. The mantle endpoint exists in us-east-1, but the models are not there yet; this was verified live via the Models API (us-east-1 lists gpt-oss but no gpt-5*). GPT-5.4 also accepts explicit us-west-2.
- Sends store: false for zero data retention. The Responses API otherwise defaults to store: true (retains input+output 30 days in-region for previous_response_id chaining), which single-shot review doesn't need.
- Reuses the /tmp sticky-comment helper + sticky_namespace. Adds reasoning_effort (default medium). max_output_tokens caps only the visible answer, not reasoning tokens.
- Dependencies (openai + aws-bedrock-token-generator) are declared as PEP 723 inline script metadata and run via uv run (with astral-sh/setup-uv, cache enabled): an ephemeral, cached env with no system pip install.

claude-orchestrator.yml: anchored ^([a-z]+\.)?openai\. route → openai-mantle → codex-mantle job, checked before the generic fallback. Optional reasoning_effort pass-through.

Validation

- actionlint (rhysd/actionlint:1.7.7) → exit 0
- Inline Python script syntax check (py_compile)
- Verified live against the Models API: the endpoint path is under /v1 (not /openai/v1, which 404s), and GPT-5.5/5.4 are present in us-east-2 only. The raw SigV4-against-mantle primitive was proven in the auth spike (dotCMS/Infrastructure-as-code#7836).
- Pending: a full end-to-end executor run on a real PR (validates the exact Responses-API SSE event shapes + the narrow IAM grant); see #7836 sequencing. Recommend piloting on dotCMS/core.

Ships with

dotCMS/Infrastructure-as-code IAM PR #7842 (bedrock-mantle:CreateInference + CallWithBearerToken scoped to SHORT_TERM). Merge/apply as a pair so IAM is never open without an execution path.

🤖 Generated with Claude Code