feat: add codex executor for OpenAI GPT/Codex via bedrock-mantle #31

Open
sfreudenthaler wants to merge 7 commits into main from feat/codex-mantle-executor

@sfreudenthaler sfreudenthaler commented Jun 9, 2026

Adds a codex executor so openai.* model IDs (openai.gpt-5.5, openai.gpt-5.4) route to OpenAI's GPT/Codex models on AWS. Part of dotCMS/Infrastructure-as-code#7836.

These models are not on bedrock-runtime — there is no InvokeModel/Converse. They are served only by the separate bedrock-mantle endpoint (OpenAI Responses API, https://bedrock-mantle.{region}.api.aws/v1/responses), so the existing generic Converse executor can't reach them.
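As a rough illustration of the endpoint shape above, combined with the us-east-1 to us-east-2 remap this PR applies, here is a minimal sketch. The names `MANTLE_BASE`, `REGION_REMAP`, `resolve_region`, and `responses_url` are hypothetical and are not taken from codex-executor.yml.

```python
# Sketch only: helper names are hypothetical, not the executor's actual code.
MANTLE_BASE = "https://bedrock-mantle.{region}.api.aws/v1"

# The orchestrator defaults to us-east-1, but GPT-5.5/5.4 are served in us-east-2.
REGION_REMAP = {"us-east-1": "us-east-2"}


def resolve_region(requested: str) -> str:
    """Return the region to call, remapping the orchestrator default."""
    return REGION_REMAP.get(requested, requested)


def responses_url(requested_region: str) -> str:
    """Build the Responses API URL (/v1/responses, not /openai/v1/responses)."""
    base = MANTLE_BASE.format(region=resolve_region(requested_region))
    return base + "/responses"
```

An explicit region such as us-west-2 passes through unchanged; only the us-east-1 default is rewritten.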

Consumers change nothing but model_id — the orchestrator routes, the executor handles region/auth/streaming. Same "magic" as the provider-family pattern.
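To make the routing rule concrete, here is a small Python port of the anchored pattern `^([a-z]+\.)?openai\.` that the orchestrator uses. This is a sketch for illustration: the orchestrator itself is a workflow file, and the `pick_executor` function and the `"generic"` label are hypothetical.

```python
import re

# Same anchored pattern as the orchestrator route: an optional lowercase
# prefix segment (for example "us."), then "openai.".
OPENAI_ROUTE = re.compile(r"^([a-z]+\.)?openai\.")


def pick_executor(model_id: str) -> str:
    """Return the job a model_id would route to (labels are illustrative)."""
    # The openai route is checked before the generic fallback.
    if OPENAI_ROUTE.search(model_id):
        return "codex-mantle"
    return "generic"
```

Because the pattern is anchored at the start, an ID that merely contains `openai.` later in the string does not take the mantle route.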

Auth: OpenAI SDK + short-term Bedrock bearer token minted from the OIDC session (no long-lived secret, ≤1h lifetime, nothing to clean up). See the decision + token-lifecycle notes on #7836.
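A minimal sketch of that auth path, under stated assumptions: it assumes the `aws-bedrock-token-generator` package exposes `provide_token(region=...)` and that it picks up the OIDC-assumed-role credentials from the default AWS credential chain. `make_mantle_client` is a hypothetical name, and the real executor may differ.

```python
def make_mantle_client(region: str):
    """Build an OpenAI SDK client authenticated with a short-term bearer token.

    Imports are local so this sketch can be read (and loaded) without the
    third-party packages installed.
    """
    # Assumption: provide_token(region=...) mints a short-term token from the
    # ambient AWS credentials (here, the OIDC-assumed-role session).
    from aws_bedrock_token_generator import provide_token
    from openai import OpenAI

    token = provide_token(region=region)
    # The token stays in memory only: it is passed straight to the SDK and is
    # never written to env, disk, or logs.
    return OpenAI(
        api_key=token,
        base_url=f"https://bedrock-mantle.{region}.api.aws/v1",
    )
```

Passing the token as `api_key` lets the SDK send it as a bearer credential, which is why no SigV4 signing is needed on this path.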

What changed

  • codex-executor.yml (new):
    • Calls mantle with the OpenAI SDK authenticated by a short-term Bedrock bearer token minted in-process from the OIDC-assumed-role session (aws-bedrock-token-generator). The SDK can't consume SigV4, but a short-term key keeps the OIDC-only posture: derived from the STS session, no long-lived secret, not a stored resource (nothing to clean up), expires with the role session (≤1h), never written to env/disk/logs.
    • Streams SSE and accumulates response.output_text.delta. Streaming is mandatory — GPT-5.x reasons before emitting, so a non-streaming call buffers and looks like a 60–100s hang.
    • Remaps the orchestrator's us-east-1 default → us-east-2, where GPT-5.5/5.4 are served. The mantle endpoint exists in us-east-1, but the models are not there yet — verified live via the Models API (us-east-1 lists gpt-oss but no gpt-5*). GPT-5.4 also accepts explicit us-west-2.
    • Sends store: false for zero data retention — the Responses API otherwise defaults store: true (retains input+output 30 days in-region for previous_response_id chaining), which single-shot review doesn't need.
    • Reuses the /tmp sticky-comment helper + sticky_namespace. Adds reasoning_effort (default medium). max_output_tokens caps only the visible answer, not reasoning tokens.
    • Dependencies (openai + aws-bedrock-token-generator) are declared as PEP 723 inline script metadata and run via uv run (with astral-sh/setup-uv, cache enabled) — ephemeral, cached env; no system pip install.
  • claude-orchestrator.yml: anchored ^([a-z]+\.)?openai\. route → openai-mantle → codex-mantle job, checked before the generic fallback. Optional reasoning_effort pass-through.
  • CLAUDE.md / ARCHITECTURE.md: routing tables, executor docs, mermaid diagram.

Validation

  • actionlint (rhysd/actionlint:1.7.7) → exit 0
  • YAML parses (orchestrator + new executor)
  • Embedded Python compiles (py_compile)
  • Live-verified against the R&D account Models API: correct path is /v1 (not /openai/v1, which 404s), and GPT-5.5/5.4 are present in us-east-2 only. The raw SigV4-against-mantle primitive was proven in the auth spike (dotCMS/Infrastructure-as-code#7836). Pending: a full end-to-end executor run on a real PR (validates the exact Responses-API SSE event shapes + the narrow IAM grant) — see #7836 sequencing. Recommend piloting on dotCMS/core.

Ships with

dotCMS/Infrastructure-as-code IAM PR #7842 (bedrock-mantle:CreateInference + CallWithBearerToken scoped to SHORT_TERM) — merge/apply as a pair so IAM is never open without an execution path.

🤖 Generated with Claude Code

sfreudenthaler and others added 5 commits June 9, 2026 18:23
Route openai.* model_ids (openai.gpt-5.5, openai.gpt-5.4) to a new
codex-executor. These models are served only by the bedrock-mantle
endpoint (OpenAI Responses API), not bedrock-runtime, so the generic
Converse executor can't reach them.

- codex-executor.yml: SigV4-signed (service name "bedrock", no bearer
  token) STREAMING call to the Responses API; accumulates SSE
  response.output_text.delta. Remaps us-east-1 -> us-east-2 (mantle
  region). Reuses the /tmp sticky-comment helper + sticky_namespace.
  reasoning_effort input (default medium). botocore installed at runtime
  (AWS CLI v2 bundle doesn't expose it to system python).
- claude-orchestrator.yml: anchored ^([a-z]+\.)?openai\. route ->
  openai-mantle -> codex-mantle job (checked before the generic
  fallback). Optional reasoning_effort pass-through. Consumers change
  only model_id.
- CLAUDE.md / ARCHITECTURE.md: routing tables, executor docs, diagram.

IAM provisioned in dotCMS/infrastructure-as-code#7836. Streaming is
mandatory (GPT-5.x reasons before emitting); max_output_tokens does not
cap reasoning tokens. Auth posture confirmed by the spike on #7836.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ationale

Verified live against the R&D account Models API:
- Endpoint path is /v1/responses, NOT /openai/v1/responses (the latter
  404s). Fixed the endpoint URL + header/docs.
- The bedrock-mantle endpoint IS available in us-east-1, but GPT-5.5/5.4
  are served only in us-east-2 (us-east-1 lists gpt-oss but no gpt-5*).
  The us-east-1 -> us-east-2 remap stays (routes to where the models
  live), but the rationale comments are corrected (it's model
  availability, not endpoint availability).
- Send store=false for zero data retention. The Responses API defaults
  store=true, which retains input+output 30 days in-region for
  previous_response_id chaining; single-shot review doesn't need state
  and shouldn't retain the diff.
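
For illustration, the request options discussed in this PR could be assembled roughly as below. This is a sketch, not the executor's code: `build_request` and the 4096 token cap are hypothetical, and passing `model_id` through unchanged is an assumption.

```python
from typing import Any, Dict


def build_request(
    model_id: str,
    prompt: str,
    reasoning_effort: str = "medium",
    max_output_tokens: int = 4096,  # hypothetical cap, not from the PR
) -> Dict[str, Any]:
    """Keyword arguments for a single-shot, streamed Responses API call."""
    return {
        "model": model_id,  # assumption: the ID is sent as-is
        "input": prompt,
        "stream": True,  # mandatory: avoids the long buffered wait
        "store": False,  # zero data retention; the API default is store=true
        "reasoning": {"effort": reasoning_effort},
        # Caps only the visible answer, not reasoning tokens.
        "max_output_tokens": max_output_tokens,
    }
```

The resulting dict would be splatted into `client.responses.create(**kwargs)`.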

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per the auth-path decision, move the executor from hand-rolled SigV4 +
urllib to the OpenAI SDK. The SDK can't consume SigV4, so authenticate
with a SHORT-TERM Bedrock bearer token minted in-process from the
OIDC-assumed-role session via aws-bedrock-token-generator. The token is
OIDC-derived (no long-lived secret), expires with the role session
(<=1h), is not a stored resource (nothing to clean up), and is never
written to env/disk/logs.

- Replace the SigV4 helper with client.responses.create(stream=True);
  accumulate response.output_text.delta, read usage off response.completed.
- Install openai + aws-bedrock-token-generator at runtime (was botocore).
- Keep store=false, reasoning_effort, region remap, sticky comment.
- Docs (CLAUDE.md/ARCHITECTURE.md) updated to the SDK + bearer path.

Requires bedrock-mantle:CallWithBearerToken (SHORT_TERM) in the IAM PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Run the mantle review script via `uv run` with dependencies declared as
PEP 723 inline script metadata, instead of a system `pip install`. Adds
an astral-sh/setup-uv step (with cache). uv provisions openai +
aws-bedrock-token-generator (and Python) into an ephemeral cached env;
no system-python pollution, faster and reproducible across runs.
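
A minimal sketch of what PEP 723 inline metadata looks like at the top of a script run with `uv run`. The file below is a hypothetical stand-in, not the real mantle_review.py, and the `requires-python` bound is an assumption.

```python
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "openai",
#     "aws-bedrock-token-generator",
# ]
# ///
"""Hypothetical stand-in for the review script; the Python bound is assumed."""
from importlib.util import find_spec

# Import names of the two declared dependencies.
REQUIRED_MODULES = ("openai", "aws_bedrock_token_generator")


def missing_modules() -> list:
    """List declared dependencies that are not importable in this environment."""
    return [name for name in REQUIRED_MODULES if find_spec(name) is None]


def main() -> None:
    # Under `uv run script.py`, uv reads the metadata block above and
    # provisions the dependencies first, so this list should be empty.
    print("missing:", missing_modules())


if __name__ == "__main__":
    main()
```

Run with plain `python`, the script reports whichever packages are absent; run with `uv run`, the comment block is what drives the ephemeral environment.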

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… file)

Document that inlining mantle_review.py to /tmp is deliberate: this is a
cross-repo reusable workflow, so actions/checkout pulls the consumer's
repo and a relative local action/script reference resolves against the
consumer, not ai-workflows. Shipping it as a real file would need a
fully-qualified composite action or a self-checkout at a pinned ref —
avoided to keep the executor self-contained and version-locked.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Sadly, all the inline scripts have to stay because of the way uses: and relative paths work in GitHub Actions when using reusable workflows. The tl;dr is that they resolve against the caller's local repo/checkout, not the ai-workflows one, so files outside the action wouldn't be reachable without some ugly kludges.

Created a new issue, #32, to explore moving to a full-blown GitHub App approach.

@sfreudenthaler sfreudenthaler marked this pull request as ready for review June 10, 2026 00:35
@sfreudenthaler sfreudenthaler requested review from a team as code owners June 10, 2026 00:35
@sfreudenthaler sfreudenthaler requested a review from a team as a code owner June 10, 2026 14:39