Skip to content

[pull] main from openai:main#58

Open
pull[bot] wants to merge 579 commits intokontext-dev:mainfrom
openai:main
Open

[pull] main from openai:main#58
pull[bot] wants to merge 579 commits intokontext-dev:mainfrom
openai:main

Conversation

@pull
Copy link
Copy Markdown

@pull pull bot commented Mar 12, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot locked and limited conversation to collaborators Mar 12, 2026
@pull pull bot added ⤵️ pull merge-conflict Sync PR has merge conflicts labels Mar 12, 2026
xl-openai and others added 27 commits March 23, 2026 12:57
Support disabling skills by name, primarily for plugin skills. We can’t
use the path, since plugin skill paths may change across versions.
## Summary
- route /realtime, Ctrl+C, and deleted realtime meters through the same
realtime stop path
- keep generic transcription placeholder cleanup free of realtime
shutdown side effects

## Testing
- Ran 
- Relied on CI for verification; did not run local tests

---------

Co-authored-by: Codex <noreply@openai.com>
Add `InterAgentCommunication` for v2 agent communication
…port (#15211)

- add `PreToolUse` hook for bash-like tool execution only at first
- block shell execution before dispatch with deny-only hook behavior
- introduces common.rs matcher framework for matching when hooks are run

example run:

```
› run three parallel echo commands, and the second one should echo "[block-pre-tool-use]" as a test


• Running the three echo commands in parallel now and I’ll report the output directly.

• Running PreToolUse hook: name for demo pre tool use hook

• Running PreToolUse hook: name for demo pre tool use hook

• Running PreToolUse hook: name for demo pre tool use hook

PreToolUse hook (completed)
  warning: wizard-tower PreToolUse demo inspected Bash: echo "first parallel echo"
  
PreToolUse hook (blocked)
  warning: wizard-tower PreToolUse demo blocked a Bash command on purpose.
  feedback: PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.

PreToolUse hook (completed)
  warning: wizard-tower PreToolUse demo inspected Bash: echo "third parallel echo"

• Ran echo "first parallel echo"
  └ first parallel echo

• Ran echo "third parallel echo"
  └ third parallel echo

• Three little waves went out in parallel.

  1. printed first parallel echo
  2. was blocked before execution because it contained the exact test string [block-pre-tool-use]
  3. printed third parallel echo

  There was also an unrelated macOS defaults warning around the successful commands, but the echoes
  themselves worked fine. If you want, I can rerun the second one with a slightly modified string so
  it passes cleanly.
```
Use `serde` to encode the inter agent communication to an assistant
message and use the decode to see if this is such a message

Note: this assume serde on small pattern is fast enough
## Summary
Adds support for approvals_reviewer to `Op::UserTurn` so we can migrate
`[CodexMessageProcessor::turn_start]` to use Op::UserTurn

## Testing
- [x] Adds quick test for the new field

Co-authored-by: Codex <noreply@openai.com>
## What changed
- adds a targeted snapshot test for rollback with contextual diffs in
`codex_tests.rs`
- snapshots the exact model-visible request input before the rolled-back
turn and on the follow-up request after rollback
- shows the duplicate developer and environment context pair appearing
again before the follow-up user message

## Why
Rollback currently rewinds the reference context baseline without
rewinding the live session overrides. On the next turn, the same
contextual diff is emitted again and duplicated in the request sent to
the model.

## Impact
- makes the regression visible in a canonical snapshot test
- keeps the snapshot on the shared `context_snapshot` path without
adding new formatting helpers
- gives a direct repro for future fixes to rollback/context
reconstruction

---------

Co-authored-by: Codex <noreply@openai.com>
Custom watcher that sends an InterAgentCommunication on end of turn
The new wait tool just returns `Wait timed out.` or `Wait completed.`.
The actual content is done through the notification watcher
## Summary
- add `ForkSnapshotMode` to `ThreadManager::fork_thread` so callers can
request either a committed snapshot or an interrupted snapshot
- share the model-visible `<turn_aborted>` history marker between the
live interrupt path and interrupted forks
- update the small set of direct fork callsites to pass
`ForkSnapshotMode::Committed`

Note: this enables /btw to work similarly as Esc to interrupt (hopefully
somewhat in distribution)

---------

Co-authored-by: Codex <noreply@openai.com>
## Summary
- add a new `codex-sandboxing` crate for sandboxing extraction work
- move the pure Linux sandbox argv builders and their unit tests out of
`codex-core`
- keep `core::landlock` as the spawn wrapper and update direct callers
to use `codex_sandboxing::landlock`

## Testing
- `cargo test -p codex-sandboxing`
- `cargo test -p codex-core landlock`
- `cargo test -p codex-cli debug_sandbox`
- `just argument-comment-lint`

## Notes
- this is step 1 of the move plan aimed at minimizing per-PR diffs
- no re-exports or no-op proxy methods were added
## Summary
- move macOS permission merging/intersection logic and tests from
`codex-core` into `codex-sandboxing`
- move seatbelt policy builders, permissions logic, SBPL assets, and
their tests into `codex-sandboxing`
- keep `codex-core` owning only the seatbelt spawn wrapper and switch
call sites to import the moved APIs directly

## Notes
- no re-exports added
- moved the seatbelt tests with the implementation so internal helpers
could stay private
- local verification is still finishing while this PR is open
…n error returned (#15478)

## Summary
- update the self-serve business usage-based limit message to direct
users to their admin for additional credits
- add a focused unit test for the self_serve_business_usage_based plan
branch

Added also: 

If you are at a rate limit but you still have credits, codex cli would
tell you to switch the model. We shouldnt do this if you have credits so
fixed this.

## Test
- launched the source-built CLI and verified the updated message is
shown for the self-serve business usage-based plan

![Test
screenshot](https://raw.githubusercontent.com/openai/codex/5cc3c013ef17ac5c66dfd9395c0d3c4837602231/docs/images/self-serve-business-usage-limit.png)
Add imagegen skill as built-in skill. Source: github.com/openai/skills
## Summary
- move the pure sandbox policy transform helpers from `codex-core` into
`codex-sandboxing`
- move the corresponding unit tests with the extracted implementation
- update `core` and `app-server` callers to import the moved APIs
directly, without re-exports or proxy methods

## Testing
- cargo test -p codex-sandboxing
- cargo test -p codex-core sandboxing
- cargo test -p codex-app-server --lib
- just fix -p codex-sandboxing
- just fix -p codex-core
- just fix -p codex-app-server
- just fmt
- just argument-comment-lint
Show all plugin marketplaces in the /plugins popup by removing the
`openai-curated` marketplace filter, and update plugin popup
copy/tests/snapshots to match the new behavior in both TUI codepaths.
## Summary
- raise the shell snapshot apply_patch helper timeout to avoid macOS CI
startup races
- increase the shared MCP app-server test read timeout so slow
initialize handshakes do not fail command_exec tests spuriously

## Testing
- cargo test -p codex-core
shell_command_snapshot_still_intercepts_apply_patch
- cargo test -p codex-app-server
command_exec_tty_implies_streaming_and_reports_pty_output

Co-authored-by: Codex <noreply@openai.com>
Add a `list_agents` for multi-agent v2, optionally path based

This return the task and status of each agent in the matched path
## Problem

Today `codex-network-proxy` rejects a global `*` in
`network.allowed_domains`, so there is no static way to configure a
denylist-only posture for public hosts. Users have to enumerate broad
allowlist patterns instead.

## Approach

- Make global wildcard acceptance field-specific: `allowed_domains` can
use `*`, while `denied_domains` still rejects a global wildcard.
- Keep the existing evaluation order, so explicit denies still win first
and local/private protections still apply unless separately enabled.
- Add coverage for the denylist-only behavior and update the README to
document it.

## Validation

- `just fmt`
- `cargo test -p codex-network-proxy` (full run had one unrelated flaky
telemetry test:
`network_policy::tests::emit_block_decision_audit_event_emits_non_domain_event`;
reran in isolation and it passed)
- `cargo test -p codex-network-proxy
network_policy::tests::emit_block_decision_audit_event_emits_non_domain_event
-- --exact --nocapture`
- `just fix -p codex-network-proxy`
- `just argument-comment-lint`
This PR completes the conversion of non-interactive `codex exec` to use
app server rather than directly using core events and methods.

### Summary
- move `codex-exec` off exec-owned `AuthManager` and `ThreadManager`
state
- route exec bootstrap, resume, and auth refresh through existing
app-server paths
- replace legacy `codex/event/*` decoding in exec with typed app-server
notification handling
- update human and JSONL exec output adapters to translate existing
app-server notifications only
- clean up "app server client" layer by eliminating support for legacy
notifications; this is no longer needed
- remove exposure of `authManager` and `threadManager` from "app server
client" layer

### Testing
- `exec` has pretty extensive unit and integration tests already, and
these all pass
- In addition, I asked Codex to put together a comprehensive manual set
of tests to cover all of the `codex exec` functionality (including
command-line options), and it successfully generated and ran these tests
etraut-openai and others added 30 commits March 29, 2026 17:54
## Summary
A Windows-only snapshot assertion in the app-server MCP startup warning
test compared the raw rendered path, so CI saw `C:\tmp\project` instead
of the normalized `/tmp/project` snapshot fixture.

## Fix
Route that snapshot assertion through the existing
`normalize_snapshot_paths(...)` helper so the test remains
platform-stable.
Add a mailbox we can use for inter-agent communication
`wait` is now based on it and don't take target anymore
## Why

`core/src/tools/spec.rs` still owned the pure `tool_search` and
`tool_suggest` spec builders even though that logic no longer needed
`codex-core` runtime state. This change continues the `codex-tools`
migration by moving the reusable discovery and suggestion spec
construction out of `codex-core` so `spec.rs` is left with the
core-owned policy decisions about when these tools are exposed and what
metadata is available.

## What changed

- Added `codex-rs/tools/src/tool_discovery.rs` with the shared
`tool_search` and `tool_suggest` spec builders, plus focused unit tests
in `tool_discovery_tests.rs`.
- Moved the shared `DiscoverableToolAction` and `DiscoverableToolType`
declarations into `codex-tools` so the `tool_suggest` handler and the
extracted spec builders use the same wire-model enums.
- Updated `core/src/tools/spec.rs` to translate `ToolInfo` and
`DiscoverableTool` values into neutral `codex-tools` inputs and delegate
the actual spec building there.
- Removed the old template-based description rendering helpers from
`core/src/tools/spec.rs` and deleted the now-dead helper methods in
`core/src/tools/discoverable.rs`.
- Updated `codex-rs/tools/README.md` to document that discovery and
suggestion models/spec builders now live in `codex-tools`.

## Test plan

- `cargo test -p codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-discovery-specs cargo test -p
codex-core --lib tools::spec::`
- `CARGO_TARGET_DIR=/tmp/codex-core-discovery-specs cargo test -p
codex-core --lib tools::handlers::tool_suggest::`
- `just argument-comment-lint`

## References

- #16154
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
- #16138
- #16141
## Why

`#16193` moved the pure `tool_search` and `tool_suggest` spec builders
into `codex-tools`, but `codex-core` still owned the shared
discoverable-tool model that those builders and the `tool_suggest`
runtime both depend on. This change continues the migration by moving
that reusable model boundary out of `codex-core` as well, so the
discovery/suggestion stack uses one shared set of types and
`core/src/tools` no longer needs its own `discoverable.rs` module.

## What changed

- Moved `DiscoverableTool`, `DiscoverablePluginInfo`, and
`filter_tool_suggest_discoverable_tools_for_client()` into
`codex-rs/tools/src/tool_discovery.rs` alongside the extracted
discovery/suggestion spec builders.
- Added `codex-app-server-protocol` as a `codex-tools` dependency so the
shared discoverable-tool model can own the connector-side `AppInfo`
variant directly.
- Updated `core/src/tools/handlers/tool_suggest.rs`,
`core/src/tools/spec.rs`, `core/src/tools/router.rs`,
`core/src/connectors.rs`, and `core/src/codex.rs` to consume the shared
`codex-tools` model instead of the old core-local declarations.
- Changed `core/src/plugins/discoverable.rs` to return
`DiscoverablePluginInfo` directly, moved the pure client-filter coverage
into `tool_discovery_tests.rs`, and deleted the old
`core/src/tools/discoverable.rs` module.
- Updated `codex-rs/tools/README.md` so the crate boundary documents
that `codex-tools` now owns the discoverable-tool models in addition to
the discovery/suggestion spec builders.

## Test plan

- `cargo test -p codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
codex-core --lib tools::handlers::tool_suggest::`
- `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
codex-core --lib tools::spec::`
- `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
codex-core --lib plugins::discoverable::`
- `just bazel-lock-check`
- `just argument-comment-lint`

## References

- #16193
- #16154
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
- #16138
- #16141
## Why

The Bazel-backed `argument-comment-lint` CI path had two gaps:

- Bazel wildcard target expansion skipped inline unit-test crates from
`src/` modules because the generated `*-unit-tests-bin` `rust_test`
targets are tagged `manual`.
- `argument-comment-mismatch` was still only a warning in the Bazel and
packaged-wrapper entrypoints, so a typoed `/*param_name*/` comment could
still pass CI even when the lint detected it.

That left CI blind to real linux-sandbox examples, including the missing
`/*local_port*/` comment in
`codex-rs/linux-sandbox/src/proxy_routing.rs` and typoed argument
comments in `codex-rs/linux-sandbox/src/landlock.rs`.

## What Changed

- Added `tools/argument-comment-lint/list-bazel-targets.sh` so Bazel
lint runs cover `//codex-rs/...` plus the manual `rust_test`
`*-unit-tests-bin` targets.
- Updated `just argument-comment-lint`, `rust-ci.yml`, and
`rust-ci-full.yml` to use that helper.
- Promoted both `argument-comment-mismatch` and
`uncommented-anonymous-literal-argument` to errors in every strict
entrypoint:
  - `tools/argument-comment-lint/lint_aspect.bzl`
  - `tools/argument-comment-lint/src/bin/argument-comment-lint.rs`
  - `tools/argument-comment-lint/wrapper_common.py`
- Added wrapper/bin coverage for the stricter lint flags and documented
the behavior in `tools/argument-comment-lint/README.md`.
- Fixed the now-covered callsites in
`codex-rs/linux-sandbox/src/proxy_routing.rs`,
`codex-rs/linux-sandbox/src/landlock.rs`, and
`codex-rs/core/src/shell_snapshot_tests.rs`.

This keeps the Bazel target expansion narrow while making the Bazel and
prebuilt-linter paths enforce the same strict lint set.

## Verification

- `python3 -m unittest discover -s tools/argument-comment-lint -p
'test_*.py'`
- `cargo +nightly-2025-09-18 test --manifest-path
tools/argument-comment-lint/Cargo.toml`
- `just argument-comment-lint`
)

- rework codex analytics crate to use reducer / publish architecture
- in anticipation of extensive codex analytics
## Why

Follow-up to #16106.

`argument-comment-lint` already runs as a native Bazel aspect on Linux
and macOS, but Windows is still the long pole in `rust-ci`. To move
Windows onto the same native Bazel lane, the toolchain split has to let
exec-side helper binaries build in an MSVC environment while still
linting repo crates as `windows-gnullvm`.

Pushing the Windows lane onto the native Bazel path exposed a second
round of Windows-only issues in the mixed exec-toolchain plumbing after
the initial wrapper/target fixes landed.

## What Changed

- keep the Windows lint lanes on the native Bazel/aspect path in
`rust-ci.yml` and `rust-ci-full.yml`
- add a dedicated `local_windows_msvc` platform for exec-side helper
binaries while keeping `local_windows` as the `windows-gnullvm` target
platform
- patch `rules_rust` so `repository_set(...)` preserves explicit
exec-platform constraints for the generated toolchains, keep the
Windows-specific bootstrap/direct-link fixes needed for the nightly lint
driver, and expose exec-side `rustc-dev` `.rlib`s to the MSVC sysroot
- register the custom Windows nightly toolchain set with MSVC exec
constraints while still exposing both `x86_64-pc-windows-msvc` and
`x86_64-pc-windows-gnullvm` targets
- enable `dev_components` on the custom Windows nightly repository set
so the MSVC exec helper toolchain actually downloads the
compiler-internal crates that `clippy_utils` needs
- teach `run-argument-comment-lint-bazel.sh` to enumerate concrete
Windows Rust rules, normalize the resulting labels, and skip explicitly
requested incompatible targets instead of failing before the lint run
starts
- patch `rules_rust` build-script env propagation so exec-side
`windows-msvc` helper crates drop forwarded MinGW include and linker
search paths as whole flag/path pairs instead of emitting malformed
`CFLAGS`, `CXXFLAGS`, and `LDFLAGS`
- export the Windows VS/MSVC SDK environment in `setup-bazel-ci` and
pass the relevant variables through `run-bazel-ci.sh` via `--action_env`
/ `--host_action_env` so Bazel build scripts can see the MSVC and UCRT
headers on native Windows runs
- add inline comments to the Windows `setup-bazel-ci` MSVC environment
export step so it is easier to audit how `vswhere`, `VsDevCmd.bat`, and
the filtered `GITHUB_ENV` export fit together
- patch `aws-lc-sys` to skip its standalone `memcmp` probe under Bazel
`windows-msvc` build-script environments, which avoids a Windows-native
toolchain mismatch that blocked the lint lane before it reached the
aspect execution
- patch `aws-lc-sys` to prefer its bundled `prebuilt-nasm` objects for
Bazel `windows-msvc` build-script runs, which avoids missing
`generated-src/win-x86_64/*.asm` runfiles in the exec-side helper
toolchain
- annotate the Linux test-only callsites in `codex-rs/linux-sandbox` and
`codex-rs/core` that the wider native lint coverage surfaced

## Patches

This PR introduces a large patch stack because the Windows Bazel lint
lane currently depends on behavior that upstream dependencies do not
provide out of the box in the mixed `windows-gnullvm` target /
`windows-msvc` exec-toolchain setup.

- Most of the `rules_rust` patches look like upstream candidates rather
than OpenAI-only policy. Preserving explicit exec-platform constraints,
forwarding the right MSVC/UCRT environment into exec-side build scripts,
exposing exec-side `rustc-dev` artifacts, and keeping the Windows
bootstrap/linker behavior coherent all look like fixes to the Bazel/Rust
integration layer itself.
- The two `aws-lc-sys` patches are more tactical. They special-case
Bazel `windows-msvc` build-script environments to avoid a `memcmp` probe
mismatch and missing NASM runfiles. Those may be harder to upstream
as-is because they rely on Bazel-specific detection instead of a general
Cargo/build-script contract.
- Short term, carrying these patches in-tree is reasonable because they
unblock a real CI lane and are still narrow enough to audit. Long term,
the goal should not be to keep growing a permanent local fork of either
dependency.
- My current expectation is that the `rules_rust` patches are less
controversial and should be broken out into focused upstream proposals,
while the `aws-lc-sys` patches are more likely to be temporary escape
hatches unless that crate wants a more general hook for hermetic build
systems.

Suggested follow-up plan:

1. Split the `rules_rust` deltas into upstream-sized PRs or issues with
minimized repros.
2. Revisit the `aws-lc-sys` patches during the next dependency bump and
see whether they can be replaced by an upstream fix, a crate upgrade, or
a cleaner opt-in mechanism.
3. Treat each dependency update as a chance to delete patches one by one
so the local patch set only contains still-needed deltas.

## Verification

- `./.github/scripts/run-argument-comment-lint-bazel.sh
--config=argument-comment-lint --keep_going`
- `RUNNER_OS=Windows
./.github/scripts/run-argument-comment-lint-bazel.sh --nobuild
--config=argument-comment-lint --platforms=//:local_windows
--keep_going`
- `cargo test -p codex-linux-sandbox`
- `cargo test -p codex-core shell_snapshot_tests`
- `just argument-comment-lint`

## References

- #16106
## Summary

`ExternalAuthRefresher` was still shaped around external ChatGPT auth:
`ExternalAuthTokens` always implied ChatGPT account metadata even when a
caller only needed a bearer token.

This PR generalizes that contract so bearer-only sources are
first-class, while keeping the existing ChatGPT paths strict anywhere we
persist or rebuild ChatGPT auth state.

## Motivation

This is the first step toward #15189.

The follow-on provider-auth work needs one shared external-auth contract
that can do both of these things:

- resolve the current bearer token before a request is sent
- return a refreshed bearer token after a `401`

That should not require a second token result type just because there is
no ChatGPT account metadata attached.

## What Changed

- change `ExternalAuthTokens` to carry `access_token` plus optional
`ExternalAuthChatgptMetadata`
- add helper constructors for bearer-only tokens and ChatGPT-backed
tokens
- add `ExternalAuthRefresher::resolve()` with a default no-op
implementation so refreshers can optionally provide the current token
before a request is sent
- keep ChatGPT-only persistence strict by continuing to require ChatGPT
metadata anywhere the login layer seeds or reloads ChatGPT auth state
- update the app-server bridge to construct the new token shape for
external ChatGPT auth refreshes

## Testing

- `cargo test -p codex-login`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16286).
* #16288
* #16287
* __->__ #16286
## Summary

`AuthManager` and `UnauthorizedRecovery` already own token resolution
and staged `401` recovery. The missing piece for provider auth was a
bearer-only mode that still fit that design, instead of pushing a second
auth abstraction into `codex-core`.

This PR keeps the design centered on `AuthManager`: it teaches
`codex-login` how to own external bearer auth directly so later provider
work can keep calling `AuthManager.auth()` and `UnauthorizedRecovery`.

## Motivation

This is the middle layer for #15189.

The intended design is still:

- `AuthManager` encapsulates token storage and refresh
- `UnauthorizedRecovery` powers staged `401` recovery
- all request tokens go through `AuthManager.auth()`

This PR makes that possible for provider-backed bearer tokens by adding
a bearer-only auth mode inside `AuthManager` instead of building
parallel request-auth plumbing in `core`.

## What Changed

- move `ModelProviderAuthInfo` into `codex-protocol` so `core` and
`login` share one config shape
- add `login/src/auth/external_bearer.rs`, which runs the configured
command, caches the bearer token in memory, and refreshes it after `401`
- add `AuthManager::external_bearer_only(...)` for provider-scoped
request paths that should use command-backed bearer auth without
mutating the shared OpenAI auth manager
- add `AuthManager::shared_with_external_chatgpt_auth_refresher(...)`
and rename the other `AuthManager` helpers that only apply to external
ChatGPT auth so the ChatGPT-only path is explicit at the call site
- keep external ChatGPT refresh behavior unchanged while ensuring
bearer-only external auth never persists to `auth.json`

## Testing

- `cargo test -p codex-login`
- `cargo test -p codex-protocol`





---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16287).
* #16288
* __->__ #16287
## Summary

Fixes #15189.

Custom model providers that set `requires_openai_auth = false` could
only use static credentials via `env_key` or
`experimental_bearer_token`. That is not enough for providers that mint
short-lived bearer tokens, because Codex had no way to run a command to
obtain a bearer token, cache it briefly in memory, and retry with a
refreshed token after a `401`.

This PR adds that provider config and wires it through the existing auth
design: request paths still go through `AuthManager.auth()` and
`UnauthorizedRecovery`, with `core` only choosing when to use a
provider-backed bearer-only `AuthManager`.

## Scope

To keep this PR reviewable, `/models` only uses provider auth for the
initial request in this change. It does **not** add a dedicated `401`
retry path for `/models`; that can be follow-up work if we still need it
after landing the main provider-token support.

## Example Usage

```toml
model_provider = "corp-openai"

[model_providers.corp-openai]
name = "Corp OpenAI"
base_url = "https://gateway.example.com/openai"
requires_openai_auth = false

[model_providers.corp-openai.auth]
command = "gcloud"
args = ["auth", "print-access-token"]
timeout_ms = 5000
refresh_interval_ms = 300000
```

The command contract is intentionally small:

- write the bearer token to `stdout`
- exit `0`
- any leading or trailing whitespace is trimmed before the token is used

## What Changed

- add `model_providers.<id>.auth` to the config model and generated
schema
- validate that command-backed provider auth is mutually exclusive with
`env_key`, `experimental_bearer_token`, and `requires_openai_auth`
- build a bearer-only `AuthManager` for `ModelClient` and
`ModelsManager` when a provider configures `auth`
- let normal Responses requests and realtime websocket connects use the
provider-backed bearer source through the same `AuthManager.auth()` path
- allow `/models` online refresh for command-auth providers and attach
the provider token to the initial `/models` request
- keep `auth.cwd` available as an advanced escape hatch and include it
in the generated config schema

## Testing

- `cargo test -p codex-core provider_auth_command`
- `cargo test -p codex-core
refresh_available_models_uses_provider_auth_token`
- `cargo test -p codex-core
test_deserialize_provider_auth_config_defaults`

## Docs

- `developers.openai.com/codex` should document the new
`[model_providers.<id>.auth]` block and the token-command contract
Fix the death of the end of turn watcher
Adds this:
```
properties.insert(
            "fork_turns".to_string(),
            JsonSchema::String {
                description: Some(
                    "Optional MultiAgentV2 fork mode. Use `none`, `all`, or a positive integer string such as `3` to fork only the most recent turns."
                        .to_string(),
                ),
            },
        );
        ```

---------

Co-authored-by: Codex <noreply@openai.com>
I noticed that
https://github.com/openai/codex/actions/workflows/rust-ci-full.yml
started failing on my own PR,
#16288, even though CI was green
when I merged it.

Apparently, it introduced a lint violation that was [correctly!] caught
by our Cargo-based clippy runner, but not our Bazel-based one.

My next step is to figure out the reason for the delta between the two
setups, but I wanted to get us green again quickly, first.
The TUI’s `/feedback` flow was still uploading directly through the
local feedback crate, which bypassed app-server behavior such as
auth-derived feedback tags like chatgpt_user_id and made TUI feedback
handling diverge from other clients. It also meant that remove TUI
sessions failed to upload the correct feedback logs and session details.

Testing: Manually tested `/feedback` flow and confirmed that it didn't
regress.
Run a DB clean-up more frequently with an incremental `VACCUM` in it
- add event for thread initialization
- thread/start, thread/fork, thread/resume
- feature flagged behind `FeatureFlag::GeneralAnalytics`
- does not yet support threads started by subagents

PR stack:
- --> [[telemetry] thread events
#15690](#15690)
- [[telemetry] subagent events
#15915](#15915)
- [[telemetry] turn events
#15591](#15591)
- [[telemetry] steer events
#15697](#15697)
- [[telemetry] queued prompt data
#15804](#15804)


Sample extracted logs in Codex-backend
```
INFO     | 2026-03-29 16:39:37 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3bf7-9f5f-7f82-9877-6d48d1052531 product_surface=codex product_client_id=CODEX_CLI client_name=codex-tui client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=new subagent_source=None parent_thread_id=None created_at=1774827577 | 
INFO     | 2026-03-29 16:45:46 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3b84-5731-79d0-9b3b-9c6efe5f5066 product_surface=codex product_client_id=CODEX_CLI client_name=codex-tui client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=resumed subagent_source=None parent_thread_id=None created_at=1774820022 | 
INFO     | 2026-03-29 16:45:49 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3bfd-4cd6-7c12-a13e-48cef02e8c4d product_surface=codex product_client_id=CODEX_CLI client_name=codex-tui client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=forked subagent_source=None parent_thread_id=None created_at=1774827949 | 
INFO     | 2026-03-29 17:20:29 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3c1d-0412-7ed2-ad24-c9c0881a36b0 product_surface=codex product_client_id=CODEX_SERVICE_EXEC client_name=codex_exec client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=new subagent_source=None parent_thread_id=None created_at=1774830027 | 
```

Notes
- `product_client_id` gets canonicalized in codex-backend
- subagent threads are addressed in a following pr
## Summary
- prioritize newly surfaced review comments ahead of CI and mergeability
handling in the PR babysitter watcher
- keep `--watch` running for open PRs even when they are currently
merge-ready so later review feedback is not missed
## Summary
- Replace the separate external auth enum and refresher trait with a
single `ExternalAuth` trait in login auth flow
- Move bearer token auth behind `BearerTokenRefresher` and update
`AuthManager` and app-server wiring to use the generic external auth API
## Why

#16287 introduced a change to
`codex-rs/login/src/auth/auth_tests.rs` that uses a PowerShell helper to
read the next token from `tokens.txt` and rewrite the remainder back to
disk. On Windows, `Get-Content` can return a scalar when the file has
only one remaining line, so `$lines[0]` reads the first character
instead of the full token. That breaks the external bearer refresh test
once the token list is nearly exhausted.

#16288 introduced similar changes to
`codex-rs/core/src/models_manager/manager_tests.rs` and
`codex-rs/core/tests/suite/client.rs`.

These went unnoticed because the failures showed up when the test was
run via Cargo on Windows, but not in our Bazel harness. Figuring out
that Cargo-vs-Bazel delta will happen in a follow-up PR.

## Verification

On my Windows machine, I verified `cargo test` passes when run in
`codex-rs/login` and `codex-rs/core`. Once this PR is merged, I will
keep an eye on
https://github.com/openai/codex/actions/workflows/rust-ci-full.yml to
verify it goes green.

## What changed

- Wrap `Get-Content -Path tokens.txt` in `@(...)` so the script always
gets array semantics before counting, indexing, and rewriting the
remaining lines.
## Why

Bazel clippy now catches lints that `cargo clippy` can still miss when a
crate under `codex-rs` forgets to opt into workspace lints. The concrete
example here was `codex-rs/app-server/tests/common/Cargo.toml`: Bazel
flagged a clippy violation in `models_cache.rs`, but Cargo did not
because that crate inherited workspace package metadata without
declaring `[lints] workspace = true`.

We already mirror the workspace clippy deny list into Bazel after
[#15955](#15955), so we also need a
repo-side check that keeps every `codex-rs` manifest opted into the same
workspace settings.

## What changed

- add `.github/scripts/verify_cargo_workspace_manifests.py`, which
parses every `codex-rs/**/Cargo.toml` with `tomllib` and verifies:
  - `version.workspace = true`
  - `edition.workspace = true`
  - `license.workspace = true`
  - `[lints] workspace = true`
- top-level crate names follow the `codex-*` / `codex-utils-*`
conventions, with explicit exceptions for `windows-sandbox-rs` and
`utils/path-utils`
- run that script in `.github/workflows/ci.yml`
- update the current outlier manifests so the check is enforceable
immediately
- fix the newly exposed clippy violations in the affected crates
(`app-server/tests/common`, `file-search`, `feedback`,
`shell-escalation`, and `debug-client`)






---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16353).
* #16351
* __->__ #16353
Problem: `chatwidget/tests.rs` had grown into a single oversized test
blob that was hard to maintain and exceeded the repo's blob size limit.

Solution: split the chatwidget tests into topical modules with a thin
root `tests.rs`, shared helper utilities, preserved snapshot naming, and
hermetic test config so the refactor stays stable and passes the
`codex-tui` test suite.
Fix stale weekly limit in `/status` (#16194): /status reused the
session’s cached rate-limit snapshot, so the weekly remaining limit
could stay frozen within an active session.

With this change, we now dynamically update the rate limits after status
is displayed.

I needed to delete a few low-value test cases from the chatWidget tests
because the test.rs file is really large, and the new tests in this PR
pushed us over the 512K mandated limit. I'm working on a separate PR to
refactor that test file.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

⤵️ pull merge-conflict Sync PR has merge conflicts

Projects

None yet

Development

Successfully merging this pull request may close these issues.