Fixed a hole in token refresh logic for app server#11802
Fixed a hole in token refresh logic for app server#11802etraut-openai wants to merge 3 commits intomainfrom
Conversation
A previous change in token refresh for the agent loop introduced a UnauthorizedRecovery object. It implements a state machine that first performs a load of the on-disk auth information guarded by a check for matching account ID. If it finds that the on-disk version has been updated by another instance of codex, it uses the reloaded auth tokens. If it hasn't been updated, then it issues a refresh request from the authority that issued the original token. The problem is that we weren't doing the same thing for the code path used by the app server interface. This PR replicates this logic for that code path. I also took this opportunity to clean up the names of existing functions to make their roles clearer. * `try_refresh_token` is renamed `request_chatgpt_token_refresh` * the older `refresh_token` is renamed `refresh_token_from_authority` * `refresh_tokens` is renamed `refresh_and_persist_chatgpt_token`, and it now implicitly reloads * `update_tokens` is renamed `persist_tokens`
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ee2e85d379
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
codex/codex-rs/core/src/auth.rs
Lines 1054 to 1059 in 415c100
TokenData.account_id is optional, but this branch maps expected_account_id == None to Skipped. Both refresh_token() and UnauthorizedRecovery::next() convert Skipped into a permanent failure, so valid refresh tokens cannot be used for auth records lacking account_id (common in existing auth.json shapes), breaking 401 recovery.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
We've continued to receive reports from users that they're seeing the error message "Your access token could not be refreshed because your refresh token was already used. Please log out and sign in again." This PR fixes two holes in the token refresh logic that lead to this condition.
Background: A previous change in token refresh introduced the
UnauthorizedRecoveryobject. It implements a state machine in the core agent loop that first performs a load of the on-disk auth information guarded by a check for matching account ID. If it finds that the on-disk version has been updated by another instance of codex, it uses the reloaded auth tokens. If the on-disk version hasn't been updated, it issues a refresh request from the token authority.There are two problems that this PR addresses:
Problem 1: We weren't doing the same thing for the code path used by the app server interface. This PR effectively replicates the
UnauthorizedRecoverylogic for that code path.Problem 2: The
UnauthorizedRecoverylogic contained a hole in theReloadOutcome::Skippedcase. Here's the scenario. A user starts two instances of the CLI. Instance 1 is active (working on a task), instance 2 is idle. Both instances have the same in-memory cached tokens. The user then runscodex logoutorcodex loginto log in to a separate account, which overwrites theauth.jsonfile. Instance 1 receives a 401 and refreshes its token, but it doesn't write the new token to theauth.jsonfile because the account ID doesn't match. Instance 2 is later activated and presented with a new task. It immediately hits a 401 and attempts to refresh its token but fails because its cached refresh token is now invalid. To avoid this situation, I've changed the logic to immediately fail a token refresh if the user has since logged out or logged in to another account. This will still be seen as an error by the user, but the cause will be clearer.I also took this opportunity to clean up the names of existing functions to make their roles clearer.
try_refresh_tokenis renamedrequest_chatgpt_token_refreshrefresh_tokenis renamedrefresh_token_from_authority(there's a new higher-level function namedrefresh_tokennow)refresh_tokensis renamedrefresh_and_persist_chatgpt_token, and it now implicitly reloadsupdate_tokensis renamedpersist_tokens