fix(publisher): treat GitHub device-flow slow_down as retriable#1290
fix(publisher): treat GitHub device-flow slow_down as retriable#1290developer-ishan wants to merge 4 commits into
slow_down as retriable#1290Conversation
GitHub's device-flow token endpoint returns slow_down when the client polls too frequently. Per RFC 8628 §3.5 this is a non-terminal signal: the client must increase its polling interval by 5 seconds and keep polling. The current pollForToken implementation only handles authorization_pending as retriable and bails on every other error, turning a routine rate-limit response into an unrecoverable "login failed: error polling for token: token request failed: slow_down". Treat slow_down the same as authorization_pending, bumping the interval per the RFC. Fixes modelcontextprotocol#1289
|
Hi @developer-ishan, nice catch on this one, the fix is textually correct and strictly RFC 8628 §3.5 compliant. Two small suggestions before a maintainer reviews: 1. Add a regression test
Both are deterministic with a fake clock or by passing a tiny initial 2. Optional: cap the interval RFC 8628 §3.5 says only "the interval MUST be increased by 5 seconds", but it leaves the maximum to the implementation. If a misbehaving auth server returns Other than that, the patch is minimal, correct, and reads exactly like what the spec asks for. 👍 Thanks for the careful RFC reference in the issue, it made this PR very easy to review. |
|
@P4ST4S good suggestions, let me check and test both these cases. |
Address review feedback on the device-flow slow_down fix:
- Add regression tests for pollForToken (previously untested):
- slow_down then success: confirms polling continues to success
- slow_down / authorization_pending / slow_down then success: confirms
the RFC 8628 §3.5 interval increment compounds across mixed retryable
errors and is not reset between slow_down instances
- repeated slow_down: confirms the interval is capped
- non-retryable error: confirms polling stops immediately
- Cap the polling interval at 60s so a misbehaving auth server returning
slow_down repeatedly cannot grow it unboundedly (RFC 8628 §3.5 mandates
the +5s increase but leaves the maximum to the implementation).
To make pollForToken testable, the access-token URL, initial interval, and
sleep function are now provider fields wired to defaults in the constructor;
tests inject a mock server and capture the back-off sequence deterministically.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Thanks for the thorough review, @P4ST4S! Both points are addressed in 9c5e9fc. 1. Regression tests —
2. Interval cap — added a 60s ceiling so a misbehaving auth server returning
|
|
This is an excellent turnaround. A few specific things that stood out: Capturing the sleep sequence rather than just zeroing it out is the right call, the tests now verify the correctness of the back-off (5 → 5 → 10) rather than just that the loop terminates. That's the test that would catch a future regression where someone "simplifies" the logic by resetting the interval between
The 60s cap test pins both of the last two back-offs at the ceiling, which is tighter than just
Minor, non-blocking observation: RFC 8628 §3.5 lists five terminal error codes ( LGTM from an external reviewer's perspective. Thanks for the careful work and the validation against |
Summary
slow_downresponse per RFC 8628 §3.5 — treat it as retriable and bump the polling interval by 5 seconds, instead of bailing with a fatal error.Fixes #1289.
Why
pollForTokenincmd/publisher/auth/github-at.goonly treatsauthorization_pendingas retriable. Every other response from GitHub's token endpoint — including the non-terminalslow_downsignal — falls through to:So users hitting GitHub's polling rate-limit get:
…and have to re-run
mcp-publisher login githubfrom scratch, which is especially painful becauseslow_downis more likely when the user takes more than a few seconds to enter and authorize the device code.Per RFC 8628 §3.5,
slow_downis defined as:Change
Treat
slow_downthe same asauthorization_pendingbut increaseintervalby 5 seconds before sleeping, matching the RFC exactly.A possible follow-up (not in this PR, to keep the diff minimal): seed the initial
intervalfromDeviceCodeResponse.Interval(already parsed but currently unused) instead of hardcoding5. Happy to do that here or as a separate PR if preferred.Test plan
go build ./cmd/publisher/...— cleango test ./cmd/publisher/auth/...— passesgofmt -dandgo vet ./cmd/publisher/auth/...— cleanmake check(lint + integration tests)I could not exercise
pollForTokenfrom a unit test without refactoringGitHubAccessTokenURLto be injectable, which felt out of scope for a bug-fix PR. Glad to add that refactor + a table-driven test forauthorization_pending/slow_down/ fatal errors in a follow-up if maintainers want it.