Tell users to update the CLI when the Genie API drifts or disappears #5570
Merged
The experimental genie command calls an undocumented backend route that can move, be disabled, or change shape between Databricks releases. A removed endpoint now reports the situation and points at a CLI update instead of leaking a bare 'No API found' error, and the existing protocol-drift detection (unparsed events, streams that end without an answer) carries the same advice. Co-authored-by: Isaac
RESOURCE_DOES_NOT_EXIST also unwraps to apierr.ErrNotFound, and the request carries a user-supplied warehouseId: a pre-stream 404 about a missing warehouse must keep the backend's message instead of claiming the endpoint moved. A removed route maps to plain ErrNotFound (ENDPOINT_NOT_FOUND has no error-code mapping in the SDK), so excluding ErrResourceDoesNotExist keeps the drift advice for route-gone and code-less 404s only. Co-authored-by: Isaac
pietern
approved these changes
Jun 12, 2026
// plain ErrNotFound). A 404 RESOURCE_DOES_NOT_EXIST is excluded: it refers
// to something the request named (e.g. the warehouse) and must keep the
// backend's own message instead of blaming the endpoint.
if errors.Is(err, apierr.ErrNotFound) && !errors.Is(err, apierr.ErrResourceDoesNotExist) {
Contributor
Any other errors, like 500s when the shape of the body is wrong?
Member
Author
Good question. I probed the live endpoint to find out:
- Wrong-shaped body (missing `input`): returns 500 `INTERNAL_ERROR` with an empty message, which rendered as a literal blank `Error:`. Added handling in 0a2277f: 500s now carry the same hedged advice ("if this keeps happening, the request format may have changed..."), and the no-details case gets explicit wording since there is no server message to pass through.
- Bad `warehouseId` with a valid body: returns 200 and the failure arrives in-stream, so it is handled by the SSE error path and never hits the transport-level branches.
- 400s are left alone: the server message passes through verbatim and I could not produce one from shape drift.
Response-shape drift (items the CLI cannot parse, or a stream with no answer) is covered separately by the renderer checks, which also point at a CLI update now.
Probing the live endpoint with a wrong-shaped body returns 500 INTERNAL_ERROR with an empty message, which rendered as a blank 'Error: '. Wrap 500s with the same hedged update advice; the no-details case gets its own wording since there is no server message to pass through. A bad warehouseId with a valid body returns 200 and fails in-stream (handled by the SSE error path), so it cannot be confused with this case. Co-authored-by: Isaac
Collaborator
Integration test report
Commit: 0a2277f
22 interesting tests: 15 SKIP, 7 KNOWN
Top 34 slowest tests (at least 2 minutes):
Why
`databricks experimental genie ask` is built on a public but undocumented backend route (`/api/2.0/data-rooms/tools/onechat/responses`). The route can move, be disabled, or change its wire format between Databricks releases without notice. When that happens, the user's only fix is updating the CLI, but nothing told them that:
- A removed endpoint surfaced as a bare `No API found for 'POST /data-rooms/tools/onechat/responses'`.
- Protocol drift (unparsed events, or a stream that ends without an answer) only suggested `--raw`.

Changes
Before: a vanished endpoint or changed protocol produced errors with no recovery path. Now: every drift-shaped failure tells the user to update the CLI, via a single shared advice string ("update the Databricks CLI to the latest version (run 'databricks version --check')").
- `PostStream` detects 404 with `errors.Is(err, apierr.ErrNotFound)`. The route is fixed and carries no resource IDs, so a 404 can only mean the endpoint itself is gone or disabled. The error keeps the SDK error chain (`%w`) and appends the advice. The 404 shape (`ENDPOINT_NOT_FOUND`, "No API found for ...") was verified against a live workspace gateway.
- The no-answer error (`noAnswerError` helper) and the unparsed-events warning include the same advice, alongside the existing `--raw` suggestion.

No NEXT_CHANGELOG entry since the command is experimental.
Test plan
- `errors.Is(err, apierr.ErrNotFound)` still matches through the wrap.
- `ask-endpoint-gone` (404 from the stub server) and `ask-protocol-drift` (syntactically valid stream with renamed item types, text + JSON modes, exit code 1).
- `./task checks`, `./task lint-q`, `./task fmt-q` all clean.

This pull request and its description were written by Isaac.