Only show installable projects in 'databricks labs list' #5560
janniklasrose wants to merge 2 commits into
'databricks labs list' showed every non-archived, non-fork repository in the databrickslabs GitHub org (currently 39), but only repositories that ship a labs.yml manifest at the root of their release tag can actually be installed (currently 8). Everything else failed 'databricks labs install' with a not-found error. Filter the listing to repositories that have a root labs.yml on their default branch, checked concurrently via raw.githubusercontent.com (not subject to the low unauthenticated GitHub API rate limit) and cached for 24 hours like the repository list itself. Co-authored-by: Isaac
Approval status: pending
Commit: 8ed1722
25 interesting tests: 15 SKIP, 7 KNOWN, 3 flaky
Top 28 slowest tests (at least 2 minutes):
simonfaltum
left a comment
Reviewed the full diff plus the supporting packages (localcache, cmd/labs/github, clear_cache.go), and ran an independent second-model pass over the same diff; both converged on the same two issues, so requesting changes for those (details inline):
1. labs clear-cache does not know about the new cache file.
2. An offline cold start writes an empty installable cache that then sticks for 24h, and (1) means clear-cache cannot fix it.
Both fixes are small. Two smaller notes inline (changelog wording given #5559 is still open, and a test nit).
Checked and found sound: the errgroup filter (writes to distinct slice elements, first-error semantics, limit 10, ctx propagation), preserved ordering and archived/fork semantics, graceful offline behavior when caches exist, the raw.githubusercontent choice and its failure mode (failing loudly beats caching a partial list for 24h), no stale-cache hazard on default_branch (it has been in ghRepo since #914, so old on-disk caches have it), and the test design (the blueprint fixture proof in TestListingWorks is a nice touch). Unit tests for cmd/labs, cmd/labs/github, and cmd/labs/localcache pass locally, including a -race run of the new test.
This review was written by Isaac, an AI coding agent, with an independent second pass by another model.
| if err != nil { | ||
| return nil, err | ||
| } | ||
| cache := localcache.NewLocalCache[github.Repositories](cacheDir, labsOrg+"-installable-repositories", installableCacheTTL) |
labs clear-cache (cmd/labs/clear_cache.go:22) only removes databrickslabs-repositories.json and the per-project caches, so this new cache file survives the command whose help says it clears "everywhere relevant". The moment a user reaches for clear-cache under this feature is when a project just added a labs.yml and they want it to show up, and this is exactly the file that needs purging.
clear_cache.go is in the same package: suggest extracting the cache name (labsOrg + "-installable-repositories") into a const next to installableCacheTTL, reusing it in both places, and adding a second os.Remove. There is no clear-cache test today; worth adding one that asserts both org-level cache files are gone.
| return cache.Load(ctx, func() (github.Repositories, error) { | ||
| repos, err := allRepos(ctx) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| return filterInstallable(ctx, repos) | ||
| }) |
An offline cold start caches an empty list for 24 hours:
- With no repos cache on disk and the network down, the inner repos cache hits refreshCache, the fetch fails with a *url.Error, and the offline branch (localcache/jsonfile.go:59-62) returns the zero value with no error and no cache write. For github.Repositories that zero value is nil.
- filterInstallable(ctx, nil) iterates zero repos and returns (nil, nil), a success.
- This outer cache sees a successful refresh and writes the empty list with a fresh timestamp (jsonfile.go:66).
Result: labs list renders an empty table for the next 24h even after connectivity returns. Pre-PR, the same scenario rendered an empty table once and cached nothing. The clear-cache gap flagged in the other comment compounds this, since the documented remedy will not purge the file.
Suggested fix: hoist allRepos out of the cache.Load closure and return early when it comes back empty, so nothing is written. That preserves the current offline UX (empty table, exit 0); the cost is consulting the repos cache freshness on every list run, which is a disk read while it is fresh. Returning an error from the closure when len(repos) == 0 also works but turns the offline cold start into a failure. Either way, a test with no caches on disk plus an unreachable server, asserting that no databrickslabs-installable-repositories.json is created, would lock the behavior in.
| ### CLI | ||
| * Show a once-per-day notice after a command when a newer CLI release is available, with a link to the release and the upgrade command for the detected install method. Suppressed for non-interactive/CI runs, JSON output, the Databricks Runtime, and development builds, and can be disabled with `DATABRICKS_CLI_DISABLE_UPDATE_CHECK` ([#5470](https://github.com/databricks/cli/pull/5470)). | ||
| * `databricks labs list` now only shows projects that can be installed (those shipping a `labs.yml` manifest), and `databricks labs install` explains when a project does not provide one instead of failing with a generic "not found" error ([#5559](https://github.com/databricks/cli/pull/5559), [#5560](https://github.com/databricks/cli/pull/5560)). |
This bullet also promises the improved labs install error, which is #5559, still open. If this PR merges and a release cuts before #5559 lands, the changelog overpromises. Suggest trimming the bullet to the labs list half and letting #5559 carry its own line when it merges (or coordinating the merge order).
| w.WriteHeader(http.StatusNotFound) | ||
| default: | ||
| t.Logf("Requested: %s", r.URL.Path) | ||
| t.FailNow() |
nit: t.FailNow() from the handler goroutine is technically illegal (the testing docs require it on the goroutine running the test; here it Goexits the connection goroutine). It does mark the test failed in practice, and installer_test.go uses the same catch-all idiom, so fine to leave for consistency. If you touch it anyway, t.Errorf plus http.Error(w, ..., http.StatusInternalServerError) is the correct shape.
I agree that labs list should only list installable projects, and considered the problem back when I addressed the paging issue.
The problem with testing for labs.yml is the additional REST calls that are required:
- Before this PR: 1 request (results cached for 24h)
- After this PR: 1 + N requests, where N is currently 39.
Although this implementation avoids the 60/IP/hour quota on the REST API by hitting the CDN directly, I'd prefer a lighter-touch approach: filtering projects based on repository "topics". I've tagged a few with databricks-cli-installable for testing purposes. Filtering on this has a few benefits:
- No additional HTTP requests necessary, repository topics are already included in the response we get.
- Caching remains simple.
- On the labs/maintainer side, things become opt-in. At 8/39 I think opt-in is preferable to opt-out.
- On the labs/maintainer side, turning up on labs list becomes an admin operation.
Before reviewing the technical implementation I'd like to get consensus on this.
P.S. I also rejected using the GraphQL API as a solution to detect the presence of labs.yml: calls need to be authenticated, and the quota system would still make it costly.
Changes
databricks labs list showed every non-archived, non-fork repository in the databrickslabs GitHub org (currently 39), but only repositories that ship a labs.yml manifest can actually be installed (currently 8: blueprint, dlt-meta, dqx, lakebridge, lsql, pylint-plugin, sandbox, ucx). Picking anything else from the list failed databricks labs install with Error: remote: read labs.yml from GitHub: not found (error message improved separately in #5559), so the listing mostly advertised projects that cannot be installed.
Filter the listing to repositories that have a root labs.yml on their default branch:
Known approximation: a repository with labs.yml on its default branch but no matching release tag still lists but fails install, with the clearer error from #5559.
Output before (39 entries, abridged) / after (8 entries):
Tests
Extended cmd/labs/list_test.go: the existing test now renders from the new installable-repositories cache fixture, and a new test exercises the filter end-to-end against a stub GitHub server (repo with labs.yml is listed, repo returning 404 is not). Verified live; output above.
This pull request and its description were written by Isaac, an AI coding agent.