Only show installable projects in 'databricks labs list' #5560

Open
janniklasrose wants to merge 2 commits into main from janniklasrose/labs-list-installable

Conversation

@janniklasrose

Contributor

Changes

databricks labs list showed every non-archived, non-fork repository in the databrickslabs GitHub org — currently 39 — but only repositories that ship a labs.yml manifest can actually be installed (currently 8: blueprint, dlt-meta, dqx, lakebridge, lsql, pylint-plugin, sandbox, ucx). Picking anything else from the list failed databricks labs install with Error: remote: read labs.yml from GitHub: not found (error message improved separately in #5559), so the listing mostly advertised projects that cannot be installed.

Filter the listing to repositories that have a root labs.yml on their default branch:

  • The manifest existence check goes through raw.githubusercontent.com, which is not subject to the low unauthenticated GitHub API rate limit (checking the release tag instead would cost one API call per repository).
  • Checks run concurrently (errgroup, limit 10) and the filtered result is cached for 24 hours alongside the existing repository-list cache, so repeated invocations make no HTTP requests.

Known approximation: a repository with labs.yml on its default branch but no matching release tag still lists but fails install — with the clearer error from #5559.

Output before (39 entries, abridged) / after (8 entries):

Name           Description
blueprint      Baseline for Databricks Labs projects written in P...
dlt-meta       Metadata driven Spark Declarative Pipelines framew...
dqx            Databricks framework to validate Data Quality of p...
lakebridge     Accelerates migrations to Databricks by automating...
lsql           Lightweight SQL execution wrapper only on top of D...
pylint-plugin  Databricks Plugin for PyLint
sandbox        Experimental labs projects
ucx            Automated migrations to Unity Catalog

Tests

Extended cmd/labs/list_test.go: the existing test now renders from the new installable-repositories cache fixture, and a new test exercises the filter end-to-end against a stub GitHub server (repo with labs.yml is listed, repo returning 404 is not). Verified live; output above.

This pull request and its description were written by Isaac, an AI coding agent.

'databricks labs list' showed every non-archived, non-fork repository in the databrickslabs GitHub org (currently 39), but only repositories that ship a labs.yml manifest at the root of their release tag can actually be installed (currently 8). Everything else failed 'databricks labs install' with a not-found error. Filter the listing to repositories that have a root labs.yml on their default branch, checked concurrently via raw.githubusercontent.com (not subject to the low unauthenticated GitHub API rate limit) and cached for 24 hours like the repository list itself.

Co-authored-by: Isaac
@github-actions

github-actions Bot commented Jun 11, 2026

Contributor

Approval status: pending

/cmd/labs/ - needs approval

Files: cmd/labs/list.go, cmd/labs/list_test.go, cmd/labs/project/testdata/installed-in-home/.databricks/labs/databrickslabs-installable-repositories.json
Suggested: @asnare
Also eligible: @alexott

General files (require maintainer)

Files: NEXT_CHANGELOG.md
Based on git history:

  • @pietern -- recent work in cmd/labs/, ./

Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @shreyas-goenka, @simonfaltum, @renaudhartert-db) can approve all areas.
See OWNERS for ownership rules.

Co-authored-by: Isaac
@eng-dev-ecosystem-bot

Collaborator

Commit: 8ed1722

Run: 27369574934

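| Env | 🟨 KNOWN | 🔄 flaky | 💚 RECOVERED | 🙈 SKIP | ✅ pass | 🙈 skip | Time |
|---|---|---|---|---|---|---|---|
| 🟨 aws linux | 7 | | | 15 | 264 | 969 | 7:53 |
| 🟨 aws windows | 7 | | | 15 | 266 | 967 | 17:50 |
| 💚 aws-ucws linux | | | 7 | 15 | 360 | 883 | 8:20 |
| 💚 aws-ucws windows | | | 7 | 15 | 362 | 881 | 12:09 |
| 💚 azure linux | | | 1 | 17 | 267 | 967 | 7:25 |
| 💚 azure windows | | | 1 | 17 | 269 | 965 | 11:45 |
| 💚 azure-ucws linux | | | 1 | 17 | 365 | 879 | 7:38 |
| 🔄 azure-ucws windows | | 1 | 1 | 17 | 366 | 877 | 13:04 |
| 🔄 gcp linux | | 2 | 1 | 17 | 261 | 970 | 9:57 |
| 💚 gcp windows | | | 1 | 17 | 265 | 968 | 11:39 |
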
25 interesting tests: 15 SKIP, 7 KNOWN, 3 flaky

| Test Name | aws linux | aws windows | aws-ucws linux | aws-ucws windows | azure linux | azure windows | azure-ucws linux | azure-ucws windows | gcp linux | gcp windows |
|---|---|---|---|---|---|---|---|---|---|---|
| 🟨 TestAccept | 🟨K | 🟨K | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R | 💚R |
| 🙈 TestAccept/bundle/invariant/no_drift | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/permissions | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions | 🟨K | 🟨K | 💚R | 💚R | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct | 🟨K | 🟨K | 💚R | 💚R | | | | | | |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform | 🟨K | 🟨K | 💚R | 💚R | | | | | | |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions | 🟨K | 🟨K | 💚R | 💚R | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct | 🟨K | 🟨K | 💚R | 💚R | | | | | | |
| 🟨 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform | 🟨K | 🟨K | 💚R | 💚R | | | | | | |
| 🙈 TestAccept/bundle/resources/postgres_branches/basic | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_branches/recreate | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_branches/replace_existing | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_branches/update_protected | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_branches/without_branch_id | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_endpoints/basic | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_endpoints/recreate | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/postgres_projects/update_display_name | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/synced_database_tables/basic | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/vector_search_indexes/basic | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/bundle/resources/vector_search_indexes/grants/select | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🙈 TestAccept/ssh/connection | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S | 🙈S |
| 🔄 TestFetchRepositoryInfoAPI_FromRepo | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | 🔄f | ✅p | ✅p |
| 🔄 TestFetchRepositoryInfoAPI_FromRepo/root | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | 🔄f | ✅p |
| 🔄 TestFetchRepositoryInfoAPI_FromRepo/subdir | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | ✅p | 🔄f | ✅p |

Top 28 slowest tests (at least 2 minutes):
duration env testname
6:17 azure windows TestAccept
6:04 azure-ucws windows TestAccept
6:03 aws-ucws windows TestAccept
6:01 gcp windows TestAccept
5:05 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:14 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:10 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:07 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:44 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:28 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:21 azure-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:21 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:15 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:11 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:01 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:58 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:56 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:55 azure linux TestAccept
2:53 gcp linux TestAccept
2:48 azure-ucws linux TestAccept
2:48 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:47 azure-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:46 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:45 aws-ucws linux TestAccept
2:44 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:43 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:38 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:34 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct

@simonfaltum simonfaltum left a comment

Member

Reviewed the full diff plus the supporting packages (localcache, cmd/labs/github, clear_cache.go), and ran an independent second-model pass over the same diff; both converged on the same two issues, so requesting changes for those (details inline):

  1. labs clear-cache does not know about the new cache file.
  2. An offline cold start writes an empty installable cache that then sticks for 24h, and (1) means clear-cache cannot fix it.

Both fixes are small. Two smaller notes inline (changelog wording given #5559 is still open, and a test nit).

Checked and found sound: the errgroup filter (writes to distinct slice elements, first-error semantics, limit 10, ctx propagation), preserved ordering and archived/fork semantics, graceful offline behavior when caches exist, the raw.githubusercontent choice and its failure mode (failing loudly beats caching a partial list for 24h), no stale-cache hazard on default_branch (it has been in ghRepo since #914, so old on-disk caches have it), and the test design (the blueprint fixture proof in TestListingWorks is a nice touch). Unit tests for cmd/labs, cmd/labs/github, and cmd/labs/localcache pass locally, including a -race run of the new test.

This review was written by Isaac, an AI coding agent, with an independent second pass by another model.

Comment thread cmd/labs/list.go
	if err != nil {
		return nil, err
	}
	cache := localcache.NewLocalCache[github.Repositories](cacheDir, labsOrg+"-installable-repositories", installableCacheTTL)

Member

labs clear-cache (cmd/labs/clear_cache.go:22) only removes databrickslabs-repositories.json and the per-project caches, so this new cache file survives the command whose help says it clears "everywhere relevant". The moment a user reaches for clear-cache under this feature is when a project just added a labs.yml and they want it to show up, and this is exactly the file that needs purging.

clear_cache.go is in the same package: suggest extracting the cache name (labsOrg + "-installable-repositories") into a const next to installableCacheTTL, reusing it in both places, and adding a second os.Remove. There is no clear-cache test today; worth adding one that asserts both org-level cache files are gone.

Comment thread cmd/labs/list.go
Comment on lines +46 to +52
	return cache.Load(ctx, func() (github.Repositories, error) {
		repos, err := allRepos(ctx)
		if err != nil {
			return nil, err
		}
		return filterInstallable(ctx, repos)
	})

Member

An offline cold start caches an empty list for 24 hours:

  1. With no repos cache on disk and the network down, the inner repos cache hits refreshCache, the fetch fails with a *url.Error, and the offline branch (localcache/jsonfile.go:59-62) returns the zero value with no error and no cache write. For github.Repositories that zero value is nil.
  2. filterInstallable(ctx, nil) iterates zero repos and returns (nil, nil), a success.
  3. This outer cache sees a successful refresh and writes the empty list with a fresh timestamp (jsonfile.go:66).

Result: labs list renders an empty table for the next 24h even after connectivity returns. Pre-PR, the same scenario rendered an empty table once and cached nothing. The clear-cache gap flagged in the other comment compounds this, since the documented remedy will not purge the file.

Suggested fix: hoist allRepos out of the cache.Load closure and return early when it comes back empty, so nothing is written. That preserves the current offline UX (empty table, exit 0); the cost is consulting the repos cache freshness on every list run, which is a disk read while it is fresh. Returning an error from the closure when len(repos) == 0 also works but turns the offline cold start into a failure. Either way, a test with no caches on disk plus an unreachable server, asserting that no databrickslabs-installable-repositories.json is created, would lock the behavior in.
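To make the first option concrete, here is a minimal sketch of the hoisted shape. It uses a hypothetical in-memory `memCache` in place of the file-backed localcache, omits TTL handling and context plumbing, and the `installableRepos` signature is an assumption for illustration only.

```go
package main

import "fmt"

// repo is a hypothetical stand-in for the CLI's repository type.
type repo struct{ Name string }

// memCache is a hypothetical in-memory stand-in for the file-backed cache.
type memCache struct {
	stored []repo
	writes int
}

// Load returns the stored value if one was written, otherwise calls refresh
// and stores a successful result.
func (c *memCache) Load(refresh func() ([]repo, error)) ([]repo, error) {
	if c.writes > 0 {
		return c.stored, nil
	}
	v, err := refresh()
	if err != nil {
		return nil, err
	}
	c.stored, c.writes = v, c.writes+1
	return v, nil
}

// installableRepos fetches the repository list outside the cached closure and
// returns early when it is empty, so an offline cold start writes nothing.
func installableRepos(c *memCache, allRepos func() ([]repo, error),
	filter func([]repo) ([]repo, error)) ([]repo, error) {
	repos, err := allRepos()
	if err != nil {
		return nil, err
	}
	if len(repos) == 0 {
		return nil, nil // empty table, exit 0, and no cache write
	}
	return c.Load(func() ([]repo, error) { return filter(repos) })
}

func main() {
	c := &memCache{}
	// Mimics the offline branch: zero value, no error.
	offline := func() ([]repo, error) { return nil, nil }
	online := func() ([]repo, error) { return []repo{{Name: "ucx"}}, nil }
	keepAll := func(rs []repo) ([]repo, error) { return rs, nil }

	got, _ := installableRepos(c, offline, keepAll)
	fmt.Println("offline:", len(got), "repos,", c.writes, "cache writes")
	got, _ = installableRepos(c, online, keepAll)
	fmt.Println("online:", len(got), "repos,", c.writes, "cache writes")
}
```

The offline call leaves the cache untouched, so the next run with connectivity refreshes normally instead of serving an empty list.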

Comment thread NEXT_CHANGELOG.md

### CLI
* Show a once-per-day notice after a command when a newer CLI release is available, with a link to the release and the upgrade command for the detected install method. Suppressed for non-interactive/CI runs, JSON output, the Databricks Runtime, and development builds, and can be disabled with `DATABRICKS_CLI_DISABLE_UPDATE_CHECK` ([#5470](https://github.com/databricks/cli/pull/5470)).
* `databricks labs list` now only shows projects that can be installed (those shipping a `labs.yml` manifest), and `databricks labs install` explains when a project does not provide one instead of failing with a generic "not found" error ([#5559](https://github.com/databricks/cli/pull/5559), [#5560](https://github.com/databricks/cli/pull/5560)).

Member

This bullet also promises the improved labs install error, which is #5559, still open. If this PR merges and a release cuts before #5559 lands, the changelog overpromises. Suggest trimming the bullet to the labs list half and letting #5559 carry its own line when it merges (or coordinating the merge order).

Comment thread cmd/labs/list_test.go
		w.WriteHeader(http.StatusNotFound)
	default:
		t.Logf("Requested: %s", r.URL.Path)
		t.FailNow()

Member

nit: t.FailNow() from the handler goroutine is technically illegal (the testing docs require it on the goroutine running the test; here it Goexits the connection goroutine). It does mark the test failed in practice, and installer_test.go uses the same catch-all idiom, so fine to leave for consistency. If you touch it anyway, t.Errorf plus http.Error(w, ..., http.StatusInternalServerError) is the correct shape.

@asnare asnare left a comment

Collaborator

I agree that labs list should only list installable projects, and considered the problem back when I addressed the paging issue.

The problem with testing for labs.yml is the additional REST calls that are required:

  • Before this PR: 1 request (results cached for 24h)
  • After this PR: 1 + N requests, where N is currently 39.

Although this implementation avoids the 60/IP/hour quota on the REST API by hitting the CDN directly, I think I'd prefer a lighter-touch approach: filtering projects based on repository "topics". I've tagged a few with databricks-cli-installable for testing purposes. Filtering on this has a few benefits:

  • No additional HTTP requests necessary, repository topics are already included in the response we get.
  • Caching remains simple.
  • On the labs/maintainer side, things become opt-in. At 8/39 I think opt-in is preferable to opt-out.
  • On the labs/maintainer side, turning up on labs list becomes an admin operation.
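As a sketch of what topic-based filtering could look like: the `repo` struct, the `filterByTopic` helper, and the sample repository names below are hypothetical; only the topic string comes from this comment.

```go
package main

import (
	"fmt"
	"slices"
)

// repo is a hypothetical subset of the fields in the repository listing.
type repo struct {
	Name   string
	Topics []string
}

// Topic used to opt a repository in to `labs list`.
const installableTopic = "databricks-cli-installable"

// filterByTopic keeps the repos tagged with topic, preserving input order.
// It needs no HTTP requests beyond the listing that produced repos.
func filterByTopic(repos []repo, topic string) []repo {
	var out []repo
	for _, r := range repos {
		if slices.Contains(r.Topics, topic) {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	// Made-up sample data, not the actual org listing.
	repos := []repo{
		{Name: "tagged-project", Topics: []string{"python", installableTopic}},
		{Name: "untagged-project", Topics: []string{"python"}},
	}
	for _, r := range filterByTopic(repos, installableTopic) {
		fmt.Println(r.Name)
	}
}
```

Because the filter is a pure function over data already in the cached response, the existing single cache would be enough.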

Before reviewing the technical implementation I'd like to get consensus on this.

P.S. I also rejected using the GraphQL API as a solution to detect the presence of labs.yml: calls need to be authenticated, and the quota system would still make it costly.

4 participants