py_binary venvs: analysis-time assembly + shared external_venv (v2.0.0)#944
py_binary venvs: analysis-time assembly + shared external_venv (v2.0.0)#944gregmagolan wants to merge 2 commits intomainfrom
external_venv (v2.0.0)#944Conversation
5de72ac to
cc215c4
Compare
|
51aff4f to
6a61c17
Compare
5cdfc60 to
e6249ef
Compare
e6249ef to
959ece0
Compare
…stead of during py_binary bootstrap phase
93b5cd2 to
d39e5fb
Compare
|
Put up bazel-contrib/tar.bzl#106 |
|
Put up bazel-contrib/tar.bzl#107 |
ab86fe2 to
efceb7c
Compare
external_venv support
efceb7c to
50a2859
Compare
external_venv supportexternal_venv (v2.0.0)
5a6bc37 to
6adb52b
Compare
b943ae3 to
7bbca29
Compare
|
|
||
| def _unpack_toolchain_path_impl(ctx): | ||
| unpack_bin = ctx.toolchains[UNPACK_TOOLCHAIN].bin.bin | ||
| unpack = ctx.toolchains[UNPACK_TOOLCHAIN].bin.bin |
There was a problem hiding this comment.
intentional name change
022055c to
5993a4d
Compare
| if ctx.attr.main: | ||
| if not ctx.attr.main.label.name.endswith(".py"): | ||
| fail("main must end in '.py'") | ||
| # Check the RESOLVED file name, not the label name — the label may |
There was a problem hiding this comment.
Is this (+ a test) something that can be merged now? Ask claude to cherry-pick this change + a test...
3658c37 to
7a803f5
Compare
7a803f5 to
fd20423
Compare
| # Launcher for py_venv targets — `bazel run :name` activates the venv | ||
| # and exec's its bin/python interactively. | ||
| # | ||
| # (py_venv_binary / py_venv_test don't use this template — they expand |
There was a problem hiding this comment.
This sentence seems irrelevant here.
| fi | ||
|
|
||
| {{BASH_RLOCATION_FN}} | ||
|
|
There was a problem hiding this comment.
revert so you can still read the code that gets generated 🙏
| ] | ||
|
|
||
| def _check_venv_coverage(ctx, imports_depset, wheels_depset, vinfo): | ||
| """Analysis-time check: everything the binary needs must live in the venv. |
There was a problem hiding this comment.
Did you think about making this a validation action? Or making it optional?
IDK if it's worth changing it from validation-in-starlark to validation-action, but it would allow avoiding the .to_list() calls I think?
|
|
||
| default_info_files = [executable_launcher, main] | ||
| if not external_venv: | ||
| default_info_files = default_info_files + extra_runfiles |
There was a problem hiding this comment.
extra_runfiles got added to DefaultInfo in addition to runfiles? Normally I think we'd want to (a) keep them separate (b) not duplicate them?
| # `load("@aspect_rules_py//py:defs.bzl", "py_venv_binary")` resolves | ||
| # and the call-site failure surfaces the friendly message (instead of | ||
| # Bazel's generic "symbol not exported" error). | ||
| py_venv_binary = _py_venv_binary |
There was a problem hiding this comment.
I think the deprecated ones should remain where they were. There's no reason to tell people to migrate from py/unsafe/defs.bzl to py/defs.bzl just to be told it's unsupported


What
Overhauls how
py_binary/py_testacquire virtualenvs and consolidates the API surface. Ships as v2.0.0.Legacy
py_binarystaged its site-packages tree at launcher startup: every invocation ran a Rustvenv_bintool that walked the runfiles, created symlinks under a temp venv dir, and then exec'd Python against it. On small graphs that's a few hundred ms of overhead; on large monorepos with thousands of wheels we measured 10s of seconds perbazel run/bazel test— on every invocation, not amortised by caching. Legacypy_venv_binaryavoided that cost by building a real on-disk venv as a Bazeldeclare_directorytree artifact, but paid for it with tree-artifact remote-execution / remote-cache penalties and a different launcher shape that didn't integrate withpy_image_layeror the rest of thepy_binarysurface.This PR unifies both onto a single model: the venv is a tree of individual Bazel outputs (
ctx.actions.symlink+ctx.actions.writeoutputs, not a tree artifact) produced at analysis time. The launcher is a no-op by comparison — Bazel has already placed the venv files in runfiles — so startup is milliseconds, not seconds, regardless of graph size. On top of that,py_binary(external_venv = :shared_venv)lets many binaries share a singlepy_venvtarget, andpy_binary(expose_venv = True)auto-emits a first-class sibling:<name>.venvpy_venv per target so the shared-venv pattern is available per-callsite without a separate declaration.Why
Two user-visible wins:
Faster startup. The launcher no longer runs a Rust binary to materialise the venv before exec'ing Python. Improvement scales with wheel count — biggest impact on large monorepos, where launcher cost was dominant.
Shared virtualenvs. The new
external_venvattribute lets manypy_binary/py_testtargets share a singlepy_venv. Point your IDE at one venv and every entrypoint in the repo resolves to the same interpreter + dep closure. An analysis-time coverage check rejects binaries whose dep closure isn't a subset of the venv's, with a clear error naming the missing wheels.A few smaller ergonomic improvements fell out:
env = {…}andenv_inherit = […]declared on apy_venvflow through topy_binary(external_venv = …)consumers; binary-levelenvwins on key conflicts.isolated = Falseonpy_binaryopts out of Python's-Iflag for code that needsPYTHONPATHor script-dir-on-sys.path semantics (matches the legacypy_venv_binaryshape).py_image_layerproduces valid container tars in all configurations we test (container test coverage turned out to have been latently-broken before this work).@aspect_rules_py//py:defs.bzl,//py:extensions.bzl,//uv:defs.bzl,//uv:extensions.bzl. The/unstable/paths graduated.Shape of the change
py_binary/py_testgainexpose_venv. Whenexpose_venv = Trueis passed, the macro splits into a first-class sibling:<name>.venvpy_venvtarget carrying all venv-shaping attrs (deps,imports,package_collisions,include_*_site_packages,interpreter_options) plus the binary/test rule withexternal_venv = ":<name>.venv". The.venvsibling is consumable (other targets canexternal_venv = "//:<name>.venv") and runnable (bazel run :<name>.venvdrops into the hermetic interpreter).expose_venvdefaults toFalseso defaultpy_binarycallers emit exactly one target — no graph bloat.Auto-emitted
<name>.venvsibling removed. In v1.x thepy_binarymacro unconditionally emitted a sibling:<name>.venvlink target. That doubled the node count inbazel query :*for every caller and forced the internal venv to awkwardly avoid the<name>.venvname. v2.0.0 drops the auto-emit; users who want the IDE-materialise-to-workspace behaviour declarepy_venv_link(name = ..., venv = ":<name>.venv")explicitly (and must also passexpose_venv = Trueon the binary to get a sibling venv to point at).py_venv_linkis now an explicit opt-in rule. Its signature changed: takesvenv = <label>pointing at an existingpy_venv(typically the one fromexpose_venv = True), produces a runnable target whosebazel runmaterialises a workspace-local symlink to the venv tree.py_venv_binary/py_venv_testwere removed. Calling them nowfail()s at analysis with a clear migration message. Replacement ispy_binary/py_testwithexpose_venv = True, isolated = False. The legacy deprecation-warning path (print-and-delegate) is gone; the failure is the one-and-only migration signal./unstable/load paths were removed. Everything previously at//py/unstable:{defs,extension}.bzland//uv/unstable:{defs,extension}.bzlmoved to the stable//py:defs.bzl,//py:extensions.bzl,//uv:defs.bzl,//uv:extensions.bzlpaths. Loading the old pathsfail()s with a message naming the replacement.Dropped Rust tooling. The
venv_binandvenv_shimcrates and their toolchain types (VENV_TOOLCHAIN,VENV_EXEC_TOOLCHAIN,SHIM_TOOLCHAIN) are gone — their jobs (runtime venv assembly, interpreter-indirection shim) are now done in Starlark. Onlyunpackremains, collapsed into a single crate underpy/tools/unpack/. A future shell-out to theuvCLI could remove that too once the uv toolchain lands.Two-hop symlinks for containers. The venv's per-top-level
site-packages/<name>symlinks were previously calibrated for the runfiles layout only, which brokepy_image_layer's bazel-bin tar walk. They now route through a<venv>/_wheels/<i>/directory alias whose relative targets resolve identically from bazel-bin, runfiles, and inside an OCI container.New test coverage. Four regression tests added for pytest-integration machinery that previously relied only on downstream usage to validate: pytest-xdist parallel-worker execution,
py_pytest_main(chdir = ...)template substitution,COVERAGE_MANIFEST/ LCOV post-processing inpytest_main.py, and multiplepy_binary+py_testtargets sharing a singleexternal_venvpy_venv.tar.bzl upstream fixes
Building
py_image_layeron real-world inputs surfaced two bugs in bazel-contrib/tar.bzl'spreserve_symlinks.awk:make_relative_linkoff-by-one. Emits N../s for a path of N segments; should be N-1. Produces dangling symlinks whose traversal escapes the archive.bin_dirand missed targets under transitioned configs (-ST-<hash>suffix) and underexternal/<repo>/. Also flipsreadlinkorder so authoreddeclare_symlink + target_pathstrings land in the archive verbatim.Both fixes are carried locally in
py/private/modify_mtree.awkvia a small custommtree_preserve_symlinksrule. Once the upstream PRs merge and we bumptar.bzl, the local fork retires andpy_image_layerreverts tomtree_mutate(preserve_symlinks = True).Migration guide
For a walkthrough with before/after examples and a troubleshooting section:
docs/migrating_v1_v2.md.Breaking changes (v2.0.0)
All breaking changes surface with a clear error at analysis time;
bazel test //...after the bump names every callsite that needs updating.Removed macros —
py_venv_binary/py_venv_testCalling either macro now
fail()s. Replacement is plainpy_binary/py_testwithexpose_venv = True, isolated = False. Plainpy_binary(...)with no extra attrs is the common case (analysis-time venv assembly is the default); the two extra attrs exist for callers who need the old split-target shape plus PYTHONPATH-honoring launcher.Removed load paths —
/unstable///py/unstable:{defs,extension}.bzland//uv/unstable:{defs,extension}.bzlare gone. Each nowfail()s at load time with a message pointing at the stable replacement.Removed auto-emitted sibling —
:<name>.venvno longer shows up for freeIn v1.x, every
py_binary(name = "foo")call auto-generated a:foo.venvlink target. v2.0.0 drops the auto-emit. Callers who ranbazel run :foo.venvfor IDE integration must either:expose_venv = Trueon the binary to get a consumable + runnable:foo.venvpy_venv, then declarepy_venv_linkexplicitly if they also need the workspace-materialise behaviour, orpy_venvdeclared separately, or at bazel-bin directly).Changed signature —
py_venv_linkWas: takes
deps/srcs/importsand builds its own internal venv.Now: takes
venv = <label>pointing at an existingpy_venvtarget. Pair withpy_binary(expose_venv = True, ...)or a standalonepy_venvdeclaration.Removed Rust toolchain types
VENV_TOOLCHAIN,VENV_EXEC_TOOLCHAIN, andSHIM_TOOLCHAINare gone — their work (runtime venv staging, interpreter-indirection shim) is now done in Starlark at analysis time. Users who registered overrides for these toolchain types must delete those registrations; leaving them in place produces toolchain-resolution errors.Changed venv internal layout — two-hop site-packages symlinks
The per-top-level
<venv>/lib/python<ver>/site-packages/<top>symlinks now route through an intermediate<venv>/_wheels/<i>/directory alias instead of pointing directly at the wheel's materialised tree. Python-level consumers (site.py, importlib, every normal user) see no difference. Filesystem-walking tools that iterate the venv tree and expect one-hop symlinks — custom tar rules, PEX-style packagers, docker build scripts that walksite-packages/— may need updating. The relative-link depth is the same in bazel-bin, runfiles, and inside an OCI image, so the new shape is strictly more portable than the old one.Unchanged (explicitly, for the skim-reader)
venv = "..."attribute (uv pip-extension wheel selection) is unchanged. The newexternal_venv = :labelis a separately-typed attribute that coexists with it.py_binarycallers who don't opt intoexternal_venv,expose_venv, orisolated = Falsesee no behavioural changes beyond the startup-performance win..<name>.venv/— same as v1.x. IDEs auto-detect the path unchanged.