Conversation
|
| seen_labels = {} | ||
| all_pkgs = [] | ||
| for pkg in info.pip_packages.to_list(): | ||
| if pkg.label not in seen_labels: | ||
| seen_labels[pkg.label] = True | ||
| all_pkgs.append(pkg) | ||
|
|
||
| pkg_by_label = {pkg.label: pkg for pkg in all_pkgs} |
There was a problem hiding this comment.
This code is doing a bit of double work. It manually filters for unique labels using a dictionary and a list, then immediately builds another dictionary at the end.
Since the final goal seems to be creating a dictionary mapped by labels (pkg_by_label), we can skip the intermediate list and the "seen" check entirely to reduce the memory usage.
Suggestion:
Option 1; (Shorter version, that keeps the last occurrence of each package)
| seen_labels = {} | |
| all_pkgs = [] | |
| for pkg in info.pip_packages.to_list(): | |
| if pkg.label not in seen_labels: | |
| seen_labels[pkg.label] = True | |
| all_pkgs.append(pkg) | |
| pkg_by_label = {pkg.label: pkg for pkg in all_pkgs} | |
| pkg_by_label = {pkg.label: pkg for pkg in info.pip_packages.to_list()} | |
| all_pkgs = pkg_by_label.values() |
Option 2: (Slightly modified version that keeps the first occurrence of each package)
| seen_labels = {} | |
| all_pkgs = [] | |
| for pkg in info.pip_packages.to_list(): | |
| if pkg.label not in seen_labels: | |
| seen_labels[pkg.label] = True | |
| all_pkgs.append(pkg) | |
| pkg_by_label = {pkg.label: pkg for pkg in all_pkgs} | |
| pkg_by_label = {} | |
| for pkg in info.pip_packages.to_list(): | |
| if pkg.label not in pkg_by_label: | |
| pkg_by_label[pkg.label] = pkg | |
| all_pkgs = pkg_by_label.values() | |
| `` |
There was a problem hiding this comment.
oh nice, i actually did not write this code, so probably there is a lot of optimizations here.
There was a problem hiding this comment.
This new version is really nice! It provides some great optimizations considering the previews version. These are just suggestions to improve even it further.
| all_tars.append(squashed_tar) | ||
| dep_tars.append(squashed_tar) | ||
|
|
||
| source_tar = _declare_group_tar( | ||
| ctx, bsdtar, bsdtar_files, | ||
| "{}_default.tar.gz".format(ctx.attr.name), | ||
| "default", | ||
| info.source_files, | ||
| _source_map, | ||
| "Creating source layer for %s" % ctx.label, | ||
| ) | ||
| all_tars.append(source_tar) |
There was a problem hiding this comment.
Suggestion: Eliminate Redundant Array Appends in _py_image_layer_impl
In _py_image_layer_impl, tar_out items are manually appended to both all_tars and dep_tars quite often.
Since dep_tars is essentially identical to all_tars except for the source_tar appended at the very end, you can entirely drop the dep_tars variable during iteration and simply clone the list at the end.
Proposed solution:
- Just append to all_tars normally inside your loops
- Before appending the source layer, clone
all_tarsto act asdep_tars
| all_tars.append(squashed_tar) | |
| dep_tars.append(squashed_tar) | |
| source_tar = _declare_group_tar( | |
| ctx, bsdtar, bsdtar_files, | |
| "{}_default.tar.gz".format(ctx.attr.name), | |
| "default", | |
| info.source_files, | |
| _source_map, | |
| "Creating source layer for %s" % ctx.label, | |
| ) | |
| all_tars.append(source_tar) | |
| dep_tars = list(all_tars) | |
| source_tar = _declare_group_tar( | |
| ctx, bsdtar, bsdtar_files, | |
| "{}_default.tar.gz".format(ctx.attr.name), | |
| "default", | |
| info.source_files, | |
| _source_map, | |
| "Creating source layer for %s" % ctx.label, | |
| ) | |
| all_tars.append(source_tar) |
| deps = ["//py/private/pytest_shard"], | ||
| ) | ||
|
|
||
| _rules_python_py_binary( |
There was a problem hiding this comment.
We need to move toward removing the rules_python dep. What is the reason for this?
| imports = [".."], | ||
| # This library contributes to the container test, testing we can pull in and use a library from another | ||
| # package in the repo. | ||
| visibility = [ |
|
|
||
| _TAR_TOOLCHAIN = "@tar.bzl//tar/toolchain:type" | ||
|
|
||
| _WHL_INSTALL_PREFIX = "aspect_rules_py++uv+whl_install__" |
There was a problem hiding this comment.
bazel 8 vs 9 have different separators (+ vs ~) so this will fail


Changes are visible to end-users: yes
TODO
Test plan
TODO