Skip to content

feat: add builtin uninstaller as alternative to calling pip#10904

Draft
radoering wants to merge 5 commits into
python-poetry:mainfrom
radoering:builtin-uninstaller
Draft

feat: add builtin uninstaller as alternative to calling pip#10904
radoering wants to merge 5 commits into
python-poetry:mainfrom
radoering:builtin-uninstaller

Conversation

@radoering

Copy link
Copy Markdown
Member

Pull Request Check List

  • Added tests for changed code.
  • Updated documentation for changed code.

Introduce a builtin uninstaller (adapted from pip's code) so that we do not have to run a pip subprocess for each uninstall. This makes uninstalls (and updates) faster. The new builtin uninstaller is not used by default. You have to set installer.builtin-uninstall to use it. (We can switch it later and remove the setting even later, just as we did with installer.modern-installation.)

The second commit makes use of the rollback mechanism in case of an error. Further, it fixes an issue in the legacy pip path where the return code of pip is not checked.

radoering added 2 commits May 17, 2026 11:57
- legacy pip uninstall: propagate pip return code in case of failed uninstalls
- new builtin uninstall: rollback uninstall in case of failed install

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 5 issues, and left some high level feedback:

  • The new UninstallPathSet/StashedUninstallPathSet logic relies on dist._path (a private attribute of importlib.metadata.Distribution); if possible, consider switching to a public API or adding a clear compatibility comment/guard around this to reduce the risk of breakage with future Python versions.
  • In StashedUninstallPathSet, _moves is documented as (old path, new path) but rollback() iterates as for new_path, path in self._moves, effectively swapping semantics; aligning the variable names and/or the comment with the actual ordering would make the rollback behavior easier to reason about and maintain.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The new `UninstallPathSet`/`StashedUninstallPathSet` logic relies on `dist._path` (a private attribute of `importlib.metadata.Distribution`); if possible, consider switching to a public API or adding a clear compatibility comment/guard around this to reduce the risk of breakage with future Python versions.
- In `StashedUninstallPathSet`, `_moves` is documented as `(old path, new path)` but `rollback()` iterates as `for new_path, path in self._moves`, effectively swapping semantics; aligning the variable names and/or the comment with the actual ordering would make the rollback behavior easier to reason about and maintain.

## Individual Comments

### Comment 1
<location path="src/poetry/installation/uninstaller.py" line_range="209-217" />
<code_context>
+    def norm_join(*a: str) -> str:
+        return os.path.normcase(os.path.join(*a))
+
+    for root in unchecked:
+        if any(os.path.normcase(root).startswith(w) for w in wildcards):
+            # This directory has already been handled.
+            continue
+
+        all_files: set[str] = set()
+        all_subdirs: set[str] = set()
+        for dirname, subdirs, files in os.walk(root):
+            all_subdirs.update(norm_join(root, dirname, d) for d in subdirs)
+            all_files.update(norm_join(root, dirname, f) for f in files)
+        # If all the files we found are in our remaining set of files to
</code_context>
<issue_to_address>
**issue (bug_risk):** Path joining in compress_for_rename double-prefixes root, producing incorrect paths

In `compress_for_rename`, `os.walk(root)` already yields `dirname` as an absolute path under `root`. Using `norm_join(root, dirname, ...)` produces `root/root/...` paths that don’t exist and will break the `all_files - remaining` logic. The joins should use `dirname` only: `norm_join(dirname, d)` / `norm_join(dirname, f)`.
</issue_to_address>

### Comment 2
<location path="src/poetry/installation/uninstaller.py" line_range="184-186" />
<code_context>
+
+    sep = os.path.sep
+    short_paths: set[str] = set()
+    for path in sorted(paths, key=len):
+        should_skip = any(
+            path.startswith(shortpath.rstrip("*"))
+            and path[len(shortpath.rstrip("*").rstrip(sep))] == sep
+            for shortpath in short_paths
</code_context>
<issue_to_address>
**issue (bug_risk):** Possible IndexError and incorrect prefix handling in compact() short-path check

In `compact`, `path[len(shortpath.rstrip("*").rstrip(sep))]` can raise `IndexError` when `path` is equal to or shorter than the stripped `shortpath`, since there’s no `len(path) > len(prefix)` guard. Because this runs on filesystem paths during uninstall, that would abort uninstalls. Compute `prefix = shortpath.rstrip("*").rstrip(sep)` once, guard with `len(path) > len(prefix)`, and prefer something like `path.startswith(prefix + sep)` over direct index access.
</issue_to_address>

### Comment 3
<location path="tests/installation/test_uninstaller.py" line_range="11" />
<code_context>
+from poetry.installation.uninstaller import StashedUninstallPathSet
+from poetry.installation.uninstaller import UninstallPathSet
+from poetry.installation.uninstaller import _normalize_path
+from poetry.installation.uninstaller import _uninstallation_paths
+from poetry.installation.uninstaller import compact
+from poetry.installation.uninstaller import compress_for_rename
</code_context>
<issue_to_address>
**suggestion (testing):** Add a focused unit test for `_uninstallation_paths` to verify .pyc/.pyo sibling generation

`_uninstallation_paths` is only covered indirectly via `uninstall_distribution`. Please add a small, direct unit test that feeds a synthetic list of `PackagePath` objects (including at least one `.py`) and asserts that the yielded paths include both the `.py` entry and the corresponding `.pyc`/`.pyo`. This will make the helper’s behaviour explicit and protect against regressions if RECORD contents or `UninstallPathSet.add` change later.

Suggested implementation:

```python
import csv
from importlib.metadata import PackagePath

from pathlib import Path
from typing import TYPE_CHECKING

```

```python
def _make_env(tmp_path: Path) -> MockEnv:
    env_path = tmp_path / "env"


def test__uninstallation_paths_generates_bytecode_siblings(tmp_path: Path) -> None:
    """
    _uninstallation_paths should include .pyc/.pyo siblings for .py files.

    This uses a synthetic set of PackagePath entries (not a real distribution)
    to exercise the helper directly and make its behaviour explicit.
    """
    env = _make_env(tmp_path)
    # In this test we only care about the uninstallation path generation logic,
    # so we can treat the environment's site-packages directory as the root.
    site_packages = Path(env._path)  # Adjust if MockEnv exposes site-packages differently

    py_rel = PackagePath("sample_module.py")
    non_py_rel = PackagePath("data.txt")
    package_paths = [py_rel, non_py_rel]

    # Call the helper directly with synthetic PackagePath objects.
    # The exact signature of _uninstallation_paths may differ; adjust the
    # arguments as needed to match the implementation.
    uninstall_paths = list(_uninstallation_paths(package_paths, root=site_packages))

    # Normalise all returned paths to Path objects for comparison.
    normalized_paths = {Path(p) for p in uninstall_paths}

    source = site_packages / str(py_rel)
    pyc = source.with_suffix(".pyc")
    pyo = source.with_suffix(".pyo")

    # Ensure the original .py is present...
    assert source in normalized_paths
    # ...and that both bytecode siblings are included as well.
    assert pyc in normalized_paths
    assert pyo in normalized_paths

    # Non-.py entries should be present but not produce extra siblings.
    non_py_source = site_packages / str(non_py_rel)
    assert non_py_source in normalized_paths
    assert non_py_source.with_suffix(".pyc") not in normalized_paths
    assert non_py_source.with_suffix(".pyo") not in normalized_paths

```
</issue_to_address>

### Comment 4
<location path="tests/installation/test_uninstaller.py" line_range="220-229" />
<code_context>
+    assert not (Path(scripts_dir) / "demo-cli").exists()
+
+
+def test_rollback_restores_files(tmp_path: Path) -> None:
+    env = _make_env(tmp_path)
+    _, installed = _install_fake_distribution(env)
+
+    pathset = uninstall_distribution(env, "demo")
+    assert pathset is not None
+
+    for path in installed:
+        if path.is_file():
+            assert not path.exists()
+
+    pathset.rollback()
+
+    for path in installed:
+        assert path.exists(), f"{path} should have been restored"
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test for rolling back when there is nothing to roll back

`rollback()` is already tested after a real uninstall, but both `StashedUninstallPathSet.rollback()` and `UninstallPathSet.rollback()` have a no-op path when `can_rollback` is false. Please add a small test that calls `UninstallPathSet.rollback()` on a freshly created pathset (without `remove()`) and asserts it doesn’t raise and logs the expected error, so that branch is covered and the edge case is verified.

Suggested implementation:

```python
    pathset.commit()
    assert not (Path(scripts_dir) / "demo-cli").exists()


def test_uninstall_pathset_rollback_without_remove_logs(caplog) -> None:
    pathset = UninstallPathSet()

    with caplog.at_level(logging.ERROR):
        pathset.rollback()

    assert any(
        "rollback" in message.lower() for message in caplog.messages
    ), "Expected rollback-related log message when rolling back without remove()"


import logging
import csv

```

You may need to adjust the log level (`logging.ERROR` here) and/or the substring asserted in the message (`"rollback"`) to match the actual behavior of `UninstallPathSet.rollback()` when `can_rollback` is false. If the implementation uses a different level (e.g. WARNING) or a specific message, update the `caplog.at_level(...)` and the `any(...)` predicate accordingly. Also, if your codebase enforces type annotations for fixtures, you can add an appropriate type hint for `caplog` (e.g. importing `LogCaptureFixture` under a `TYPE_CHECKING` guard and annotating `caplog: LogCaptureFixture`).
</issue_to_address>

### Comment 5
<location path="tests/installation/test_uninstaller.py" line_range="307-316" />
<code_context>
+    assert not stashed.can_rollback
+
+
+def test_compress_for_rename_collapses_full_directory(tmp_path: Path) -> None:
+    pkg = tmp_path / "pkg"
+    pkg.mkdir()
+    f1 = pkg / "a.py"
+    f1.touch()
+    f2 = pkg / "b.py"
+    f2.touch()
+
+    result = compress_for_rename([str(f1), str(f2)])
+    assert any(r.rstrip("/\\").endswith("pkg") for r in result)
</code_context>
<issue_to_address>
**suggestion (testing):** You may want a companion test where compress_for_rename keeps both directory and file paths

The existing test covers the full-directory case where all files are included and the path collapses to the directory. Please also add a test where only some files in the directory are passed to `compress_for_rename`, and verify it returns just those file paths (no directory wildcard). This will lock in the expected partial-directory behavior and help prevent unintended directory-level moves during uninstall.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +209 to +217
for root in unchecked:
if any(os.path.normcase(root).startswith(w) for w in wildcards):
# This directory has already been handled.
continue

all_files: set[str] = set()
all_subdirs: set[str] = set()
for dirname, subdirs, files in os.walk(root):
all_subdirs.update(norm_join(root, dirname, d) for d in subdirs)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Path joining in compress_for_rename double-prefixes root, producing incorrect paths

In compress_for_rename, os.walk(root) already yields dirname as an absolute path under root. Using norm_join(root, dirname, ...) produces root/root/... paths that don’t exist and will break the all_files - remaining logic. The joins should use dirname only: norm_join(dirname, d) / norm_join(dirname, f).

Comment thread src/poetry/installation/uninstaller.py Outdated
Comment on lines +184 to +186
for path in sorted(paths, key=len):
should_skip = any(
path.startswith(shortpath.rstrip("*"))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Possible IndexError and incorrect prefix handling in compact() short-path check

In compact, path[len(shortpath.rstrip("*").rstrip(sep))] can raise IndexError when path is equal to or shorter than the stripped shortpath, since there’s no len(path) > len(prefix) guard. Because this runs on filesystem paths during uninstall, that would abort uninstalls. Compute prefix = shortpath.rstrip("*").rstrip(sep) once, guard with len(path) > len(prefix), and prefer something like path.startswith(prefix + sep) over direct index access.

from poetry.installation.uninstaller import StashedUninstallPathSet
from poetry.installation.uninstaller import UninstallPathSet
from poetry.installation.uninstaller import _normalize_path
from poetry.installation.uninstaller import _uninstallation_paths

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add a focused unit test for _uninstallation_paths to verify .pyc/.pyo sibling generation

_uninstallation_paths is only covered indirectly via uninstall_distribution. Please add a small, direct unit test that feeds a synthetic list of PackagePath objects (including at least one .py) and asserts that the yielded paths include both the .py entry and the corresponding .pyc/.pyo. This will make the helper’s behaviour explicit and protect against regressions if RECORD contents or UninstallPathSet.add change later.

Suggested implementation:

import csv
from importlib.metadata import PackagePath

from pathlib import Path
from typing import TYPE_CHECKING
def _make_env(tmp_path: Path) -> MockEnv:
    env_path = tmp_path / "env"


def test__uninstallation_paths_generates_bytecode_siblings(tmp_path: Path) -> None:
    """
    _uninstallation_paths should include .pyc/.pyo siblings for .py files.

    This uses a synthetic set of PackagePath entries (not a real distribution)
    to exercise the helper directly and make its behaviour explicit.
    """
    env = _make_env(tmp_path)
    # In this test we only care about the uninstallation path generation logic,
    # so we can treat the environment's site-packages directory as the root.
    site_packages = Path(env._path)  # Adjust if MockEnv exposes site-packages differently

    py_rel = PackagePath("sample_module.py")
    non_py_rel = PackagePath("data.txt")
    package_paths = [py_rel, non_py_rel]

    # Call the helper directly with synthetic PackagePath objects.
    # The exact signature of _uninstallation_paths may differ; adjust the
    # arguments as needed to match the implementation.
    uninstall_paths = list(_uninstallation_paths(package_paths, root=site_packages))

    # Normalise all returned paths to Path objects for comparison.
    normalized_paths = {Path(p) for p in uninstall_paths}

    source = site_packages / str(py_rel)
    pyc = source.with_suffix(".pyc")
    pyo = source.with_suffix(".pyo")

    # Ensure the original .py is present...
    assert source in normalized_paths
    # ...and that both bytecode siblings are included as well.
    assert pyc in normalized_paths
    assert pyo in normalized_paths

    # Non-.py entries should be present but not produce extra siblings.
    non_py_source = site_packages / str(non_py_rel)
    assert non_py_source in normalized_paths
    assert non_py_source.with_suffix(".pyc") not in normalized_paths
    assert non_py_source.with_suffix(".pyo") not in normalized_paths

Comment on lines +220 to +229
def test_rollback_restores_files(tmp_path: Path) -> None:
env = _make_env(tmp_path)
_, installed = _install_fake_distribution(env)

pathset = uninstall_distribution(env, "demo")
assert pathset is not None

for path in installed:
if path.is_file():
assert not path.exists()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider adding a test for rolling back when there is nothing to roll back

rollback() is already tested after a real uninstall, but both StashedUninstallPathSet.rollback() and UninstallPathSet.rollback() have a no-op path when can_rollback is false. Please add a small test that calls UninstallPathSet.rollback() on a freshly created pathset (without remove()) and asserts it doesn’t raise and logs the expected error, so that branch is covered and the edge case is verified.

Suggested implementation:

    pathset.commit()
    assert not (Path(scripts_dir) / "demo-cli").exists()


def test_uninstall_pathset_rollback_without_remove_logs(caplog) -> None:
    pathset = UninstallPathSet()

    with caplog.at_level(logging.ERROR):
        pathset.rollback()

    assert any(
        "rollback" in message.lower() for message in caplog.messages
    ), "Expected rollback-related log message when rolling back without remove()"


import logging
import csv

You may need to adjust the log level (logging.ERROR here) and/or the substring asserted in the message ("rollback") to match the actual behavior of UninstallPathSet.rollback() when can_rollback is false. If the implementation uses a different level (e.g. WARNING) or a specific message, update the caplog.at_level(...) and the any(...) predicate accordingly. Also, if your codebase enforces type annotations for fixtures, you can add an appropriate type hint for caplog (e.g. importing LogCaptureFixture under a TYPE_CHECKING guard and annotating caplog: LogCaptureFixture).

Comment on lines +307 to +316
def test_compress_for_rename_collapses_full_directory(tmp_path: Path) -> None:
pkg = tmp_path / "pkg"
pkg.mkdir()
f1 = pkg / "a.py"
f1.touch()
f2 = pkg / "b.py"
f2.touch()

result = compress_for_rename([str(f1), str(f2)])
assert any(r.rstrip("/\\").endswith("pkg") for r in result)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): You may want a companion test where compress_for_rename keeps both directory and file paths

The existing test covers the full-directory case where all files are included and the path collapses to the directory. Please also add a test where only some files in the directory are passed to compress_for_rename, and verify it returns just those file paths (no directory wildcard). This will lock in the expected partial-directory behavior and help prevent unintended directory-level moves during uninstall.

@radoering radoering marked this pull request as draft May 17, 2026 10:14
pass


class _TempDirectory:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from tempfile import TemporaryDirectory and drop this?

Comment thread src/poetry/installation/uninstaller.py Outdated
yield os.path.join(dn, base + ".pyo")


def compact(paths: Iterable[str]) -> set[str]:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my AI friend claims that this is redundant and that compress_for_rename() already returns a minimal, non-overlapping set:

  • wildcards are added only for roots whose every on-disk file is in the
    input, and roots are processed shortest-first, so a wildcard is never
    nested under another wildcard;
  • a file remains in the 'remaining' set only if its parent (which is
    always in 'unchecked') did not get a wildcard, so it cannot be under
    any wildcard either.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My AI friend said:

Short answer: No, it's not redundant - though in the common case it's effectively a no-op.

  Here's why. remove() calls them in sequence at uninstaller.py:400-402:

  for_rename = compress_for_rename(self._paths)
  for path in sorted(compact(for_rename)):
      moved.stash(path)

  compress_for_rename() already does most of the compaction work: when every file on disk under a directory is in the
  removal set, it collapses them into a single dir/ wildcard and drops the individual files from remaining
  (uninstaller.py:226-228). And its guard at line 214 prevents nested wildcards (roots are processed shortest-first, so a
  child dir under an already-wildcarded parent is skipped). So for an all-regular-files package, compress_for_rename returns
   a minimal set and compact() changes nothing.

  The gap compact() covers: compress_for_rename only subtracts entries that os.walk reports as files (all_files, line 222).
  Anything os.walk classifies as a subdir - most notably a symlink to a directory, or an empty directory listed in RECORD -
  lands in all_subdirs instead and is never removed from remaining. So you can get output like {/pkg/, /pkg/symlinked_dir}:
  the parent wildcard plus a child entry it failed to subtract.

  That's a real problem without compact(), because of the sort order at line 402: /pkg/ sorts before /pkg/symlinked_dir, so
  remove() would stash the whole /pkg/ directory first, then try to stash /pkg/symlinked_dir \u2014 which no longer exists at its
   original location, causing _renames() to fail. compact() (uninstaller.py:180-196) drops the child because it's prefixed
  by the wildcard parent, preventing the double-stash.

  So: redundant for the typical RECORD of plain files, but a genuine safety net for directory-typed entries (symlinked/empty
   dirs). It also matches pip's original code, which the module comment says it deliberately mirrors.

That may be a theoretical issue. I do not think that such wheels are possible (yet). However, it cannot hurt to be robust. Nevertheless, I think only a minimal edit in compress_for_rename() is necessary to make compact truly redundant. I added the change (and a test) in a separate commit for now.

Comment thread src/poetry/installation/uninstaller.py
shutil.rmtree(self._path, ignore_errors=True)


class _AdjacentTempDirectory(_TempDirectory):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty sure we could drop this without causing damage. AIUI the point is to stay on the same filesystem as the original as some sort of perf optimisation?

Looks like pdm has been through the same copy-from-pip-and-remove-legacy-noise approach - here - and they do not have an _AdjacentTempDirectory

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it makes sense to me. In cases where the temporary directory is on another filesystem it should make a significant difference, shoudn't it?

I would give more weight to pip than to pdm in this case because of its broader usage.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... it should make a significant difference, shoudn't it?

I don't know. Maybe?

Another comparison point: as best I can tell uv does not bother with any of this at all. It just removes files.

That's going to win on simplicity and presumably also on performance, is the pip commit/rollback dance even useful?!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the pip behaviour dates from a time before virtual environments were so universal, and disposable.

I speculate that the case uv might make for not taking the extra care is: in the rare case that the uninstall leaves your venv in an halfway state - just delete it and start over. So long as we reported that something went wrong, we don't have to work so hard to avoid that happening.

@dimbleby dimbleby May 29, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some archaeology in the pip repo:

  • the move/rollback design dates to 2009
    • of course people had good ideas in 2009 also! But it's at least plausible that whatever drove pip back then is not so relevant for poetry today
  • AdjacentTemporaryDirectory came in 2018 and the motivation was not performance at all, rather it was to work around accidentally using too-long paths on Windows - Fixes #3055 Uninstall causes paths to exceed MAX_PATH limit pypa/pip#6029

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. On a second thought, I agree that we probably do not need the commit/rollback behavior, which adds some complexity. I have to admit, this feature initially confused me when investigating how pip uninstalls packages. Thus, it is probably a good thing to omit if we do not need it.

In my experience, the most common case where an uninstall fails, is on Windows when another process is blocking a file that should be removed. In that case, it should be fine to error out. If we make sure that the .dist-info directory is the last one to delete, the user can even kill the process and try again.

I will open a PR with an alternative implementation shortly.

Comment thread src/poetry/installation/executor.py Outdated
@radoering

Copy link
Copy Markdown
Member Author

For future reference: We probably also have to patch importlib-metadata because of python/importlib_metadata#535

Otherwise, script files are not uninstalled on Windows with Python 3.10 and 3.11. This is even an issue, although we do not use importlib-metadata directly anymore because it is a transient dependency and replaces the original importlib.metadata.DistributionFinder in sys.meta_path and thereby the Distribution class that is returned when searching for a distribution. pip does not seem to suffer from this issue, either because it does not depend on a recent enough version of importlib-metadata or due to the various legacy branches and fallbacks I removed when adopting the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants