Fix Windows script wrappers for UTF-8 paths #10946

Open
cyphercodes wants to merge 1 commit into python-poetry:main from cyphercodes:fix-windows-accented-script-paths-10193

@cyphercodes

Pull Request Check List

Fixes #10193

  • Added tests for changed code.
  • Updated documentation for changed code. (Not applicable; behavior-only bug fix.)

Summary

Editable installs on Windows write .cmd script wrappers as UTF-8. When such a wrapper is launched from a console whose active code page is not UTF-8, cmd.exe decodes the file with that code page, so non-ASCII paths can be garbled into mojibake before cmd.exe invokes Python.

This changes the generated .cmd wrapper to switch cmd.exe to UTF-8 before the command containing the Python/script paths is parsed, and adds a regression test covering a Windows-style path containing Área de Trabalho.
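
To illustrate the failure mode described above, here is a minimal sketch of how UTF-8 bytes get misread under a legacy code page. It assumes the console uses OEM code page 850 (a common default on PT-BR systems, but an assumption here), and the path below is a made-up example, not one taken from the PR.

```python
# Hypothetical path containing a non-ASCII component.
path = r"C:\Users\example\Área de Trabalho\venv\Scripts\python.exe"

# The wrapper file stores the path as UTF-8 bytes.
raw = path.encode("utf-8")

# Assumption: cmd.exe decodes the file with code page 850 instead of UTF-8.
garbled = raw.decode("cp850")

# Decoding with the encoding that was actually used recovers the path.
recovered = raw.decode("utf-8")

print(garbled)    # the "Á" is shown as two unrelated characters
print(recovered)  # the original path
```

The two bytes that encode "Á" in UTF-8 are each mapped to a separate cp850 character, so the path cmd.exe passes on no longer names an existing file.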

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 1 issue, and left some high-level feedback:

  • Consider capturing the original console code page and restoring it after invoking Python so that the wrapper doesn’t leave the terminal permanently switched to UTF-8 as a side effect.
  • The hard-coded C:\Users\jmoni\... path in the test looks personal and slightly brittle; using a more generic or tmp_path-derived Windows-style path with non-ASCII components would keep the intent while avoiding user-specific details.
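
One way the first suggestion could look is sketched below. This is a hypothetical variant of the template, not the code in this PR: the name `RESTORING_CMD_TEMPLATE` and the `_OLD_CP` / `_RC` variables are invented for illustration. It also assumes that `chcp` prints a line of the form `<text>: <number>`; on locales where the number is followed by punctuation the parse would need adjusting.

```python
# Hypothetical wrapper template that restores the previous code page.
RESTORING_CMD_TEMPLATE = "\r\n".join(
    [
        "@echo off",
        # Keep the helper variables out of the caller's environment.
        "setlocal",
        # Assumption: chcp prints "<text>: <number>"; keep the part after ":".
        "for /f \"tokens=2 delims=:\" %%a in ('chcp') do set \"_OLD_CP=%%a\"",
        "chcp 65001 >nul",
        '"{python}" "%~dp0\\{script}" %*',
        # Remember Python's exit code before chcp overwrites errorlevel.
        'set "_RC=%errorlevel%"',
        "chcp %_OLD_CP% >nul",
        "exit /b %_RC%",
        "",
    ]
)

rendered = RESTORING_CMD_TEMPLATE.format(
    python=r"C:\venv\Scripts\python.exe", script="demo"
)
print(rendered)
```

The trade-off is a longer wrapper and a dependency on the format of the `chcp` output, which is part of why this is only a sketch.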
## Individual Comments

### Comment 1
<location path="src/poetry/masonry/builders/editable.py" line_range="39-41" />
<code_context>
-WINDOWS_CMD_TEMPLATE = """\
-@echo off\r\n"{python}" "%~dp0\\{script}" %*\r\n
-"""
+# The .cmd wrapper is written as UTF-8. Switch cmd.exe to UTF-8 before it
+# parses paths so non-ASCII virtualenv locations are handled correctly.
+WINDOWS_CMD_TEMPLATE = (
+    '@echo off\r\nchcp 65001 >nul\r\n"{python}" "%~dp0\\{script}" %*\r\n'
+)
</code_context>
<issue_to_address>
**question:** Consider whether unconditionally forcing code page 65001 is safe for all environments that use this wrapper.

Changing the active code page will affect all subsequent commands in that `cmd.exe` session (including subprocesses spawned by Python). Given this wrapper runs in diverse contexts (CI, editors, etc.), please verify that always running `chcp 65001` won’t break cases that rely on a non‑UTF‑8 console. If that’s a risk, consider gating this behind an env flag or only enabling it when non‑ASCII paths are detected.
</issue_to_address>
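
A minimal sketch of the "only when non-ASCII paths are detected" option raised above. The function `render_cmd_wrapper`, its `force_utf8` parameter, and the two template names are hypothetical and do not exist in Poetry; only the template strings mirror the diff.

```python
# Template before this PR (no code page switch).
PLAIN_CMD_TEMPLATE = '@echo off\r\n"{python}" "%~dp0\\{script}" %*\r\n'

# Template from this PR (switches cmd.exe to UTF-8 first).
UTF8_CMD_TEMPLATE = (
    '@echo off\r\nchcp 65001 >nul\r\n"{python}" "%~dp0\\{script}" %*\r\n'
)


def render_cmd_wrapper(python: str, script: str, force_utf8: bool = False) -> str:
    """Render a .cmd wrapper, adding chcp only when it is needed.

    Only text written literally into the file is checked; %~dp0 is
    expanded by cmd.exe at run time and is not part of the file content.
    """
    needs_utf8 = force_utf8 or not (python.isascii() and script.isascii())
    template = UTF8_CMD_TEMPLATE if needs_utf8 else PLAIN_CMD_TEMPLATE
    return template.format(python=python, script=script)


print(render_cmd_wrapper(r"C:\venv\Scripts\python.exe", "demo"))
```

With this shape, wrappers for ASCII-only paths stay byte-for-byte identical to the previous behavior, and the code page side effect is limited to installs that actually need it.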


@dosubot

dosubot Bot commented Jun 8, 2026

Related Knowledge

1 document with suggested updates is ready for review.

Python Poetry

CHANGELOG /poetry/blob/main/CHANGELOG.md — ⏳ Awaiting Merge

@cyphercodes cyphercodes force-pushed the fix-windows-accented-script-paths-10193 branch from c79692e to 97e6fcf Compare June 8, 2026 03:50
@dimbleby

dimbleby commented Jun 8, 2026

Contributor

Calling chcp seems like overreach, and a quick Google search will find reasons not to do it, e.g. here


Development

Successfully merging this pull request may close these issues.

Issue with Accented Characters in Script Paths Generated by [project.scripts] on Windows 11 (PT-BR)

2 participants