Summary
reusable-check-python-package-versions.yaml uses pip download "pkg==X" to test whether version X is already published in CodeArtifact. PEP 440's == operator strips local-version labels from candidate versions during matching, so a published build with a local segment (e.g. 0.1.1.dev1+spark.codegen) satisfies the specifier ==0.1.1.dev1. The workflow then concludes the version "already exists" and fails the PR, even though the bare 0.1.1.dev1 is genuinely unpublished and would not collide on upload.
Observed failure
- packages/overture-schema-system/src/overture/schema/system/__about__.py declares __version__ = "0.1.1.dev1".
- CodeArtifact contains overture-schema-system 0.1.1.dev1+spark.codegen (published from the spark-codegen branch).
- The "Fail if any of the new versions already exist in the repo" step runs:
pip download "overture-schema-system==0.1.1.dev1" --index-url $INDEX_URL --no-deps ...
pip finds 0.1.1.dev1+spark.codegen, ignores its local segment, treats it as a match for 0.1.1.dev1, and exits 0. The shell guard interprets that success as "version exists" and fails the workflow.
Root cause
PEP 440 specifier matching, confirmed with the packaging library:
spec: ==0.1.1.dev1
'0.1.1.dev1' -> True
'0.1.1.dev1+spark.codegen' -> True <-- collision
'0.1.1.dev3+spark.codegen' -> False
spec: ===0.1.1.dev1
'0.1.1.dev1' -> True
'0.1.1.dev1+spark.codegen' -> False
'0.1.1.dev3+spark.codegen' -> False
== is documented to ignore local-version labels on candidates when the specifier itself carries none. The check needs literal-string equality against the version we are about to publish, not PEP 440 equality.
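The difference between the two operators can be modeled in a few lines. This is a simplified sketch, not the packaging library's implementation: it assumes both strings are already in normalized PEP 440 form and models only the local-label rule. The helper names are hypothetical.

```python
def matches_version_equality(candidate: str, spec_version: str) -> bool:
    """Simplified model of `==`: the candidate's local label is ignored
    when the specifier version carries no local label of its own."""
    if "+" in spec_version:
        # Specifier has a local label, so the full strings must agree.
        return candidate == spec_version
    public_part = candidate.split("+", 1)[0]
    return public_part == spec_version


def matches_arbitrary_equality(candidate: str, spec_version: str) -> bool:
    """Simplified model of `===`: plain string comparison, local label included."""
    return candidate == spec_version


if __name__ == "__main__":
    target = "0.1.1.dev1"
    for candidate in (
        "0.1.1.dev1",
        "0.1.1.dev1+spark.codegen",
        "0.1.1.dev3+spark.codegen",
    ):
        print(
            candidate,
            "==:", matches_version_equality(candidate, target),
            "===:", matches_arbitrary_equality(candidate, target),
        )
```

Under this model, only the `===` comparison tells the bare version apart from the build that carries a local label, which is the property the existence check needs.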
Recommended actions
1. Switch the existence check to === (primary fix)
.github/workflows/reusable-check-python-package-versions.yaml:158:
-output=\$(uv run pip download "\${package}==\${after}" ...)
+output=\$(uv run pip download "\${package}===\${after}" ...)
=== is PEP 440 arbitrary equality (literal-string match). pip supports it, emits the same "Could not find a version" message on miss, and correctly distinguishes 0.1.1.dev1 from 0.1.1.dev1+spark.codegen. metadata.version() returns the canonical PEP 440 form, so a literal compare against the registry is sound.
2. Fix plural typo in the error-message guard
.github/workflows/reusable-check-python-package-versions.yaml:161:
- "\${output,,}" != *"no matching distributions"*
+ "\${output,,}" != *"no matching distribution"*
pip emits "No matching distribution" (singular). The plural form never matches today; the workflow only passes the miss case because the sibling "could not find a version" check happens to fire first. A future pip wording change to either substring would silently break the guard.
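The mismatch is easy to check against pip's wording. The sketch below mirrors the guard's lowercase substring test in Python. The sample output is illustrative, modeled on pip's usual miss messages for a made-up package (exact text may vary by pip version), and `contains_ci` is a hypothetical helper, not code from the workflow.

```python
# Illustrative pip output for a version that is not in the index.
SAMPLE_MISS_OUTPUT = (
    "ERROR: Could not find a version that satisfies the requirement "
    "example-pkg===1.0 (from versions: none)\n"
    "ERROR: No matching distribution found for example-pkg===1.0\n"
)


def contains_ci(output: str, needle: str) -> bool:
    """Case-insensitive substring test, like `"${output,,}" == *"needle"*` in bash."""
    return needle in output.lower()


if __name__ == "__main__":
    for needle in (
        "could not find a version",
        "no matching distribution",
        "no matching distributions",
    ):
        print(repr(needle), contains_ci(SAMPLE_MISS_OUTPUT, needle))
```

With this sample, the singular substring is found and the plural one is not, so only the "could not find a version" test is doing any work in the current guard.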
3. Skip entries where after is null
When a package is removed between commits, compare() emits {"package": "X", "before": "...", "after": null} and the shell loop runs pip download "X===null". pip finds nothing that satisfies the spec and prints "could not find a version", so the loop passes, but only by accident. Guard the loop so deletions are not treated as new-version publications:
+ if [[ "\$after" == "null" ]]; then
+ echo "Package \${package} was removed; skipping existence check."
+ continue
+ fi
exit_code=0
output=\$(uv run pip download ...)
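The same filter could also be applied before the shell loop sees the records. This is a minimal sketch, assuming the compare() output is serialized as a JSON list of records in the shape shown above; `new_versions_to_check` is a hypothetical helper, not part of the workflow.

```python
import json


def new_versions_to_check(records_json: str) -> list[tuple[str, str]]:
    """Return (package, version) pairs that need an existence check.

    Records whose "after" is null describe a removed package, so they are
    dropped rather than passed to pip as the literal version "null".
    """
    records = json.loads(records_json)
    return [
        (record["package"], record["after"])
        for record in records
        if record.get("after") is not None
    ]


if __name__ == "__main__":
    sample = json.dumps(
        [
            {"package": "kept-pkg", "before": "0.1.0", "after": "0.1.1"},
            {"package": "removed-pkg", "before": "0.2.0", "after": None},
        ]
    )
    print(new_versions_to_check(sample))
```

Filtering at the source keeps the shell loop free of sentinel comparisons, although the in-loop guard remains the smaller change.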