fix(oxml): scope CT_GroupShape.max_shape_id to spTree descendants (post-#52 hotfix)#53
Merged
Merged
Conversation
The xpath `//@id` in `CT_GroupShape.max_shape_id` is absolute — it scopes to the whole document, not the spTree subtree. On slide masters, that pulls in the sentinel ids on `<p:sldLayoutIdLst>/<p:sldLayoutId>` which start at 2147483649 (PowerPoint's authoring convention for layout-master pointer ids — a deliberately-segregated namespace). The next-shape-id calculation `max_shape_id + 1` then returns a value just above `xs:int` (2147483660). PowerPoint quarantines shapes with ids in that range on open, surfacing as a 'Repair' dialog that removes the offending shape and synthesizes replacement parts. The latent bug was harmless until Phase 5 (#52) introduced `MasterShapes.add_textbox` and `SlideMaster.add_text_watermark` — the first public API that *writes* a shape to a slide master. Reads off masters didn't exercise `_next_shape_id`. Fix: change `self.xpath('//@id')` to `self.xpath('.//@id')`. `.//` is the descendant-or-self axis relative to the context node — scopes the search to the shape tree only, leaving sibling-element ids (`p:sldLayoutId`, etc.) outside the max. Tests: - `it_returns_zero_for_max_shape_id_on_an_empty_spTree` — baseline. - `it_finds_the_max_id_among_shape_descendants` — sanity, picks max=42 across three sp ids. - `but_it_ignores_sibling_ids_outside_the_shape_tree` — regression pin: `<p:sldLayoutIdLst>` siblings with ids in the 2147483649+ range do NOT inflate the result. spTree.max_shape_id stays at 2. Verification (local, CPython 3.14.4): - python3 -m pytest tests/ -q → 3654 passed (+3 vs Phase 5 baseline) - python3 -m ruff check src tests → All checks passed! - python3 -m ruff format --check src tests → 216 files already formatted - python3 -m behave features/ --no-color → 1048 scenarios, 0 failed - python3 uat/uat_headers_footers_phase5.py → script runs clean; watermark shape_id=7 (was 2147483660 pre-fix). UAT signoff is maintainer-only per CLAUDE.md §6a — please re-open the regenerated uat/out_headers_footers_phase5.pptx in PowerPoint to confirm. Refs #52.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Hotfix: shape-id overflow on slide masters (post-#52)
PowerPoint flagged
uat/out_headers_footers_phase5.pptxfrom #52 with a "Repair" dialog and removed the watermark shape. Tracked the root cause to a latent bug inCT_GroupShape.max_shape_idthat the newMasterShapes.add_textbox/SlideMaster.add_text_watermarkAPI was the first public path to exercise.Root cause
src/pptx/oxml/shapes/groupshape.py:163usedself.xpath("//@id").//is absolute in XPath — scoped to the whole document, not spTree descendants. On a slide master, the sibling<p:sldLayoutIdLst>contains<p:sldLayoutId id="2147483649+"/>entries — those are pointer ids in a deliberately-segregated namespace (PowerPoint's authoring convention for layout-master references).max_shape_idswallowed those, returning2147483659, and_next_shape_idreturned2147483660for the new watermark shape. PowerPoint's internal handlers treat shape ids asxs:int— anything ≥2^31gets quarantined on open, with the offending shape stripped and replacement parts synthesized (theslideMaster2.xml/slideLayout12.xml/theme2.xmlwe saw in the repair diff).Why dormant until now
Bug has existed in upstream python-pptx since
max_shape_idwas written. Slides and layouts don't have sibling elements with id attrs in the danger range. Only slide masters do, via<p:sldLayoutIdLst>. Phase 5 (#52) introduced the first public API that writes a shape to a slide master (MasterShapes.add_textbox), so the bug only surfaced now.Fix
One-character change:
self.xpath("//@id")→self.xpath(".//@id")..//is the descendant-or-self axis relative to the context node — scopes the search to the shape tree subtree only.Plus a sharpened docstring naming the failure mode and a comment pointing at the surfacing path (
SlideMaster.add_text_watermark) so future-me doesn't reintroduce the broader-scope version.Tests
Three new regression tests in
tests/oxml/shapes/test_groupshape.py:it_returns_zero_for_max_shape_id_on_an_empty_spTree— baseline.it_finds_the_max_id_among_shape_descendants— sanity, picks the max across three sp ids.but_it_ignores_sibling_ids_outside_the_shape_tree— regression pin. Builds a<p:sldMaster>fixture with<p:sldLayoutIdLst>siblings carryingid="2147483649",id="2147483650". AssertsspTree.max_shape_id == 2(the highest shape id), not 2147483650. This is the exact scenario the watermark hit.Verification (local, CPython 3.14.4)
Maintainer visual UAT signoff received — the regenerated
uat/out_headers_footers_phase5.pptxopens cleanly in PowerPoint without the Repair dialog.Refs #52 (Phase 5 of #20).