Skip to content

fix(oxml): scope CT_GroupShape.max_shape_id to spTree descendants (post-#52 hotfix)#53

Merged
MHoroszowski merged 1 commit into
masterfrom
fix/master-shape-id-overflow
May 14, 2026
Merged

fix(oxml): scope CT_GroupShape.max_shape_id to spTree descendants (post-#52 hotfix)#53
MHoroszowski merged 1 commit into
masterfrom
fix/master-shape-id-overflow

Conversation

@MHoroszowski
Copy link
Copy Markdown
Owner

Hotfix: shape-id overflow on slide masters (post-#52)

PowerPoint flagged uat/out_headers_footers_phase5.pptx from #52 with a "Repair" dialog and removed the watermark shape. Tracked the root cause to a latent bug in CT_GroupShape.max_shape_id that the new MasterShapes.add_textbox / SlideMaster.add_text_watermark API was the first public path to exercise.

Root cause

src/pptx/oxml/shapes/groupshape.py:163 used self.xpath("//@id"). // is absolute in XPath — scoped to the whole document, not spTree descendants. On a slide master, the sibling <p:sldLayoutIdLst> contains <p:sldLayoutId id="2147483649+"/> entries — those are pointer ids in a deliberately-segregated namespace (PowerPoint's authoring convention for layout-master references).

max_shape_id swallowed those, returning 2147483659, and _next_shape_id returned 2147483660 for the new watermark shape. PowerPoint's internal handlers treat shape ids as xs:int — anything ≥ 2^31 gets quarantined on open, with the offending shape stripped and replacement parts synthesized (the slideMaster2.xml / slideLayout12.xml / theme2.xml we saw in the repair diff).

Why dormant until now

Bug has existed in upstream python-pptx since max_shape_id was written. Slides and layouts don't have sibling elements with id attrs in the danger range. Only slide masters do, via <p:sldLayoutIdLst>. Phase 5 (#52) introduced the first public API that writes a shape to a slide master (MasterShapes.add_textbox), so the bug only surfaced now.

Fix

One-character change: self.xpath("//@id")self.xpath(".//@id"). .// is the descendant-or-self axis relative to the context node — scopes the search to the shape tree subtree only.

- id_str_lst = self.xpath("//@id")
+ id_str_lst = self.xpath(".//@id")

Plus a sharpened docstring naming the failure mode and a comment pointing at the surfacing path (SlideMaster.add_text_watermark) so future-me doesn't reintroduce the broader-scope version.

Tests

Three new regression tests in tests/oxml/shapes/test_groupshape.py:

  • it_returns_zero_for_max_shape_id_on_an_empty_spTree — baseline.
  • it_finds_the_max_id_among_shape_descendants — sanity, picks the max across three sp ids.
  • but_it_ignores_sibling_ids_outside_the_shape_treeregression pin. Builds a <p:sldMaster> fixture with <p:sldLayoutIdLst> siblings carrying id="2147483649", id="2147483650". Asserts spTree.max_shape_id == 2 (the highest shape id), not 2147483650. This is the exact scenario the watermark hit.

Verification (local, CPython 3.14.4)

python3 -m pytest tests/ -q                       → 3654 passed (+3 vs Phase 5 baseline)
python3 -m ruff check src tests                   → All checks passed!
python3 -m ruff format --check src tests          → 216 files already formatted
python3 -m behave features/ --no-color            → 1048 scenarios, 0 failed
python3 uat/uat_headers_footers_phase5.py         → script runs clean; watermark shape_id=7 (was 2147483660 pre-fix)

Maintainer visual UAT signoff received — the regenerated uat/out_headers_footers_phase5.pptx opens cleanly in PowerPoint without the Repair dialog.

Refs #52 (Phase 5 of #20).

The xpath `//@id` in `CT_GroupShape.max_shape_id` is absolute — it
scopes to the whole document, not the spTree subtree. On slide masters,
that pulls in the sentinel ids on `<p:sldLayoutIdLst>/<p:sldLayoutId>`
which start at 2147483649 (PowerPoint's authoring convention for
layout-master pointer ids — a deliberately-segregated namespace).

The next-shape-id calculation `max_shape_id + 1` then returns a value
just above `xs:int` (2147483660). PowerPoint quarantines shapes with
ids in that range on open, surfacing as a 'Repair' dialog that removes
the offending shape and synthesizes replacement parts.

The latent bug was harmless until Phase 5 (#52) introduced
`MasterShapes.add_textbox` and `SlideMaster.add_text_watermark` — the
first public API that *writes* a shape to a slide master. Reads off
masters didn't exercise `_next_shape_id`.

Fix: change `self.xpath('//@id')` to `self.xpath('.//@id')`. `.//` is
the descendant-or-self axis relative to the context node — scopes the
search to the shape tree only, leaving sibling-element ids
(`p:sldLayoutId`, etc.) outside the max.

Tests:
- `it_returns_zero_for_max_shape_id_on_an_empty_spTree` — baseline.
- `it_finds_the_max_id_among_shape_descendants` — sanity, picks
  max=42 across three sp ids.
- `but_it_ignores_sibling_ids_outside_the_shape_tree` — regression
  pin: `<p:sldLayoutIdLst>` siblings with ids in the 2147483649+
  range do NOT inflate the result. spTree.max_shape_id stays at 2.

Verification (local, CPython 3.14.4):
- python3 -m pytest tests/ -q                       → 3654 passed (+3 vs Phase 5 baseline)
- python3 -m ruff check src tests                   → All checks passed!
- python3 -m ruff format --check src tests          → 216 files already formatted
- python3 -m behave features/ --no-color            → 1048 scenarios, 0 failed
- python3 uat/uat_headers_footers_phase5.py         → script runs clean;
  watermark shape_id=7 (was 2147483660 pre-fix). UAT signoff is
  maintainer-only per CLAUDE.md §6a — please re-open the regenerated
  uat/out_headers_footers_phase5.pptx in PowerPoint to confirm.

Refs #52.
@MHoroszowski MHoroszowski merged commit 0bd67aa into master May 14, 2026
@MHoroszowski MHoroszowski deleted the fix/master-shape-id-overflow branch May 14, 2026 01:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant