docs: custom autosummary template + residual triage (PR 2.5) by igerber · Pull Request #410 · igerber/diff-diff

igerber · 2026-05-10T00:46:16Z

Summary

Closes the Sphinx WARNING gap PR #407 left. PR #407 dropped warnings from 2,170 to 635 (71%) by adding `:no-index:` to all 78 page-level `.. autoclass::` directives. The empirical spike on `bacon.rst` revealed that `:no-index:` on autoclass suppresses CLASS-level cross-reference registration only — it doesn't propagate to autodoc-generated members when `:members:` is implicit (via `autodoc_default_options["members"] = True` in `docs/conf.py:39-44`). 614 of those 635 residuals were member-level duplicates.

PR 2.5 closes both that gap and the 21 non-duplicate residuals so PR 3 (queued: `-W` enforcement in `docs-tests.yml`) can land cleanly.

Result: `make html` is now 0 ERRORs / 0 WARNINGs (down from 2,170 at PR #407 start; 635 at PR #407 merge).

Architecture

User-decided this session: custom autosummary class template that adds `:no-members:` to the autoclass in stubs (Option A). Effect:

Stubs: header + autosummary tables linking to page-level (no inline member docs)
Page-level pages: keep current full inline member rendering (UX preserved)
Class cross-refs `:py:class:DiDResults` → stub URL (per PR docs: add :no-index: to page-level autoclass directives (PR 2 of 2) #407 page-level `:no-index:`)
Member cross-refs `:py:meth:DiDResults.summary` → page-level URL (only place still registered)
Function cross-refs `:py:func:diff_diff.compute_power` → page-level URL (stub now `:no-index:` via custom function.rst template)

Sphinx's `autosummary_generate_overwrite=True` default means a single clean build regenerates all 50 class stubs + 48 function stubs from the new templates — no hand-edit per stub.

Spike-before-bulk discipline

Per `feedback_spike_before_bulk_mechanical_edit` (lesson saved from PR #407): the plan included an explicit empirical spike of just the class.rst template. The spike caught that 41 of 44 remaining residuals were FUNCTION duplicates (not class), surfacing the need for a function.rst template too. The remaining 3 page-vs-page were pinned to specific module-name conflicts (`diff_diff.chaisemartin_dhaultfoeuille`, `diff_diff.stacked_did`, `diff_diff.trop` — function names that shadow module names) and resolved with targeted page-level `:no-index:` on the autofunction. Without the spike, the bulk pass would have committed to a 1-template architecture that left ~44 function dups unhandled.

Constructor visibility note

The new class template removes the explicit `.. automethod:: init` line from stubs (would re-register `init` and defeat the `:no-members:` suppression). The constructor signature still appears on page-level via `autodoc_class_signature = "separated"` (`docs/conf.py:46`). Stubs lose the per-class `init` rendering; page-level keeps it.

Residual triage (Cat A–E)

Five categories of non-duplicate warnings, all fixed:

Cat A — Source docstring (5): `SyntheticDiDResults.in_time_placebo:36` blank-line-before-prose; 4 unreferenced footnotes in `TripleDifference` (added `[1]`/`[2]` refs in narrative)
Cat B — Cross-refs (6): 4 `:doc:` refs to `.md` files that Sphinx can't resolve without `myst-parser` → converted to literal text per existing `docs/choosing_estimator.rst:302` pattern; 2 broken markdown links in T18/T19 notebooks (4 in T19 + 2 broken-rendered REGISTRY.html refs in T18 also fixed at the same time)
Cat C — Orphan citations (3): removed `[CS2021]`, `[AHIW2021]`, `[RR2023]` from `docs/benchmarks.rst` (grep-confirmed safe — no narrative references)
Cat D — Toctree (2): T19 + T20 added to `Tutorials: Business Applications` toctree at `docs/index.rst:80-83`
Cat E — Autosummary entry typos (2): `~DiDResults.ci` → `~DiDResults.conf_int`; removed nonexistent `~CallawaySantAnnaResults.aggregate`

Plan-reviewer found that the original Cat E plan (suggested `suppress_warnings = ["autosummary"]` based on "Rust backend not built") was wrong: Sphinx emits autosummary import-failure warnings without `type=`, so the suppress filter is a no-op. The actual fixes were RST typos.

Page-vs-page function fix (Cat F surprise)

Spike rebuild surfaced 3 final duplicates from `chaisemartin_dhaultfoeuille`, `stacked_did`, `trop` — function names that shadow their module names. The conflict was Python's same-namespace ambiguity: `from .X import X` shadows the module, but Sphinx still registers both. Fix: `:no-index:` on the page-level autofunction (3 sites). Also switched 3 path-qualified autoclass directives to top-level paths (`diff_diff.X.Y` → `diff_diff.Y`) for consistency with project convention (PR #406 export pattern).

Methodology references

N/A - no methodology changes; documentation infrastructure only.

Validation

Tests added/updated: No test changes; doc-snippet test parity preserved (111 passed, 4 skipped — same as PR docs: add :no-index: to page-level autoclass directives (PR 2 of 2) #407 baseline).
Backtest / simulation / notebook evidence: Spike rebuild iterations: baseline 635 warnings → 62 (class template only) → 21 (+ function template) → 18 (+ page-vs-page page-level autofunction `:no-index:`) → 1 (+ Cat A–E triage) → 0 (T18 link fix to `.rst` extension).
HTML spot-checks (verified):
- Page-level `results.html`: 281 (PR docs: add :no-index: to page-level autoclass directives (PR 2 of 2) #407) → 286 (PR 2.5) `<dt` count — slight increase from napoleon docstring rendering tweaks; full member docs preserved
- Stub `diff_diff.DiDResults.html`: class header + autosummary tables only; no inline member `
  ` for class docs (the 21 remaining are napoleon-rendered Attributes section, NOT registered)
- T19 + T20 in tutorials nav: confirmed
- 50 class stubs include `:no-members:`; 48 function stubs include `:no-index:`

Security / privacy

Confirm no secrets/PII in this PR: Yes

Closes the gap PR #407 left: 614 of 635 PR #407-residual warnings were member-level duplicate-object descriptions because :no-index: on page-level autoclass directives doesn't propagate to autodoc-generated members. PR 2.5 adds a custom autosummary class.rst template with :no-members: on the stub autoclass, and a function.rst template with :no-index: on the stub autofunction. All 50 class stubs and 48 function stubs are auto-regenerated on the next clean build (autosummary_generate_overwrite=True default). Architecture: - Class cross-refs resolve to stub URL (per PR #407 page-level :no-index:) - Member cross-refs resolve to page-level URL (only place still registered) - Function cross-refs resolve to page-level URL (stub now :no-index:) - Page-level pages keep current full inline member rendering (UX preserved) - Stubs become "header + autosummary tables linking to page-level" Also addresses 18 non-duplicate residuals so make html now warns 0: - 4 footnote-not-referenced + 1 def-list blank-line in source docstrings - 4 :doc: refs to .md files (myst-parser absent) converted to literal text - 4 broken markdown links to REGISTRY.html in T18/T19 notebooks - 3 orphan benchmarks.rst citations removed - T19 + T20 added to Tutorials: Business Applications toctree - 2 stale autosummary entries fixed (DiDResults.ci, CallawaySantAnnaResults.aggregate) - 3 page-vs-page function dups (chaisemartin_dhaultfoeuille, stacked_did, trop) resolved by adding :no-index: to page-level autofunction directives and switching 3 path-qualified autoclass directives to top-level paths Result: Sphinx make html went 2,170 (PR #407 start) -> 0 warnings (PR 2.5 end). This unblocks PR 3 (queued: -W enforcement in docs-tests.yml). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-10T00:52:06Z

✅ Looks good

Executive Summary

No estimator behavior, math, weighting, variance/SE, identification, or default behavior changes were introduced.
Methodology-impacting code changes are docstring-only and align with the Methodology Registry for SyntheticDiD and TripleDifference.
Autosummary class/function stub changes are broadly consistent and address the tracked Sphinx duplicate-warning tech debt.
I found one P2 documentation/inventory regression: three shadowed convenience functions are now unindexed everywhere.
No secrets found in the non-notebook diff. Full Sphinx build was not run; local import failed because numpy is not installed in this review environment.

Methodology

No P0/P1 findings.

Affected methods checked:

SyntheticDiDResults.in_time_placebo: docstring formatting only at diff_diff/results.py:L1269-L1281; no estimator logic changed. Registry documents the in-time placebo reuse of the fit snapshot normalization at docs/methodology/REGISTRY.md:L1660-L1666.
TripleDifference: reference footnote wiring only at diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481; no estimator logic changed. Registry primary source remains Ortiz-Villavicencio & Sant'Anna at docs/methodology/REGISTRY.md:L1696-L1725.

Code Quality

No P0/P1/P2 findings.

Pattern-wide inference grep found no new inline inference computation in the changed Python hunks. Existing project-wide hits are outside this PR’s changed code.

Performance

No findings.

The PR is documentation/template focused and does not alter runtime estimator paths.

Maintainability

No P0/P1 findings.

P3 — Informational: tracked docs tech debt addressed

Impact: The autosummary refresh and custom templates directly address existing Sphinx duplicate/stale-stub debt already tracked in TODO.md:L134-L139.

Concrete fix: No action required.

Tech Debt

No untracked blocker-level tech debt found.

The remaining limitation around documentation inventory below is not statistical correctness and does not block approval.

Security

No findings.

I grepped the non-notebook diff for common secret patterns and did not find credentials or private keys.

Documentation/Tests

P2 — Shadowed convenience functions lose Python-domain targets

Impact: The three page-level convenience functions that shadow module names are now documented with :no-index:: diff_diff.chaisemartin_dhaultfoeuille at docs/api/chaisemartin_dhaultfoeuille.rst:L159-L160, diff_diff.stacked_did at docs/api/stacked_did.rst:L70-L71, and diff_diff.trop at docs/api/trop.rst:L76-L77. Since these functions are not also present as indexed autosummary stubs, :py:func: cross-references and intersphinx inventory entries for those public convenience functions will not have a function target.

Concrete fix: Keep exactly one indexed target for each shadowed function. A clean shape is to replace the page .. module:: diff_diff.<name> directive with .. currentmodule:: diff_diff.<name> where needed, then remove :no-index: from the .. autofunction:: diff_diff.<name> directive so the public function remains registered without colliding with the module target.

P3 — Validation not locally reproduced

Impact: I did not run make html; this environment is missing runtime docs dependencies (import diff_diff fails on missing numpy). Static audits covered the diff, sibling surfaces, no-index/no-members patterns, workflow-path changes, and secret patterns.

Concrete fix: Rely on CI’s Sphinx build for final confirmation, or run make html in an environment with the docs/runtime dependencies installed.

Reviewer flagged that page-level :no-index: on .. autofunction:: (added in initial PR 2.5 commit to resolve 3 page-vs-page duplicates) made diff_diff.{trop,stacked_did,chaisemartin_dhaultfoeuille} convenience functions absent from the py:function inventory entirely - so :py:func: cross-refs and intersphinx pointers to those public functions wouldn't resolve. Empirical investigation surfaced the actual conflict source: each of the 3 files had a top-level ``.. module:: diff_diff.<name>`` directive that registered the module FQN, colliding with the function FQN of the same name. Removing the .. module:: directive lets .. autofunction:: register the function as the canonical (and only) :py:function entry. The .. autoclass:: directives on those pages were already converted to top-level paths (diff_diff.X form) in the initial PR 2.5 commit and don't depend on the implicit module context. Verified: make html shows 0 ERRORs / 0 WARNINGs / 0 duplicates; all 3 convenience functions present in objects.inv as py:function entries. Doc-snippet test parity preserved (111 passed, 4 skipped). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-05-10T14:07:53Z

/ai-review

github-actions · 2026-05-10T14:11:41Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 477c1c19cc38017481f832d2af5d7c2a022a977f

✅ Looks good

Executive Summary

Previous P2 finding is resolved: diff_diff.chaisemartin_dhaultfoeuille, diff_diff.stacked_did, and diff_diff.trop now have indexed page-level autofunction targets.
No estimator math, weighting, SE/variance, assumptions, defaults, or runtime behavior changed.
Methodology-touching edits are docstring/reference formatting only and match docs/methodology/REGISTRY.md.
Required audits completed: sibling-surface mirror, pattern-wide inference grep, reciprocal indexed-target check, and workflow-path audit. No workflow path changes were in scope.
One P2 documentation inventory issue remains: three plotting helper stubs are now :no-index: without a page-level indexed replacement.

Methodology

No P0/P1 findings.

Severity: P3-informational
Impact: SyntheticDiDResults.in_time_placebo changed only docstring spacing at diff_diff/results.py:L1271-L1281. Registry behavior for normalization, regularization reuse, and infeasibility-only NaN handling matches docs/methodology/REGISTRY.md:L1660-L1669.

Concrete fix: No action required.

Severity: P3-informational
Impact: TripleDifference changed only docstring footnote wiring and adds the Gruber reference at diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481. Registry still identifies Ortiz-Villavicencio & Sant'Anna as the primary implementation source at docs/methodology/REGISTRY.md:L1696-L1705.

Concrete fix: No action required.

Code Quality

No P0/P1/P2 findings.

Pattern-wide grep for inline inference anti-patterns (t_stat = ... / se, if se > 0 else 0.0, compute_p_value, compute_confidence_interval) found no new affected Python hunks. The only Python changes are docstrings.

Performance

No findings.

The PR is documentation and Sphinx-template focused; no runtime estimator path changes were introduced.

Maintainability

No P0/P1 findings.

Severity: P3-informational
Impact: The autosummary template/stub refresh addresses tracked Sphinx duplicate/stale-stub debt in TODO.md:L134-L139. Static checks found all generated class stubs have :no-members:, all generated function stubs have :no-index:, and no generated stub still has .. automethod:: __init__.

Concrete fix: No action required.

Tech Debt

No unmitigated blocker-level tech debt found.

The docs-warning cleanup is already covered by TODO.md:L134-L139. The remaining item below is documentation inventory, not statistical correctness.

Security

No findings.

Secret-pattern grep over changed non-notebook source/docs found no obvious credentials, private keys, or tokens.

Documentation/Tests

Severity: P2
Impact: Three public plotting helpers now have no indexed Python-domain target. Their autosummary stubs are explicitly :no-index: at docs/api/_autosummary/diff_diff.plot_bacon.rst:L6-L7, docs/api/_autosummary/diff_diff.plot_power_curve.rst:L6-L7, and docs/api/_autosummary/diff_diff.plot_pretrends_power.rst:L6-L7, but the reciprocal page-level sweep found no corresponding indexed .. autofunction:: for those names. Nearby sibling pages document other plotting/power helpers but omit these targets: docs/api/visualization.rst:L8-L80, docs/api/power.rst:L108-L148, and docs/api/pretrends.rst:L89-L101.

Concrete fix: Add indexed page-level directives for the three functions, likely diff_diff.plot_bacon in docs/api/visualization.rst, diff_diff.plot_power_curve in docs/api/power.rst, and diff_diff.plot_pretrends_power in docs/api/pretrends.rst; or selectively omit :no-index: from those three stubs if they are intended to be the canonical targets.

Severity: P3-informational
Impact: I did not run make html; the review environment is read-only, and Sphinx builds write output. Static RST/template/indexing audits were completed instead.

Concrete fix: Rely on CI’s Sphinx build, or run make html in a writable docs environment.

R2 reviewer flagged 3 plotting helpers (plot_bacon, plot_power_curve, plot_pretrends_power) as having :no-index: stubs without page-level indexed replacements. Added page-level .. autofunction:: directives in visualization.rst, power.rst, and pretrends.rst. Proactive sweep (per feedback_ci_reviewer_pattern_checks) surfaced 7 sibling dataset loaders (load_card_krueger, load_castle_doctrine, load_divorce_laws, load_mpdta, load_dataset, list_datasets, clear_cache) with the same class of issue: docs/api/datasets.rst used bare names under .. module:: diff_diff.datasets, registering them at diff_diff.datasets.X instead of the canonical diff_diff.X export FQN. Removed the .. module:: directive and switched to fully-qualified diff_diff.X names so they appear at the same FQN as their autosummary listings. Verified: full coverage sweep finds 51/51 functions and 59/59 classes in py:function / py:class inventory under their canonical diff_diff.X FQN. make html shows 0 ERRORs / 0 WARNINGs / 0 duplicates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-05-10T14:55:04Z

/ai-review

github-actions · 2026-05-10T15:00:16Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 0062778c0d6ce44d7e8dcec03fdd6eada4f609d8

✅ Looks good

Executive Summary

No estimator math, weighting, variance/SE, assumptions, or runtime defaults changed.
Methodology-touching hunks are docstring/reference formatting only and match docs/methodology/REGISTRY.md.
Previous P2 plotting-helper issue is resolved: plot_bacon, plot_power_curve, and plot_pretrends_power now have page-level indexed autofunction targets.
Required static audits passed: sibling-surface mirror, no-index function reciprocal targets, autosummary pattern sweep, inference anti-pattern grep, and workflow-scope check.
Only residual item is P3: TODO.md still appears to list Sphinx docs debt this PR addresses.

Methodology

Severity: P3-informational
Impact: SyntheticDiDResults.in_time_placebo changed docstring spacing only at diff_diff/results.py:L1271-L1281. The documented “infeasibility-only NaN” behavior matches docs/methodology/REGISTRY.md:L1667-L1669.
Concrete fix: No action required.

Severity: P3-informational
Impact: TripleDifference only wires existing references into the docstring at diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481. The primary implementation source and assumption checks remain aligned with docs/methodology/REGISTRY.md:L1696-L1705.
Concrete fix: No action required.

Code Quality

No P0/P1/P2 findings.

Pattern-wide grep for inline inference anti-patterns found no new affected Python hunks. The only Python changes are docstrings.

Performance

No findings.

The PR is documentation/Sphinx infrastructure only; no runtime estimator path changes were introduced.

Maintainability

No P0/P1/P2 findings.

Static mirror checks passed: all class stubs have :no-members:, all function stubs have :no-index:, and no generated stub still has .. automethod:: __init__. The reciprocal function-target audit found every no-index function stub has a page-level autofunction target.

Workflow transitive-dependency audit: no workflow or pytest-selection changes were in the supplied diff. Existing docs workflow paths already include docs, package code, tests/test_doc_snippets.py, tests/conftest.py, pyproject.toml, and the workflow itself at .github/workflows/docs-tests.yml:L6-L26.

Tech Debt

Severity: P3-informational
Impact: TODO.md:L134-L139 still lists Sphinx duplicate/stale-autosummary/autodoc debt that this PR appears to address. This is not a correctness issue, but it may mislead future review/planning.
Concrete fix: After CI confirms the Sphinx build is warning-free, update or remove the resolved TODO rows.

Security

No findings.

Secret-pattern scan over changed Python/RST/Markdown surfaces, excluding the notebook carve-out paths, found no obvious credentials or private keys.

Documentation/Tests

No P0/P1/P2 findings.

Severity: P3-informational
Impact: Prior plotting-helper inventory gap is resolved at docs/api/visualization.rst:L174-L179, docs/api/power.rst:L150-L155, and docs/api/pretrends.rst:L103-L108.
Concrete fix: No action required.

Severity: P3-informational
Impact: I could not run the doc-snippet tests because pytest is not installed in this review environment. I also did not run a full Sphinx build.
Concrete fix: Rely on CI’s docs build/doc-snippet job for executable validation.

Per PR #410 R3 reviewer P3-informational: TODO.md still listed Sphinx duplicate-object, autodoc-import-failure, EDiDBootstrapResults cross-ref ambiguity, and stale-autosummary-stubs entries that this PR resolves. Removed: - L134 ~1583 duplicate object description Sphinx warnings (now 0) - L137 Sphinx autodoc fails to import DiDResults.ci, MultiPeriodDiDResults.att, CallawaySantAnnaResults.aggregate (typo fixes + template change) - L138 EDiDBootstrapResults cross-reference ambiguous (no "more than one target" warnings remain in build log) - L139 Tracked autosummary stubs stale + 6 untracked (PR 2.5 regenerates all from the new templates and commits the previously-untracked ones) Kept L135 (.txt AI guides outside CI) and L136 (doc-deps.yaml integrity) since neither is in PR 2.5 scope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-05-10T15:02:55Z

/ai-review

github-actions · 2026-05-10T15:08:35Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 6a55a9a5e3c893ddb500f7831e9f44ba54290e47

✅ Looks good

Executive Summary

No estimator math, weighting, variance/SE, assumptions, defaults, or runtime behavior changed.
Methodology-adjacent Python hunks are docstring/reference formatting only and align with docs/methodology/REGISTRY.md.
Previous re-review items are addressed: Sphinx TODO debt rows were removed, and plot_bacon, plot_power_curve, plot_pretrends_power now have page-level autofunction targets.
Required audits completed: sibling mirror, pattern-wide inference grep, function/class target reciprocity, and workflow-path transitive check.
No P0/P1 blockers found. One minor docs placement issue remains.

Methodology

No P0/P1/P2 findings.

Severity: P3-informational
Impact: diff_diff/results.py:L1273-L1281 only fixes docstring spacing for SyntheticDiDResults.in_time_placebo; the infeasibility-only NaN note matches docs/methodology/REGISTRY.md:L1667-L1669.
Concrete fix: No action required.

Severity: P3-informational
Impact: diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L476-L481 only add footnote references. The implementation contract remains aligned with docs/methodology/REGISTRY.md:L1696-L1715.
Concrete fix: No action required.

Code Quality

Severity: P3
Impact: git diff --check reports trailing whitespace in generated autosummary stubs, e.g. docs/api/_autosummary/diff_diff.ChaisemartinDHaultfoeuille.rst:L9-L14, likely emitted by docs/_templates/autosummary/class.rst:L8-L17. This is patch hygiene only; it should not affect rendered docs.
Concrete fix: Trim whitespace in the Jinja template and regenerate stubs, or normalize generated stubs before commit.

Performance

No findings. The PR changes documentation infrastructure and docstrings only.

Maintainability

No P0/P1/P2 findings.

Static mirror checks passed: class stubs have :no-members:, function stubs have :no-index:, no generated stub still uses .. automethod:: __init__, and every function stub has a page-level autofunction target. No workflow or pytest-selection files changed; existing docs-tests.yml paths include docs, package code, tests/test_doc_snippets.py, tests/conftest.py, pyproject.toml, and the workflow itself.

Tech Debt

No findings.

The prior Sphinx/autosummary TODO rows are gone; remaining review debt starts at TODO.md:L132-L137 and does not include the resolved duplicate-object/stale-stub items.

Security

No findings.

Secret-pattern scan over changed text surfaces, excluding notebook carve-outs, found no credentials or private keys.

Documentation/Tests

Severity: P2
Impact: The new plot_bacon section was inserted before the existing plot_group_time_heatmap example, so the heatmap example now appears under the plot_bacon heading at docs/api/visualization.rst:L174-L196. This makes the rendered docs misleading even though the cross-reference target exists.
Concrete fix: Move the plot_bacon block after the heatmap example, or add a real plot_bacon example and keep the heatmap example under plot_group_time_heatmap.

Severity: P3-informational
Impact: I did not run Sphinx or pytest in this environment; Sphinx is not installed, and the sandbox is read-only. Static audits were run instead.
Concrete fix: Rely on CI’s docs build/doc-snippet validation for executable coverage.

R3 findings: - P2: plot_bacon section was inserted before plot_group_time_heatmap's Example block, so the heatmap example rendered under the plot_bacon heading. Reordered to put plot_bacon AFTER the heatmap example, and added a real plot_bacon example. - P3: docs/_templates/autosummary/class.rst had {% block %} tags with 3-space indent that produced trailing-whitespace blank lines in every generated stub. Switched the block tags to column 0 (no indent) so Jinja control flow doesn't leak indentation into the rendered output. Regenerated all 50 class stubs - now whitespace-clean. Verified: 0 ERRORs / 0 WARNINGs, 0 stubs with trailing whitespace, git diff --check passes, page-level rendering preserved (286 <dt count on api/results.html, 1 class header on stub). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-05-10T15:13:44Z

/ai-review

github-actions · 2026-05-10T15:18:51Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 10eff92a70089de7985d9af5aa9045c9985e9f07

✅ Looks good

Executive Summary

No estimator math, weighting, variance/SE, identification assumptions, or default behavior changed.
Methodology-adjacent changes are docstring/reference formatting only and align with docs/methodology/REGISTRY.md.
Prior findings are addressed: plot_bacon now has its own section/example, stale Sphinx TODO rows are removed, and generated stubs mirror the new templates.
Required audits completed: sibling-surface mirror, pattern-wide inference grep, reciprocal class/function indexing checks, and docs workflow transitive dependency check.
No P0/P1 blockers found.

Methodology

Severity: P3-informational
Impact: diff_diff/results.py:L1273-L1281 only fixes docstring spacing around SyntheticDiDResults.in_time_placebo(). The statement that NaN is emitted only for dimensional infeasibility matches docs/methodology/REGISTRY.md:L1667-L1669.
Concrete fix: No action required.

Severity: P3-informational
Impact: diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481 only add/attach TripleDifference references. The implementation contract remains aligned with docs/methodology/REGISTRY.md:L1696-L1727.
Concrete fix: No action required.

Code Quality

No P0/P1/P2 findings.

Pattern-wide grep found no new inline inference anti-pattern in modified code. Existing matches are diff_diff/utils.py inside safe_inference() itself and an unchanged permutation-test-specific path in diff_diff/diagnostics.py.

Performance

No findings. The PR changes documentation infrastructure, generated RST stubs, and docstrings only.

Maintainability

No P0/P1/P2 findings.

The autosummary mirror is consistent: class stubs use :no-members: via docs/_templates/autosummary/class.rst:L5-L14, function stubs use :no-index: via docs/_templates/autosummary/function.rst:L5-L6, generated class/function examples reflect those templates at docs/api/_autosummary/diff_diff.DiDResults.rst:L4-L7 and docs/api/_autosummary/diff_diff.compute_power.rst:L4-L7, and no generated stub still uses .. automethod:: __init__.

Reciprocal checks also passed: page-level autoclass directives in docs/api/*.rst retain :no-index:, generated function stubs have page-level autofunction targets, and the prior plot_bacon placement issue is fixed at docs/api/visualization.rst:L167-L210.

Tech Debt

No findings.

The prior Sphinx duplicate/stale-stub TODO rows have been removed; the remaining docs/testing debt starts at TODO.md:L130-L137 and does not include the resolved autosummary issues.

Security

No findings.

Secret-pattern scan over changed text surfaces, excluding notebook carve-outs, found no credentials, tokens, or private keys.

Documentation/Tests

Severity: P3-informational
Impact: I could not run Sphinx or pytest in this environment because neither sphinx nor pytest is installed. Static audits covered the changed RST/Python surfaces, template/stub symmetry, broken .md :doc: refs, orphan benchmark citations, and docs workflow transitive paths.
Concrete fix: Rely on CI’s Sphinx/doc-snippet jobs for executable validation.

CI's "Validate RST code snippets" job failed with 2 errors after the plot_bacon section was inserted in docs/api/visualization.rst: 1. block8 (new plot_bacon example) used wrong API: BaconDecomposition.fit() takes first_treat=, not treatment=. Fixed to first_treat='first_treat'. 2. block9 (Plotly Backend, was block8 before insertion) raised NameError on 'results' because it relies on context from earlier blocks. The _CONTEXT_DEPENDENT_SNIPPETS allowlist had api_visualization:block8 which now points to my new plot_bacon block (which is standalone and does NOT need leniency); the actual context-dependent block moved to index 9. Updated allowlist entry to api_visualization:block9. Verified locally: both blocks now pass under PYTHONPATH=. DIFF_DIFF_BACKEND=python pytest Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-05-10T16:46:37Z

/ai-review

github-actions · 2026-05-10T16:52:05Z

🔁 AI review rerun (requested by @igerber)

Head SHA: e25445e87c2bd46182daa183f707490c8413870c

✅ Looks good

Executive Summary

No estimator math, weighting, variance/SE, assumptions, or defaults changed.
Affected methodology surfaces are docstrings only: SyntheticDiDResults.in_time_placebo() and TripleDifference.
Prior issues appear resolved: plot_bacon is documented, stale Sphinx TODO rows are removed, and generated autosummary stubs mirror the new templates.
Required audits completed: sibling-surface mirror, pattern-wide inference grep, reciprocal class/function indexing checks, and docs workflow transitive dependency check.
No unmitigated P0/P1 findings found.

Methodology

Severity: P3-informational
Impact: diff_diff/results.py:L1279-L1281 only clarifies SyntheticDiDResults.in_time_placebo() documentation. The statement matches docs/methodology/REGISTRY.md:L1667-L1669.
Concrete fix: No action required.

Severity: P3-informational
Impact: diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481 add/attach DDD references. The implementation contract remains aligned with docs/methodology/REGISTRY.md:L1696-L1727 and SE summary at docs/methodology/REGISTRY.md:L2885-L2886.
Concrete fix: No action required.

Code Quality

No findings.

Pattern-wide grep found no new inline inference computation in modified code. Existing matches are diff_diff/utils.py:L208 inside safe_inference() itself and diff_diff/diagnostics.py:L635, an unchanged permutation-test-specific path.

Performance

No findings. The PR changes documentation infrastructure, generated RST stubs, and docstrings only.

Maintainability

No findings.

Autosummary symmetry checks passed: class stubs use :no-members: via docs/_templates/autosummary/class.rst:L5-L14, function stubs use :no-index: via docs/_templates/autosummary/function.rst:L5-L6, all 111 docs/api/index.rst autosummary entries have generated stubs, no class stub is missing :no-members:, no function stub is missing :no-index:, and no generated stub still contains automethod:: __init__.

Reciprocal indexing checks passed: page-level autoclass directives in docs/api/*.rst all retain :no-index:, page-level duplicate autofunction targets were not found, and no page-level .. module:: target collides with a page-level .. autofunction:: target.

Tech Debt

No findings.

The stale Sphinx TODO rows are gone; the remaining Testing/Docs debt starts at TODO.md:L130-L137 and does not include the resolved autosummary duplicate/stale-stub issues.

Security

No findings.

Secret-pattern scan over changed text surfaces, excluding the explicitly carved-out notebook paths, found no credentials, tokens, or private keys.

Documentation/Tests

Severity: P3-informational
Impact: I could not execute Sphinx or pytest locally because sphinx, pytest, and numpy are not installed in this environment. Static audits covered template/stub symmetry, page-level indexing, broken .md :doc: patterns, orphan citation cleanup, snippet block renumbering, and docs workflow transitive dependencies.
Concrete fix: Rely on CI for executable Sphinx/doc-snippet validation.

The plot_bacon prior finding is addressed at docs/api/visualization.rst:L191-L210, with the context-dependent snippet skip moved to the new Plotly block at tests/test_doc_snippets.py:L356-L362. The docs workflow includes docs/**, diff_diff/**, tests/test_doc_snippets.py, tests/conftest.py, pyproject.toml, and the workflow file itself at .github/workflows/docs-tests.yml:L6-L26, covering the relevant transitive inputs.

Adds a sphinx-build job to .github/workflows/docs-tests.yml that runs make -C docs html SPHINXOPTS="-W" so warnings become build failures. This locks in the 0-warning state PR #410 achieved and blocks any future warning from sneaking in. The job mirrors the RTD build (.readthedocs.yaml): Python 3.11, pandoc apt package for nbsphinx, and the same docs extras pinned in pyproject.toml [project.optional-dependencies.docs]. Also adds tutorials/21_had_pretest_workflow to the Tutorials: Business Applications toctree at docs/index.rst:84. PR #409 landed this notebook without a toctree entry; -W enforcement caught it on the local smoke test before push. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber added the ready-for-ci Triggers CI test workflows label May 10, 2026

igerber merged commit bed7fbe into main May 10, 2026
23 checks passed

igerber deleted the docs/sphinx-member-duplicate-fix branch May 10, 2026 18:03

igerber mentioned this pull request May 10, 2026

docs: enforce Sphinx -W in CI (Sphinx cleanup PR 3) #413

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: custom autosummary template + residual triage (PR 2.5)#410

docs: custom autosummary template + residual triage (PR 2.5)#410
igerber merged 6 commits intomainfrom
docs/sphinx-member-duplicate-fix

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented May 10, 2026

Summary

Architecture

Spike-before-bulk discipline

Constructor visibility note

Residual triage (Cat A–E)

Page-vs-page function fix (Cat F surprise)

Methodology references

Validation

Security / privacy

Uh oh!

github-actions Bot commented May 10, 2026

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

✅ Looks good

Executive Summary