Deck review tooling, stronger search ranking, light-deck default, and math rendering #18

Merged
JE-Chen merged 16 commits into main from dev
Jun 14, 2026
Conversation

JE-Chen (Member) commented Jun 14, 2026

Merges the accumulated dev work into main. Headline changes, grouped by area.

Search

  • Relevance ranking now does conservative English stemming, a phrase-adjacency bonus, a small acronym/expansion synonym map, and CJK bigrams. "transformer" matches "transformers", multi-word queries reward adjacency, "llm" matches "large language model", and Chinese / Japanese / Korean queries get a real relevance signal (previously none). A golden-query regression set pins relevance dominance over high-citation off-topic papers.
  • Automatic paper-research workflow improvements.

Deck review (new)

  • review_deck bundles three audits into one call: slide overflow, the dark-mode / no-red / contrast colour contracts, and paper_rule seven-section completeness. Exposed as the CLI `python -m thesisagents review <deck.pptx> [--lang] [--json]` and the MCP `pptx_review` tool.
  • The overflow and contrast checks moved into the package (exporters/overflow.py, exporters/audit.py). The old scripts/ entry points are now thin re-export wrappers, so existing imports keep working.

Deck generation

  • The light navy-band deck is now the default. Dark mode is opt-in via --dark-mode / dark_mode=True.
  • Inline math renders as real subscripts / superscripts, including in KPI values and table cells.
  • The over-cap "(+N more)" truncation marker is localised across all 14 locales. Evaluation, limitations, and pain-point sections that exceed the per-cell bullet cap now paginate full-width instead of dropping content.
  • Thesis-defence deck authoring support.

Rules and docs

  • Expanded the deck-design / slide-deck / paper-writing subagent rules (plain-language comprehensibility, math rendering, deck length, figure design, structural slides, colour accessibility).
  • Refreshed the README / MCP / CLI / Sphinx docs, including fixing the stale dark_mode default (it is off, not on).

Quality

  • Resolved SonarCloud findings, reduced test duplication, and added regression tests across ranking, review, overflow, contrast, i18n, and content preservation.

Verification

  • `pytest` (639 passing), `ruff check .`, and `bandit -c pyproject.toml -r thesisagents` are all clean locally.

Note: merging to main triggers the release workflow.

JE-Chen added 16 commits June 6, 2026 22:10
Fill real, high-value gaps in the authoring rule docs (each with the project's
required Why + example + anti-pattern format):

- paper_rule: add "Verb tense" — the per-section tense map (abstract /
  related-work / method / experiment / conclusion / future-work). A wrong tense
  is a top "this wasn't written by a researcher" tell, and it was only covered
  by one Abstract sentence before.
- paper_rule: add "Reporting numbers and statistics" — significant figures,
  percentage-point vs relative %, p-value format, uncertainty, unit
  consistency. The no-fabrication rule governed whether a number is real; this
  governs how a real number is written.
- slide-deck-rules: add Section 9 (one message per slide; assertion title +
  evidence body — the biggest "designed for a defence vs paper dumped onto
  slides" lever) and Section 10 (choose chart vs table vs KPI vs bullets to fit
  the data).
- paper-summary-author: add a field-content quality bar that lands those rules
  on concrete PaperSummary fields, closing the loop at authoring time.

Docs only; tests/test_agents_md.py passes.

Two real gaps in the slide visual-identity doc:

- Figures & charts: the exporter inserts figures as PNGs (it draws no native
  charts), so figure quality is an authoring concern. Add dark-mode adaptation
  (transparent background — a white PNG on the #12151B slide is the figure
  version of rgb=None text), chartjunk stripping, brand-palette series,
  projector-readable label sizes, print DPI, and "re-plot beats screenshot".
- Visual hierarchy & focal point: one focal element per slide, hierarchy by
  size (title > headline number > evidence > caption), whitespace, reading
  order — the visual rendering of slide-deck-rules Section 9's "one takeaway".

Docs only; tests/test_agents_md.py passes.

Two more real ppt-agent gaps:

- slide-deck-rules: add Section 11 (structural slides). The exporter renders
  cover / agenda / section-divider / Q&A / references, but only their *visual*
  accent was documented, not their job. Each structural slide has one
  navigational role: cover = title (not the raw query) + authors/year/venue;
  agenda only for multi-paper decks (pointers, not abstracts); divider = a
  cognitive reset, name+number only; Q&A = minimal, not a second conclusion;
  references = only the works actually cited, numbered, split on overflow —
  not a BibTeX dump. This is where "a paper dumped onto slides" leaks back in.
- deck-design (Figures): don't encode meaning by colour alone — teal vs navy
  is hard for colour-blind viewers and indistinguishable in greyscale; encode
  twice (colour + marker shape / line style / direct label).

Docs only; tests/test_agents_md.py passes.

The last two ppt-agent gaps with real deck-building value:

- Section 12 (Math notation rendering): Section 8 says to *gloss* a symbol;
  this says how to *render* it. The exporter flattens everything to ASCII
  ("za" not z-subscript-a). Rule: real subscripts/superscripts (python-pptx
  baseline shift), italic variables + upright operators, Unicode math symbols
  not ASCII stand-ins, complex formulae as transparent-bg LaTeX PNGs (per the
  Figures dark-mode rule), one notation per concept deck-wide.
- Section 13 (Deck length and pacing): max_slides_per_paper (25) is a
  talk-time budget (~1-1.5 min/slide -> ~20-30 min). Prune to takeaways
  rather than cramming past the per-slide caps; a multi-paper survey divides
  the budget; structural slides count but aren't content.

Docs only; tests/test_agents_md.py passes.

…k-rules §12)

The exporter flattened math to ASCII ("za", not z-subscript-a). Add inline math
rendering: authoring marks math with $...$; inside it `_x` / `_{xy}` become a
real subscript and `^x` / `^{xy}` a superscript (via the run's OOXML `baseline`
attribute, since python-pptx has no Font.subscript), single-letter tokens
(variables z / λ / I) are italicised while multi-letter operators (min / log)
stay upright. Plain `_` outside $...$ (file names, prose) is left untouched.

- _render_math_paragraph + helpers; every run sets an explicit colour
  (dark-mode contract — a None-coloured run renders black on the dark slide).
- Wired into _add_bullet_box, where math most often appears. A plain bullet
  still renders as one run exactly as before, so existing decks are unchanged.
- 6 unit tests (subscript / superscript / braced / italic-variable-vs-upright-
  operator / plain-text / bullet integration). 598 tests pass; ruff + bandit
  clean.
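
A minimal sketch of the $...$ splitting described above. It is not the PR's _render_math_paragraph, which is not shown here: split_math and its segment kinds are hypothetical names, it assumes balanced $ delimiters, and it covers only the sub/superscript split, not the italic-variable rule.

```python
import re

# Matches "_x", "_{xy}", "^x" or "^{xy}" inside a math span.
_SCRIPT = re.compile(r"([_^])(\{[^}]*\}|.)", re.DOTALL)


def split_math(text):
    """Split text into (segment, kind) pairs.

    kind is "plain" outside $...$, and "normal", "sub" or "sup" inside it.
    Hypothetical helper for illustration only.
    """
    # An odd number of "$" means the delimiters are unbalanced (for example
    # a price like "$5"), so the whole string is treated as plain text.
    if text.count("$") % 2 == 1:
        return [(text, "plain")]

    out = []
    # After splitting on "$", odd-indexed pieces lie inside a math span.
    for i, piece in enumerate(text.split("$")):
        if not piece:
            continue
        if i % 2 == 0:
            out.append((piece, "plain"))
            continue
        pos = 0
        for m in _SCRIPT.finditer(piece):
            if m.start() > pos:
                out.append((piece[pos:m.start()], "normal"))
            body = m.group(2)
            if body.startswith("{"):
                body = body[1:-1]  # drop the braces around a grouped script
            out.append((body, "sub" if m.group(1) == "_" else "sup"))
            pos = m.end()
        if pos < len(piece):
            out.append((piece[pos:], "normal"))
    return out
```

Each "sub" or "sup" segment would then become its own run, with the run's OOXML `baseline` attribute set negative (lowered) or positive (raised). A plain string such as "file_name.py" comes back as a single "plain" segment, matching the "left untouched" behaviour described above.
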
Factor _append_math_runs out of _render_math_paragraph (the non-clearing core
that appends math-aware runs to a paragraph already holding runs) and route two
more text surfaces through it:

- KPI value run: "$λ_max$=0.1" now renders a real subscript, while a plain
  "78% F1" stays one upright run (no $, so the "F" isn't italic-ised, and the
  label / value / baseline three-run structure is unchanged).
- Table cells (_style_table_cell): a comparison-table cell like "$z_a$" or
  "O($n^2$)" renders real sub/superscripts. Style-by-position (header bold,
  data _BRAND_DARK, row-label column heavier) is preserved, \n-split keeps a
  multi-line cell's paragraphs, and one helper call replaces cell.text plus a
  font loop.

3 unit tests (append preserves existing runs + renders math; plain KPI value is
one upright run; table cell subscript). 601 tests pass; ruff + bandit clean.

…viour change)

The dark-mode post-pass inlined two idioms several times each; give them a name
so the value/expression lives in one place:

- _rgb_key(rgb): the (int(rgb[0]), int(rgb[1]), int(rgb[2])) tuple used as the
  light->dark map key — was inlined 3x across _swap_fill / _swap_text_colors.
- _rgb_hex(rgb): the "%02X%02X%02X" srgbClr `val` string — was inlined in the
  cell-border drawer and the border-recolour pass.
- _DARK_BODY_TEXT: the #E5E7EB near-white dark-mode body colour, promoted from a
  per-call local in _swap_text_colors to a module constant beside _DARK_SLIDE_BG
  (single source of truth, matching the _BRAND_* / _LIGHT_TO_DARK_TEXT style).

Pure refactor — 601 tests pass unchanged (incl. the dark-mode contract,
no-invisible-runs, and no-red-text regressions that exercise these passes);
ruff + bandit clean.
…nctuation rule)

CLAUDE.md mandates ，/, over ；/; for clause-joining in the rule base: comma-
joined clauses scan faster than semicolon-stacked compounds, and mixed semicolon
use makes the rule base read unevenly. paper_rule.md carried 31 full-width ； in
its Chinese rule prose, audit checklists, and table cells; each becomes ，.

Kept on purpose: the three ； inside the "可寫:『…』" ("acceptable to write")
paper-writing SAMPLES in the no-fabrication section. Those demonstrate real
thesis prose (where ； is a legitimate academic separator), not rule text, so
changing them would misrepresent the sample. Math notation (I(za;zb)) and APA
citation grouping use half-width ; and were never in scope. No rule meaning
changes.

Other audited dimensions were already correct: no stale package name (autopapertoppt
→ thesisagents fully migrated), the PaperSummary→thesis-section mapping table
matches CLAUDE.md exactly, and the seven-section skeleton is complete.

…-defence authoring

- Flip ExportOptions.dark_mode default to False: the project deck is now
  the light navy-band style (white slides, navy header band, navy cover),
  with --dark-mode / GUI checkbox opting into the dark palette.
- Render inline $...$ math as real subscripts/superscripts in pptx.
- Add the thesis-deck-author task agent for oral-defence decks built from
  the candidate's own thesis, plus post-author-audit math/metadata passes.
- Add dark-text and overflow audit scripts with regression tests.
- Sync rule docs, CLI/GUI/MCP surfaces, and Sphinx docs to the new default.

First-use glossing made each term decodable but not the whole argument; a
reader could parse every word and still miss the point. Add an
argument-level comprehensibility rule so a non-expert (adjacent-discipline
committee member, skimming reviewer, undergraduate) can grasp what each
section/slide claims, roughly how, and why it matters.

- paper_rule: new authoritative bilingual HARD section (term-level ->
  argument-level), with intuition-before-formalism, plain per-section "so
  what", real-world number anchors, one-analogy, and a cross-department
  self-test.
- slide-deck-rules: new section 14 (slide implementation) + forward-ref
  from section 9.
- deck-design: visual-side subsection + anti-pattern bullet.
- paper-summary-author / thesis-deck-author: authoring-time bullet.
- post-author-audit: Audit 5 (judgement scan) + reporting wiring.
- CLAUDE.md: wired into the context-clear + detail-explained governance.

Additive to depth, never a dumbing-down; enforced by Audit 5.

Relevance was exact lowercase-token overlap, so "transformer" missed
"transformers", multi-word queries got no adjacency credit, acronyms
never matched their expansions, and CJK queries got no signal at all.

- conservative English stemming, min-stem >= 4 guards over-stripping
- adjacency bonus so a query's phrase outranks scattered terms
- small acronym <-> expansion synonym map (llm, rag, gnn, ...), expanding
  documents not the query so the relevance denominator stays honest
- CJK character bigrams so Chinese / Japanese / Korean queries rank too

Adds unit tests plus a golden-query regression set that pins relevance
dominance over a 180k-citation off-topic paper.
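
The four signals above can be sketched in a few lines. This is an illustration under stated assumptions, not the package's ranking code: every function name, the suffix list, the bonus weight, the Unicode ranges, and the single-entry synonym map are hypothetical, and the stemmer is far cruder than a real one.

```python
import re

_MIN_STEM = 4                      # never strip below this stem length
_SUFFIXES = ("ing", "ed", "s")     # assumed suffix list, illustration only
_WORD = re.compile(r"[a-z0-9]+")
# Hiragana/Katakana, CJK unified ideographs, Hangul syllables (assumed ranges).
_CJK = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]+")
_SYNONYMS = {"llm": ("large", "language", "model")}  # tiny stand-in map


def stem(token):
    """Strip one common suffix, but only if at least _MIN_STEM chars remain."""
    for suffix in _SUFFIXES:
        if suffix == "s" and token.endswith("ss"):
            continue  # keep words such as "class" intact
        if token.endswith(suffix) and len(token) - len(suffix) >= _MIN_STEM:
            return token[: -len(suffix)]
    return token


def tokens(text):
    """Stemmed Latin words followed by character bigrams of each CJK run."""
    text = text.lower()
    out = [stem(word) for word in _WORD.findall(text)]
    for run in _CJK.findall(text):
        if len(run) == 1:
            out.append(run)
        out.extend(run[i:i + 2] for i in range(len(run) - 1))
    return out


def expand_document(doc_tokens):
    """Add the acronym when its expansion occurs, and vice versa.

    Only the document is expanded, so the query length used as the
    denominator in score() is unaffected.
    """
    out = list(doc_tokens)
    for acronym, phrase in _SYNONYMS.items():
        n = len(phrase)
        has_phrase = any(
            tuple(doc_tokens[i:i + n]) == phrase
            for i in range(len(doc_tokens) - n + 1)
        )
        if has_phrase and acronym not in doc_tokens:
            out.append(acronym)
        elif acronym in doc_tokens and not has_phrase:
            out.extend(phrase)
    return out


def score(query, document, bonus=0.25):
    """Fraction of query tokens found, plus a bonus per adjacent query pair."""
    q = tokens(query)
    if not q:
        return 0.0
    d = expand_document(tokens(document))
    seen = set(d)
    overlap = sum(1 for t in q if t in seen) / len(q)
    doc_pairs = set(zip(d, d[1:]))
    adjacent = sum(1 for pair in zip(q, q[1:]) if pair in doc_pairs)
    return overlap + bonus * adjacent
```

With this sketch, "graph neural" scores higher against "graph neural networks" than against "neural methods on a graph", because only the first document contains the query's terms side by side.
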
Overflow and colour-contract checks lived in two scripts and the
seven-section judgement was human-only. Move both checks into the package
(exporters/overflow.py, exporters/audit.py; the scripts become thin
re-export wrappers so existing imports keep working) and add
exporters/review.py to bundle them with a paper_rule section-completeness
check that reuses the exporter's own slide classifier.

Exposed as the CLI `python -m thesisagents review <deck.pptx> [--lang] [--json]`
and the MCP `pptx_review` tool. Completeness only fails a thesis-style
deck, and references is gated only for multi-paper decks (a single-paper
rich deck folds references into the cover, an own-thesis deck omits
self-citation).
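
The contrast formula used by exporters/audit.py is not shown in this PR. As a rough illustration of what a contrast contract can check, the sketch below assumes the WCAG 2.x contrast-ratio definition; the function names are hypothetical, and the two colours are the dark-mode body text (#E5E7EB) and the dark slide background (#12151B) named in the commits above.

```python
def _linear(channel):
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) colour, in the range 0..1."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg):
    """WCAG contrast ratio between two colours, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


# Near-white body text on the dark slide background.
ratio = contrast_ratio((0xE5, 0xE7, 0xEB), (0x12, 0x15, 0x1B))
```

An audit built this way would flag any run whose ratio against its slide background falls below a chosen threshold; which threshold the package uses is not stated here.
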
The "(+N more)" over-cap marker was hard-coded English, wrong on a
zh-tw / ja deck. Add a more_items key across all 14 locales and thread the
deck language into _cap_bullets and its multi-column callers.

When an evaluation or limitations section exceeds the per-cell bullet cap,
render it full-width paginated (at most cap bullets per page) instead of
dropping the overflow behind the marker, so no author content is lost.
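
The difference between the two behaviours can be sketched as follows. Both helpers are hypothetical stand-ins (the real _cap_bullets is not shown in this PR), and whether the marker itself counts towards the cap is an assumption.

```python
def cap_with_marker(bullets, cap, template="(+{n} more)"):
    """Old behaviour: keep the first `cap` bullets, summarise the rest.

    `template` stands in for the per-locale more_items string.
    """
    if len(bullets) <= cap:
        return list(bullets)
    hidden = len(bullets) - cap
    return list(bullets[:cap]) + [template.format(n=hidden)]


def paginate_bullets(bullets, cap):
    """New behaviour: split into pages of at most `cap` bullets each."""
    if cap < 1:
        raise ValueError("cap must be at least 1")
    return [list(bullets[i:i + cap]) for i in range(0, len(bullets), cap)]
```

With five bullets and a cap of two, the first helper keeps two bullets and hides three behind the marker, while the second yields three pages and keeps all five.
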
Update the overflow / contrast subagent docs to reference the package
modules and review_deck, and add pptx_review and the review subcommand to
README, the MCP and CLI references, and the en / zh-tw / zh-cn Sphinx docs.

The light navy-band deck became the default in 36b0ed3, but several docs
still said the export dark_mode option defaults to true. Correct the
README, architecture, MCP and Sphinx references to state false (light),
with dark as opt-in.

- a pain point with more bullets than a quadrant cell now renders
  full-width paginated (like evaluation / limitations), with the
  research-question callout moved to its own lead slide, so neither a
  bullet nor the RQ is lost
- `thesisagents review -h/--help` prints usage instead of treating the
  flag as a deck path

Adds content-preservation tests for the pain-point and future-work paths.
@JE-Chen JE-Chen merged commit 152a725 into main Jun 14, 2026
13 checks passed