Deck review tooling, stronger search ranking, light-deck default, and math rendering by JE-Chen · Pull Request #18 · Integration-Automation/ThesisAgent

JE-Chen · 2026-06-14T17:21:40Z

Merges the accumulated dev work into main. Headline changes, grouped by area.

Search

Relevance ranking now does conservative English stemming, a phrase-adjacency bonus, a small acronym/expansion synonym map, and CJK bigrams. "transformer" matches "transformers", multi-word queries reward adjacency, "llm" matches "large language model", and Chinese / Japanese / Korean queries get a real relevance signal (previously none). A golden-query regression set pins relevance dominance over high-citation off-topic papers.
Automatic paper-research workflow improvements.

Deck review (new)

review_deck bundles three audits into one call — slide overflow, the dark-mode / no-red / contrast colour contracts, and paper_rule seven-section completeness. Exposed as the CLI python -m thesisagents review <deck.pptx> [--lang] [--json] and the MCP pptx_review tool.
The overflow and contrast checks moved into the package (exporters/overflow.py, exporters/audit.py). The old scripts/ entry points are now thin re-export wrappers, so existing imports keep working.
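A minimal sketch of what a contrast check of this kind could compute. It assumes the WCAG 2.x contrast-ratio formula, which the PR does not confirm as the one used in exporters/audit.py. The two colours are the dark slide background (#12151B) and dark body text (#E5E7EB) named in the commit messages below; the function names are hypothetical.

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Weighted sum of the linearised R, G, B channels."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """WCAG contrast ratio, always >= 1.0 regardless of argument order."""
    lum_a, lum_b = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


DARK_SLIDE_BG = (0x12, 0x15, 0x1B)   # #12151B, from the commit messages
DARK_BODY_TEXT = (0xE5, 0xE7, 0xEB)  # #E5E7EB, from the commit messages
BLACK = (0, 0, 0)                    # what an uncoloured run falls back to

if __name__ == "__main__":
    print(round(contrast_ratio(DARK_BODY_TEXT, DARK_SLIDE_BG), 2))
    print(round(contrast_ratio(BLACK, DARK_SLIDE_BG), 2))
```

Under this formula the near-white body text clears the common 4.5:1 threshold on the dark background by a wide margin, while black text on the same background falls far below it, which is the failure the "no invisible runs" contract guards against.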

Deck generation

The light navy-band deck is now the default. Dark mode is opt-in via --dark-mode / dark_mode=True.
Inline math renders as real subscripts / superscripts, including in KPI values and table cells.
The over-cap "(+N more)" truncation marker is localised across all 14 locales. Evaluation, limitations, and pain-point sections that exceed the per-cell bullet cap now paginate full-width instead of dropping content.
Thesis-defence deck authoring support.

Rules and docs

Expanded the deck-design / slide-deck / paper-writing subagent rules (plain-language comprehensibility, math rendering, deck length, figure design, structural slides, colour accessibility).
Refreshed the README / MCP / CLI / Sphinx docs, including fixing the stale dark_mode default (it is off, not on).

Quality

Resolved SonarCloud findings, reduced test duplication, and added regression tests across ranking, review, overflow, contrast, i18n, and content preservation.

Verification

pytest (639 passing), ruff check ., and bandit -c pyproject.toml -r thesisagents are all clean locally.

Note: merging to main triggers the release workflow.

Fill real, high-value gaps in the authoring rule docs (each with the project's required Why + example + anti-pattern format):
- paper_rule: add "Verb tense", the per-section tense map (abstract / related-work / method / experiment / conclusion / future-work). A wrong tense is a top "this wasn't written by a researcher" tell, and it was only covered by one Abstract sentence before.
- paper_rule: add "Reporting numbers and statistics": significant figures, percentage-point vs relative %, p-value format, uncertainty, unit consistency. The no-fabrication rule governed whether a number is real; this governs how a real number is written.
- slide-deck-rules: add Section 9 (one message per slide; assertion title + evidence body, the biggest "designed for a defence vs paper dumped onto slides" lever) and Section 10 (choose chart vs table vs KPI vs bullets to fit the data).
- paper-summary-author: add a field-content quality bar that lands those rules on concrete PaperSummary fields, closing the loop at authoring time.
Docs only; tests/test_agents_md.py passes.

Two real gaps in the slide visual-identity doc:
- Figures & charts: the exporter inserts figures as PNGs (it draws no native charts), so figure quality is an authoring concern. Add dark-mode adaptation (transparent background, since a white PNG on the #12151B slide is the figure version of rgb=None text), chartjunk stripping, brand-palette series, projector-readable label sizes, print DPI, and "re-plot beats screenshot".
- Visual hierarchy & focal point: one focal element per slide, hierarchy by size (title > headline number > evidence > caption), whitespace, reading order. This is the visual rendering of slide-deck-rules Section 9's "one takeaway".
Docs only; tests/test_agents_md.py passes.

Two more real ppt-agent gaps:
- slide-deck-rules: add Section 11 (structural slides). The exporter renders cover / agenda / section-divider / Q&A / references, but only their *visual* accent was documented, not their job. Each structural slide has one navigational role: cover = title (not the raw query) + authors/year/venue; agenda only for multi-paper decks (pointers, not abstracts); divider = a cognitive reset, name+number only; Q&A = minimal, not a second conclusion; references = only the works actually cited, numbered, split on overflow, not a BibTeX dump. This is where "a paper dumped onto slides" leaks back in.
- deck-design (Figures): don't encode meaning by colour alone. Teal vs navy is hard for colour-blind viewers and indistinguishable in greyscale; encode twice (colour + marker shape / line style / direct label).
Docs only; tests/test_agents_md.py passes.

The last two ppt-agent gaps with real deck-building value:
- Section 12 (Math notation rendering): Section 8 says to *gloss* a symbol; this says how to *render* it. The exporter flattens everything to ASCII ("za" not z-subscript-a). Rule: real subscripts/superscripts (python-pptx baseline shift), italic variables + upright operators, Unicode math symbols not ASCII stand-ins, complex formulae as transparent-bg LaTeX PNGs (per the Figures dark-mode rule), one notation per concept deck-wide.
- Section 13 (Deck length and pacing): max_slides_per_paper (25) is a talk-time budget (~1-1.5 min/slide -> ~20-30 min). Prune to takeaways rather than cramming past the per-slide caps; a multi-paper survey divides the budget; structural slides count but aren't content.
Docs only; tests/test_agents_md.py passes.

…k-rules §12)
The exporter flattened math to ASCII ("za", not z-subscript-a). Add inline math rendering: authoring marks math with $...$; inside it `_x` / `_{xy}` become a real subscript and `^x` / `^{xy}` a superscript (via the run's OOXML `baseline` attribute, since python-pptx has no Font.subscript), single-letter tokens (variables z / λ / I) are italicised while multi-letter operators (min / log) stay upright. Plain `_` outside $...$ (file names, prose) is left untouched.
- _render_math_paragraph + helpers; every run sets an explicit colour (dark-mode contract: a None-coloured run renders black on the dark slide).
- Wired into _add_bullet_box, where math most often appears. A plain bullet still renders as one run exactly as before, so existing decks are unchanged.
- 6 unit tests (subscript / superscript / braced / italic-variable-vs-upright-operator / plain-text / bullet integration).
598 tests pass; ruff + bandit clean.
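The splitting step described above can be sketched without python-pptx. This is an illustration only, not the project's `_render_math_paragraph`: the `Run` tuple and `split_math` are hypothetical names, and it assumes an unbraced `_` or `^` covers exactly one following character (LaTeX style), which the commit message does not spell out.

```python
import re
from typing import NamedTuple


class Run(NamedTuple):
    text: str
    shift: str    # "sub", "super", or "" for the normal baseline
    italic: bool


_MATH_SPAN = re.compile(r"\$([^$]+)\$")
# script marker + braced or single-char body | run of letters | any other char
_MATH_TOKEN = re.compile(r"([_^])(?:\{([^}]*)\}|(.))|([^\W\d_]+)|(.)", re.S)


def _is_variable(text: str) -> bool:
    """Single letters (z, λ, I) are variables; longer words are operators."""
    return len(text) == 1 and text.isalpha()


def _math_runs(expr: str) -> list[Run]:
    runs = []
    for match in _MATH_TOKEN.finditer(expr):
        marker, braced, single, word, other = match.groups()
        if marker:
            body = braced if braced is not None else single
            if body:
                shift = "sub" if marker == "_" else "super"
                runs.append(Run(body, shift, _is_variable(body)))
        elif word:
            runs.append(Run(word, "", _is_variable(word)))
        else:
            runs.append(Run(other, "", False))
    return runs


def split_math(text: str) -> list[Run]:
    """Split text into runs; only $...$ spans get math treatment."""
    runs, pos = [], 0
    for match in _MATH_SPAN.finditer(text):
        if match.start() > pos:
            runs.append(Run(text[pos:match.start()], "", False))
        runs.extend(_math_runs(match.group(1)))
        pos = match.end()
    if pos < len(text):
        runs.append(Run(text[pos:], "", False))
    return runs


if __name__ == "__main__":
    for sample in ("O($n^2$)", "$z_a$", "file_name.txt"):
        print(sample, split_math(sample))
```

Each `Run` would then map to one text run on the slide, with the shift applied through the run's `baseline` attribute as the commit message describes. Text without `$` comes back as a single plain run, which matches the stated guarantee that existing decks are unchanged.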

Factor _append_math_runs out of _render_math_paragraph (the non-clearing core that appends math-aware runs to a paragraph already holding runs) and route two more text surfaces through it:
- KPI value run: "$λ_max$=0.1" now renders a real subscript, while a plain "78% F1" stays one upright run (no $, so the "F" isn't italicised, and the label / value / baseline three-run structure is unchanged).
- Table cells (_style_table_cell): a comparison-table cell like "$z_a$" or "O($n^2$)" renders real sub/superscripts. Style-by-position (header bold, data _BRAND_DARK, row-label column heavier) is preserved, \n-split keeps a multi-line cell's paragraphs, and one helper call replaces cell.text plus a font loop.
3 unit tests (append preserves existing runs + renders math; plain KPI value is one upright run; table cell subscript). 601 tests pass; ruff + bandit clean.

…viour change)
The dark-mode post-pass inlined two idioms several times each; give them a name so the value/expression lives in one place:
- _rgb_key(rgb): the (int(rgb[0]), int(rgb[1]), int(rgb[2])) tuple used as the light->dark map key; it was inlined 3x across _swap_fill / _swap_text_colors.
- _rgb_hex(rgb): the "%02X%02X%02X" srgbClr `val` string; it was inlined in the cell-border drawer and the border-recolour pass.
- _DARK_BODY_TEXT: the #E5E7EB near-white dark-mode body colour, promoted from a per-call local in _swap_text_colors to a module constant beside _DARK_SLIDE_BG (single source of truth, matching the _BRAND_* / _LIGHT_TO_DARK_TEXT style).
Pure refactor: 601 tests pass unchanged (incl. the dark-mode contract, no-invisible-runs, and no-red-text regressions that exercise these passes); ruff + bandit clean.
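The two named idioms are small enough to sketch directly from the expressions quoted above. The public names below stand in for the private helpers, and the `to_dark` lookup with its single map entry is a hypothetical illustration, not the project's actual light-to-dark table.

```python
def rgb_key(rgb) -> tuple[int, int, int]:
    """Normalise any indexable 3-channel colour to a hashable int tuple."""
    return (int(rgb[0]), int(rgb[1]), int(rgb[2]))


def rgb_hex(rgb) -> str:
    """Format a colour as the six-digit uppercase hex used for srgbClr val."""
    return "%02X%02X%02X" % rgb_key(rgb)


DARK_BODY_TEXT = (0xE5, 0xE7, 0xEB)  # #E5E7EB, from the commit message

# Hypothetical single-entry map; the real table is not shown in the PR.
LIGHT_TO_DARK_TEXT = {(0x00, 0x00, 0x00): DARK_BODY_TEXT}


def to_dark(rgb) -> tuple[int, int, int]:
    """Look up the dark-mode replacement, leaving unmapped colours as-is."""
    key = rgb_key(rgb)
    return LIGHT_TO_DARK_TEXT.get(key, key)


if __name__ == "__main__":
    print(rgb_hex(DARK_BODY_TEXT))
    print(to_dark([0, 0, 0]))
```

Routing every lookup through one key function means a list, a tuple, or any other indexable colour value hashes to the same map entry.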

…nctuation rule)
CLAUDE.md mandates ，/, over ；/; for clause-joining in the rule base: comma-joined clauses scan faster than semicolon-stacked compounds, and mixed semicolon use makes the rule base read unevenly. paper_rule.md carried 31 full-width ； in its Chinese rule prose, audit checklists, and table cells; each becomes ，.
Kept on purpose: the three ； inside the "可寫：『…』" paper-writing SAMPLES in the no-fabrication section. Those demonstrate real thesis prose (where ； is a legitimate academic separator), not rule text, so changing them would misrepresent the sample. Math notation (I(za;zb)) and APA citation grouping use half-width ; and were never in scope. No rule meaning changes.
Other audited dimensions were already correct: no stale package name (autopapertoppt → thesisagents fully migrated), the PaperSummary→thesis-section mapping table matches CLAUDE.md exactly, and the seven-section skeleton is complete.

…-defence authoring
- Flip ExportOptions.dark_mode default to False: the project deck is now the light navy-band style (white slides, navy header band, navy cover), with --dark-mode / GUI checkbox opting into the dark palette.
- Render inline $...$ math as real subscripts/superscripts in pptx.
- Add the thesis-deck-author task agent for oral-defence decks built from the candidate's own thesis, plus post-author-audit math/metadata passes.
- Add dark-text and overflow audit scripts with regression tests.
- Sync rule docs, CLI/GUI/MCP surfaces, and Sphinx docs to the new default.

First-use glossing made each term decodable but not the whole argument; a reader could parse every word and still miss the point. Add an argument-level comprehensibility rule so a non-expert (adjacent-discipline committee member, skimming reviewer, undergraduate) can grasp what each section/slide claims, roughly how, and why it matters.
- paper_rule: new authoritative bilingual HARD section (term-level -> argument-level), with intuition-before-formalism, plain per-section "so what", real-world number anchors, one-analogy, and a cross-department self-test.
- slide-deck-rules: new section 14 (slide implementation) + forward-ref from section 9.
- deck-design: visual-side subsection + anti-pattern bullet.
- paper-summary-author / thesis-deck-author: authoring-time bullet.
- post-author-audit: Audit 5 (judgement scan) + reporting wiring.
- CLAUDE.md: wired into the context-clear + detail-explained governance.
Additive to depth, never a dumbing-down; enforced by Audit 5.

Relevance was exact lowercase-token overlap, so "transformer" missed "transformers", multi-word queries got no adjacency credit, acronyms never matched their expansions, and CJK queries got no signal at all.
- conservative English stemming; a minimum stem length of 4 guards against over-stripping
- adjacency bonus so a query's phrase outranks scattered terms
- small acronym <-> expansion synonym map (llm, rag, gnn, ...), expanding documents rather than the query so the relevance denominator stays honest
- CJK character bigrams so Chinese / Japanese / Korean queries rank too
Adds unit tests plus a golden-query regression set that pins relevance dominance over a 180k-citation off-topic paper.
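A toy scorer combining the four signals listed above might look like the following. It is a sketch under stated assumptions, not the project's ranker: the suffix list, the 0.25 adjacency weight, the one-entry synonym map, and all function names are hypothetical, and only the minimum stem length of 4 and the "expand the document, not the query" choice come from the commit message.

```python
import re

SUFFIXES = ("ing", "es", "ed", "s")          # hypothetical suffix list
MIN_STEM = 4                                  # from the commit message
ADJACENCY_WEIGHT = 0.25                       # hypothetical weight
SYNONYMS = {"llm": "large language model"}    # the real map is larger

_TOKEN = re.compile(r"[a-z0-9]+|[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]+")


def stem(token: str) -> str:
    """Strip one suffix, but never below MIN_STEM characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= MIN_STEM:
            return token[: -len(suffix)]
    return token


def tokenize(text: str) -> list[str]:
    """Stemmed Latin tokens plus overlapping character bigrams for CJK runs."""
    tokens = []
    for piece in _TOKEN.findall(text.lower()):
        if piece[0].isascii():
            tokens.append(stem(piece))
        elif len(piece) == 1:
            tokens.append(piece)
        else:
            tokens.extend(piece[i:i + 2] for i in range(len(piece) - 1))
    return tokens


def _contains_phrase(tokens: list[str], phrase: list[str]) -> bool:
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def expand_document(tokens: list[str]) -> list[str]:
    """Add acronym/expansion tokens to the document side only."""
    expanded = list(tokens)
    for acronym, phrase in SYNONYMS.items():
        phrase_tokens = tokenize(phrase)
        if acronym in tokens:
            expanded += phrase_tokens
        if _contains_phrase(tokens, phrase_tokens):
            expanded.append(acronym)
    return expanded


def relevance(query: str, document: str) -> float:
    q = tokenize(query)
    if not q:
        return 0.0
    d = expand_document(tokenize(document))
    unique = set(q)
    overlap = sum(1 for t in unique if t in set(d)) / len(unique)
    pairs = list(zip(q, q[1:]))
    if not pairs:
        return overlap
    doc_pairs = set(zip(d, d[1:]))
    adjacent = sum(1 for p in pairs if p in doc_pairs) / len(pairs)
    return overlap + ADJACENCY_WEIGHT * adjacent
```

Because the query is never expanded, the overlap denominator stays equal to the number of distinct terms the user actually typed, so a synonym hit cannot dilute the score of an exact match.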

Overflow and colour-contract checks lived in two scripts and the seven-section judgement was human-only. Move both checks into the package (exporters/overflow.py, exporters/audit.py; the scripts become thin re-export wrappers so existing imports keep working) and add exporters/review.py to bundle them with a paper_rule section-completeness check that reuses the exporter's own slide classifier. Exposed as the CLI `python -m thesisagents review <deck.pptx> [--lang] [--json]` and the MCP `pptx_review` tool. Completeness only fails a thesis-style deck, and references is gated only for multi-paper decks (a single-paper rich deck folds references into the cover, an own-thesis deck omits self-citation).
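The gating described above could be expressed roughly as follows. This is a hypothetical sketch: `ReviewReport`, `missing_sections`, and the placeholder section names are illustrative assumptions and do not reflect the actual exporters/review.py API or the real seven-section list.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewReport:
    """Findings from the three bundled audits; empty lists mean a pass."""
    overflow: list[str] = field(default_factory=list)
    colour: list[str] = field(default_factory=list)
    missing: list[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not (self.overflow or self.colour or self.missing)


def missing_sections(present, required, *, thesis_style: bool, multi_paper: bool):
    """Return required sections absent from the deck, applying both gates."""
    if not thesis_style:
        # Completeness only fails a thesis-style deck.
        return []
    missing = [name for name in required if name not in present]
    if not multi_paper:
        # References is only required for multi-paper decks.
        missing = [name for name in missing if name != "references"]
    return missing


if __name__ == "__main__":
    # Placeholder section names, not the project's real list.
    required = ("background", "method", "results", "references")
    found = {"background", "method"}
    report = ReviewReport(
        missing=missing_sections(found, required, thesis_style=True, multi_paper=False)
    )
    print(report.ok, report.missing)
```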

The "(+N more)" over-cap marker was hard-coded English, wrong on a zh-tw / ja deck. Add a more_items key across all 14 locales and thread the deck language into _cap_bullets and its multi-column callers. When an evaluation or limitations section exceeds the per-cell bullet cap, render it full-width paginated (at most cap bullets per page) instead of dropping the overflow behind the marker, so no author content is lost.
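The difference between capping and paginating can be sketched in a few lines. The function names, the non-English marker strings, and the choice to append the marker after the capped bullets are assumptions for illustration; only the English "(+N more)" form and the "at most cap bullets per page" rule come from the commit message.

```python
# Hypothetical locale table; only the English form appears in the PR.
MORE_ITEMS = {
    "en": "(+{n} more)",
    "zh-tw": "（另有 {n} 項）",
    "ja": "（他 {n} 件）",
}


def cap_bullets(bullets, cap: int, lang: str = "en") -> list[str]:
    """Keep the first `cap` bullets and append a localised overflow marker."""
    if len(bullets) <= cap:
        return list(bullets)
    template = MORE_ITEMS.get(lang, MORE_ITEMS["en"])
    return list(bullets[:cap]) + [template.format(n=len(bullets) - cap)]


def paginate_bullets(bullets, cap: int) -> list[list[str]]:
    """Split bullets into pages of at most `cap`, dropping nothing."""
    return [list(bullets[i:i + cap]) for i in range(0, len(bullets), cap)]


if __name__ == "__main__":
    items = ["a", "b", "c", "d", "e"]
    print(cap_bullets(items, 3))
    print(paginate_bullets(items, 2))
```

Capping hides the overflow behind a count, whereas paginating keeps every bullet at the cost of extra slides, which is why the change reserves pagination for the sections where author content must not be lost.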

Update the overflow / contrast subagent docs to reference the package modules and review_deck, and add pptx_review and the review subcommand to README, the MCP and CLI references, and the en / zh-tw / zh-cn Sphinx docs.

The light navy-band deck became the default in 36b0ed3, but several docs still said the export dark_mode option defaults to true. Correct the README, architecture, MCP and Sphinx references to state false (light), with dark as opt-in.

- a pain point with more bullets than a quadrant cell now renders full-width paginated (like evaluation / limitations), with the research-question callout moved to its own lead slide, so neither a bullet nor the RQ is lost
- `thesisagents review -h/--help` prints usage instead of treating the flag as a deck path
Adds content-preservation tests for the pain-point and future-work paths.

sonarqubecloud · 2026-06-14T17:22:41Z

Quality Gate passed

Issues
8 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

JE-Chen added 16 commits June 6, 2026 22:10

Point the deck-audit docs at the unified review entry point

614db13

JE-Chen merged commit 152a725 into main Jun 14, 2026
13 checks passed

JE-Chen merged 16 commits into main from dev
