Reductions: partition-based updates + EHA reductions chapter by adibender · Pull Request #156 · mlsa-book/MLSA

adibender · 2026-03-08T17:54:39Z

Summary

Updated partition-based reductions chapter (P4C22) with expanded content
Added new EHA reductions chapter (P4C23) covering competing risks reduction framework
Minor edits to survival chapter (P1C4), quarto config, and bibliography

Replaces #153 with a clean history (single commit on main).

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

- Update partition-based reductions (P4C22) with expanded content
- Add EHA reductions chapter (P4C23) with competing risks framework, PAM and RSF examples on sir.adm data
- Add CIF comparison figures for PAM vs RSF
- Update survival chapter (P1C4) with minor edits
- Update quarto config and bibliography

github-actions · 2026-03-08T17:59:33Z

Book 📖

P4C19 (Reductions intro):
- Hyphenation fixes: one-dimensional, event-history
- Grammar fixes: sentence fragment, data are, dataset, only a few, computational cost

P4C20 (IPCW Classification):
- Chapter title: IPC Weighted → IPCW Classification
- Hyphenation: time point, pointwise, well-calibrated, rank-based, gradient-based
- Capitalise Bayesian; \mathbb{I} → \II
- Loss notation: \mathcal{l} → L (aggregate loss convention); add missing negation to log loss (both pointwise and aggregate forms)
- Fix logical description of censored observations (line 36)

P4C21 (Pseudo-value Regression):
- New section: Choice of Response Function (§17.2), covering post-hoc vs. integrated application of h(·) and interpretation of covariate effects
- Table 17.1: common link/response functions with covariate interpretations; notation refined to h⁻¹(ψ) / h(f(x)) with ψ = S(τ|x) in caption
- Section moved before examples to avoid forward references
- Removed stale forward-reference note and inline RMST explanation (now refs §17.2)
- Fix \mathbb{I} → \II (4 occurrences); \xx_i^⊤ notation; h(f(x)) → h(f(xx))
- Correct @eq-cox-ph → @eq-ph; Gamma/Poisson → log link with literature support
- data set → dataset; various grammar/hyphenation fixes
- Further reading expanded: add foundational papers (Andersen 2003, 2004), eventglm (Sachs 2022), random forests (Mogensen 2013), deep learning (Zhao 2020)

P5C24 (Conclusions):
- Update Langbein2024 → Langbein2025 (published journal version)

library.bib:
- Add Sachs2022 (eventglm, JSS), Hothorn2021, royston2011use, tian2014predicting
- Remove duplicate Andersen2004 entry (andersen2004regressionanalysis already present)

- Title hyphenation: Partition-Based Reductions
- Abstract written (was TODO)
- Conclusion section added: Key Takeaways, Limitations, Further Reading callouts
- Numerous typo fixes: disjunct→disjoint, occured→occurred, classifcation→classification, inerpreted→interpreted, transfomration→transformation, akward→awkward, accross→across
- Consistent hyphenation throughout: discrete-time, partition-based, tree-based, risk-set-based, continuous-time
- data set → dataset throughout
- \mathbb{I} → \II; \mathbb{E} → E
- Citation style fixes: (@...) → [@...]; remove Figure/Table/Section prefix before @-refs
- Stray punctuation fixes (extra comma, stray parenthesis)
- less rows → fewer rows
- Sentence fragment fixed (garbled sentence in left-closed vs left-open section)
- Limitation added: reductions apply to right-censored data only; left- and interval-censored data require additional adaptations

adibender · 2026-03-11T13:08:55Z

Summary of changes (beyond #147)

P4C19 – Reductions intro

Minor grammar and hyphenation fixes (one-dimensional, event-history, dataset, data are)

P4C20 – IPCW Classification

Chapter title corrected: IPC Weighted → IPCW Classification
Hyphenation and capitalisation fixes (pointwise, well-calibrated, rank-based, gradient-based, Bayesian)
\mathbb{I} → \II throughout
Loss notation made consistent with rest of book: \mathcal{l} → L for aggregate loss; missing negation added to log loss (both pointwise and aggregate forms)
Logical fix: description of censored observations clarified (neither event nor confirmed non-event)
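
For readers of this thread, the effect of the missing negation can be illustrated with a minimal sketch. It assumes a generic binary log loss; the `weights` argument is a hypothetical stand-in for IPC weights, and normalising by the weight sum is one choice among several, not necessarily the form used in the chapter.

```python
import math

def log_loss_pointwise(y, p, eps=1e-12):
    """Binary log loss for one observation; y in {0, 1}, p = predicted P(y = 1)."""
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    # The leading minus sign makes the loss non-negative, so lower is better.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def log_loss_aggregate(ys, ps, weights=None):
    """Weighted mean of pointwise losses; uniform weights if none are given.
    `weights` is a hypothetical placeholder for IPC weights (assumption)."""
    if weights is None:
        weights = [1.0] * len(ys)
    total = sum(w * log_loss_pointwise(y, p) for y, p, w in zip(ys, ps, weights))
    return total / sum(weights)

ys = [1, 0, 1]
loss_good = log_loss_aggregate(ys, [0.9, 0.1, 0.8])  # confident and correct
loss_bad = log_loss_aggregate(ys, [0.2, 0.7, 0.3])   # confident and wrong
```

Without the minus sign both values would be negative and the better model would have the larger value, which contradicts the convention that a loss is minimised.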

P4C21 – Pseudo-value Regression

New section added: Choice of Response Function (§17.1), covering:
- Post-hoc vs. integrated application of $h(\cdot)$
- Table of common link functions with covariate interpretations (identity, logit, cloglog, log)
- cloglog link as Cox PH special case and PH diagnostic
- Identity vs. log link for RMST targets (with literature support; unsupported Gamma deviance claim removed)
Section placed before the worked examples to avoid forward references
Table notation refined: $h^{-1}(\psi)$ / $h(f(\mathbf{x}))$ with $\psi = S(\tau|\mathbf{x})$ in caption
\mathbb{I} → \II; @eq-cox-ph → @eq-ph; notation fixes ($\mathbf{x}_i^\top$, $h(f(\mathbf{x}))$)
Further reading expanded: foundational papers (Andersen 2003, 2004), eventglm (Sachs 2022), random forests (Mogensen 2013)
New bib entries: Sachs2022, Hothorn2021, royston2011use, tian2014predicting, Andersen2004 (deduplicated to existing key)

P4C22 – Partition-Based Reductions

Abstract written (was TODO)
Conclusion section added: Key Takeaways, Limitations, Further Reading callouts
New limitation: reductions apply to right-censored data only; left- and interval-censored data require additional adaptations
Consistent hyphenation: discrete-time, partition-based, continuous-time, risk-set-based, tree-based
\mathbb{I} → \II; \mathbb{E} → E; citation style (@...) → [@...]
Removed "Figure/Table/Section" prefixes before @-refs (would double-render)
Typo fixes: disjunct→disjoint, occurred, classification, interpreted, transformation, awkward
data set → dataset throughout; less rows → fewer rows

P5C24 – Conclusions

Citation key updated: Langbein2024 → Langbein2025 (published journal version)

adibender · 2026-03-11T13:12:49Z

@RaphaelS1 Integrated most of your previous suggestions from #147. Added a few things, particularly in the partition-based reductions chapter. I think 15-18 are very good now. 19 needs more work, but will do later.

RaphaelS1 · 2026-03-13T11:21:17Z

-First, consider a linear model without features, that is $\hat{S}(\tau|\xx_i) = \hat{\beta}_{\tau,0}$. 
-By construction of the pseudo-values, at time point $\tau = 1000$ days we have $\hat{\tilde{\theta}}(1000) = \hat{\beta}_{1000,0} = \frac{1}{n} \sum_{i=1}^n \tilde{\theta}_i(1000) = 0.6175 = \hat{S}_{KM}(1000)$. 
-Of course it doesn't really make sense to estimate $n+1$ Kaplan-Meier curves just to obtain the overall Kaplan-Meier estimate at one time-point, but this example illustrates that pseudo-value based regression provides consistent estimators of the survival probability.
+To guarantee pseudo-values are within the required range, one can then apply the sigmoid function (see @tbl-pseudo-link-interpretation)


The table doesn't state that sigmoid and logit are inverses of each other, and doesn't make clear which column is the logit and which is the sigmoid.
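
The inverse relationship the comment asks for can be checked numerically. The sketch below assumes the notation quoted earlier in this thread, where $h^{-1}(\psi)$ is the link applied to a probability (here the logit) and $h(f(\mathbf{x}))$ is the response function applied to a model output (here the sigmoid); the function names are illustrative and not taken from the book.

```python
import math

def logit(p):
    """Link: maps a probability in (0, 1) to the real line."""
    return math.log(p / (1 - p))

def sigmoid(eta):
    """Response function: maps a real-valued prediction back into (0, 1)."""
    return 1 / (1 + math.exp(-eta))

# Round trip: sigmoid undoes logit (0.6175 is the KM estimate quoted in the hunk).
for p in (0.05, 0.6175, 0.99):
    assert abs(sigmoid(logit(p)) - p) < 1e-12

# Unbounded model outputs are mapped into the valid range for survival probabilities.
raw_predictions = [-3.0, 0.0, 4.2]
survival_probs = [sigmoid(eta) for eta in raw_predictions]
```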

RaphaelS1 · 2026-03-13T12:11:29Z

+**Left-truncation.**
+In the competing risks setting, all subjects usually start in the initial state at time $0$.
+In a multi-state process, subjects enter the risk set for a transition $\ell \to e$ at the time they enter state $\ell$, which can differ across subjects.
+As discussed in @sec-multi-state for the `prothr` data, a subject may only enter an intermediate state (and thus become at risk for transitions from that state) at some later time point.
+This constitutes *internal* or *process-induced* left-truncation and must be handled appropriately by the hazard estimation method.


This needs to be before the pseudo-algorithm above because currently it just throws in left truncation without explanation
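
As an illustration of why delayed entry matters for the risk set, here is a minimal sketch. The subjects and times are made up (not from `prothr`), and the half-open convention (entry, exit] is an assumption.

```python
def risk_set(subjects, t):
    """Ids of subjects at risk at time t: entered the state before t and not yet left it."""
    return [s["id"] for s in subjects if s["entry"] < t <= s["exit"]]

# Hypothetical subjects; "entry" is the time the subject enters the state of interest.
subjects = [
    {"id": "A", "entry": 0.0, "exit": 5.0},  # at risk from the start
    {"id": "B", "entry": 2.0, "exit": 8.0},  # left-truncated at 2
    {"id": "C", "entry": 4.0, "exit": 6.0},  # left-truncated at 4
]

at_risk_t1 = risk_set(subjects, 1.0)
at_risk_t5 = risk_set(subjects, 5.0)
at_risk_t7 = risk_set(subjects, 7.0)
```

Ignoring the entry times would count B and C as at risk at time 1, inflating the denominator of any hazard estimate for that transition.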

RaphaelS1 · 2026-04-10T10:28:32Z

 * @Friedman1982 introduced the piecewise exponential model and established its consistency properties.
 * @Tutz2016 provide a comprehensive treatment of discrete-time survival analysis, covering model specification, estimation, and interpretation in depth.
-* @Bender2018 and @Kopper2022 show how penalized additive models (PAMs) can be used as the base learner in the PEM framework, with the `pammtools` package [@pkgpammtools] providing a convenient implementation.
+* @Bender2018PAM and @Kopper2022 show how penalized additive models (PAMs) can be used as the base learner in the PEM framework, with the `pammtools` package [@pkgpammtools] providing a convenient implementation.


@adibender can you check I updated to the correct citation here

Walkthrough fixes for the 8 review-comment TODOs Raphael left in book/P4C22_partition-based-reduction.qmd:

* (Data Transformation) drop meta TODO comment, fix "effect"-> "affect".
* (Discrete-time likelihood) rewrite the case-by-case justification of the last equality so it links to the cases on the previous line instead of using the ambiguous "first/second bracket" phrasing.
* (Discrete-time summary) fix the interpretation -- a classifier predicts the class probability \hat\pi_{ij}, not a label \hat\delta_{ij}; \hat\pi_{ij} estimates the conditional discrete-time hazard.
* (Discrete-time logistic example) drop the implementation-detail parenthetical about reference-coded interval index and fix a stray comma (resolved manually by user).
* (Discrete-time logistic example) add an explicit bridge showing \hat h_{tilde Y}(j|x) = sigmoid(linear predictor) via @eq-discrete-hazard-probability before introducing \hat S.
* (PEM intro) rewrite to distinguish PEM from the discrete-time approach up front: continuous-time step-function hazard, exact within-interval event times preserved via offset, continuous-time hazard estimates.
* (PEM survival formula) show the integral S = exp(-int h du) first, then collapse to a sum because h is piecewise constant.
* (PEM) add new subsection "When to use PEM" with pros/cons vs discrete-time / survival stacking; remove the orphan TODO comment.

Also tightens an ill-formed set-builder expression in the survival stacking section: \mathcal{A} = {t_{(1)}, ..., t_{(m+1)}: ...} -> chain of strict inequalities, matching the convention in @eq-cut-points.

No prose semantics changed elsewhere; all cross-references resolve.
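
The two survival formulas mentioned in this commit message can be sketched side by side. This is an illustration under assumed conventions (intervals of the form (cuts[j], cuts[j+1]], made-up hazard values); the function names are hypothetical and the code is not taken from the book.

```python
import math

def discrete_survival(hazards):
    """Discrete time: S(j) = prod_{k <= j} (1 - h_k), with h_k in [0, 1]."""
    surv, out = 1.0, []
    for h in hazards:
        surv *= 1.0 - h
        out.append(surv)
    return out

def pem_survival(t, cuts, hazards):
    """PEM: S(t) = exp(-int_0^t h(u) du). Because h is constant on each
    interval, the integral collapses to a sum of hazard * time spent."""
    cum_hazard = 0.0
    for j, h in enumerate(hazards):
        lo, hi = cuts[j], cuts[j + 1]
        if t <= lo:
            break  # t lies before this interval; nothing more to add
        cum_hazard += h * (min(t, hi) - lo)
    return math.exp(-cum_hazard)

cuts = [0.0, 1.0, 2.0, 4.0]      # assumed cut points
hazards = [0.10, 0.20, 0.05]     # assumed piecewise-constant hazards

s_mid = pem_survival(1.5, cuts, hazards)  # integral = 0.10 * 1 + 0.20 * 0.5
s_end = pem_survival(4.0, cuts, hazards)  # integral = 0.10 + 0.20 + 0.05 * 2
```

Unlike the discrete-time product, the PEM version can be evaluated at any time point inside an interval, which is the distinction the rewritten PEM intro draws.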

* Restructure CR section into Separate datasets / Stacked dataset / Separate vs. stacked / Application to sir.adm.
* Add MS section: transition-specific pipeline + stacked dataset subsections, with new prothr application using a two-stage reduction (multi-state -> transition-specific LT single-event -> Poisson PED) fit via XGBoost and compared against the AJ baseline split by treatment.
* Replace older Python infographic generators with drawio HTML+SVG sources for cr-reduction-pipeline, cr-reduction-stacked, ms-reduction-pipeline, and ms-reduction-stacked figures.
* Regenerate cif-marg-sir.png; render tp-prothr-cmp.png.
* Add Limitations callout (separate-data bookkeeping overhead, specialised learner trade-offs e.g. native CR RSF).
* P1C5: prothr state-occupation discussion + companion R scripts and figures, currently parked in _not_used.qmd.
* library.bib: new citations (niessl2023, putter2018).
* code.R: prothr MS reduction comparison block (AJ + XGBoost-Poisson via pammtools as_ped + xgb.train with offset base_margin trick).
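
The Poisson PED step named in this commit message can be illustrated with a small sketch of the data transformation. `to_ped` is a hypothetical helper written for this note, not the pammtools `as_ped` API, and the cut points are made up. The commit message describes passing the log-exposure offset to XGBoost via `base_margin`; that step is not shown here.

```python
import math

def to_ped(time, status, cuts):
    """Split one subject into piecewise-exponential rows (hypothetical helper).
    Each row holds the interval index, the event indicator for that interval,
    and the offset log(time at risk in the interval)."""
    rows = []
    for j in range(len(cuts) - 1):
        lo, hi = cuts[j], cuts[j + 1]
        if time <= lo:
            break  # subject left the risk set before this interval
        exposure = min(time, hi) - lo
        event = int(status == 1 and time <= hi)  # event only in the final interval
        rows.append({"interval": j, "ped_status": event, "offset": math.log(exposure)})
    return rows

cuts = [0.0, 1.0, 2.0, 4.0]
rows = to_ped(time=2.5, status=1, cuts=cuts)     # event in the third interval
censored = to_ped(time=1.0, status=0, cuts=cuts) # censored at the first cut point
```

A Poisson learner fitted to `ped_status` with `offset` held fixed then estimates the log-hazard per interval. In this sketch, subjects whose time exceeds the last cut point are treated as censored there.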

RaphaelS1 and others added 15 commits January 30, 2026 08:55

reductions comments

c2beb5d

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Push pdf

59d43ae

c15 comments

9ff954e

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Push pdf

4a39629

add abstract

871b634

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Merge branch 'redux-comments' of https://github.com/mlsa-book/MLSA into redux-comments

9b05959

Push pdf

523ecb3

ipcw comments

07c7538

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Merge branch 'redux-comments' of https://github.com/mlsa-book/MLSA into redux-comments

0f3ef9f

Push pdf

40bfa7b

finish review

216cbc1

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Merge branch 'redux-comments' of https://github.com/mlsa-book/MLSA into redux-comments

c04dfa4

typo

3b7228f

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Push pdf

01fdf49

adibender mentioned this pull request Mar 8, 2026

Reductions #153

Closed

Push pdf

89322f2

adibender added 3 commits March 11, 2026 10:42

Update book/P4C21_pseudo.qmd

9125fbd

Merge redux-comments (PR #147) into reductions-clean

dae90a7

adibender mentioned this pull request Mar 11, 2026

reductions comments #147

Closed

adibender requested a review from RaphaelS1 March 11, 2026 13:11

RaphaelS1 and others added 2 commits March 13, 2026 08:57

Merge branch 'main' into reductions-clean

52c349c

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Push pdf

0cd6815

RaphaelS1 reviewed Mar 13, 2026

RaphaelS1 reviewed Apr 10, 2026

RaphaelS1 and others added 28 commits April 10, 2026 10:30

Push pdf

8858612

Update book/_quarto.yml

da9477f

Co-authored-by: Raphael Sonabend-Friend <raphaelsonabend@gmail.com>

Update book/P1C4_survival.qmd

c4e857f

Co-authored-by: Raphael Sonabend-Friend <raphaelsonabend@gmail.com>

Update book/P4C19_reductions.qmd

56b3086

Co-authored-by: Raphael Sonabend-Friend <raphaelsonabend@gmail.com>

Update book/P4C20_ipcw.qmd

f21ab4d

Co-authored-by: Raphael Sonabend-Friend <raphaelsonabend@gmail.com>

Update book/P4C21_pseudo.qmd

3615b4a

Co-authored-by: Raphael Sonabend-Friend <raphaelsonabend@gmail.com>

Update book/P4C20_ipcw.qmd

4179884

Co-authored-by: Raphael Sonabend-Friend <raphaelsonabend@gmail.com>

Update book/P4C21_pseudo.qmd

0818f71

Co-authored-by: Raphael Sonabend-Friend <raphaelsonabend@gmail.com>

Address PR #156 inline review comments (P4C20, P4C21, P4C22)

9306286

P4C22: address Category B review comments

6d4497b

P4C23: rewrite with sir.adm worked example (item 25)

ae54c9b

Merge branch 'main' into reductions-clean

2956336

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

ref table

6d1524b

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Address PR #156 inline review comments (P4C21, P4C22)

e4f2016

Merge branch 'main' into reductions-clean

8ecea46

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Merge branch 'reductions-clean' of https://github.com/mlsa-book/MLSA into reductions-clean

771c7d0

minor changes, mostly fixme and todo

be78d1d

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Merge branch 'main' into reductions-clean

91b53db

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Merge branch 'main' into reductions-clean

1720009

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Push pdf

c20214c

Merge remote-tracking branch 'origin/main' into reductions-clean

399c435

# Conflicts: # book/P5C24_conclusions.qmd # book/_book/Machine-Learning-in-Survival-Analysis.pdf # book/_not_used.qmd

Push pdf

94925db

Push pdf

9e6b9c4

Merge branch 'main' into reductions-clean

3cfddbb

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

Push pdf

66e34b6

typos, remove abbreviations, add fixmes

33e1499

Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>

RaphaelS1 merged commit 2fc0ad1 into main May 17, 2026
1 check passed


2 participants
