igerber · May 9, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -8,6 +8,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
+- **`ChaisemartinDHaultfoeuille.by_path` + non-binary integer treatment**: `by_path=k` now accepts integer-coded discrete treatment (D in Z, e.g. ordinal `{0, 1, 2}`); path tuples become integer-state tuples like `(0, 2, 2, 2)`. The previous `NotImplementedError` gate at `chaisemartin_dhaultfoeuille.py:1870` is replaced by a fit-time `ValueError` for continuous D (e.g. `D=1.5`), per the no-silent-failures contract; the existing `int(round(float(v)))` cast in `_enumerate_treatment_paths` is therefore purely defensive (a no-op for integer-coded D). Validated against R `did_multiplegt_dyn(..., by_path)` for D in `{0, 1, 2}` via the new `multi_path_reversible_by_path_non_binary` golden-value scenario (78 switchers, 3 paths, single-baseline custom DGP, F_g >= 4): per-path point estimates match R to rtol ~1e-9 on event horizons (rtol+atol envelope for near-zero placebo values), and per-path SEs inherit the documented cross-path cohort-sharing deviation (~5% rtol observed; SE_RTOL=0.15 envelope). **Deviation from R for D >= 10:** R's `did_multiplegt_by_path` derives the per-path baseline via `path_index$baseline_XX <- substr(path_index$path, 1, 1)`, which captures only the first character of the comma-separated path string (e.g. `"1"` instead of `"12"` for `path = "12,12,..."`) and so mis-allocates R's per-path control-pool subset for D >= 10. Python matches on tuple keys and is unaffected, so its per-path point estimates remain correct in this regime while R's per-path subset for the same path is wrong. The shipped parity scenario stays in D in `{0, 1, 2}` to avoid the R bug. R-parity test at `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityByPathNonBinary`; cross-surface invariants regression-tested at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathNonBinary`.
+- **New `paths_of_interest` kwarg on `ChaisemartinDHaultfoeuille`** for user-specified treatment-path subsets, as an alternative to `by_path=k`'s automatic top-k ranking. Mutually exclusive with `by_path`; setting both raises `ValueError` at `__init__` and `set_params` time. Each path must be a list or tuple of `int` of length `L_max + 1` (uniform length across paths is validated at `__init__`; the match against `L_max + 1` is validated at fit-time); `bool` and `np.bool_` are explicitly rejected, while `np.integer` is accepted and canonicalized to Python `int` for tuple-key consistency. Duplicates emit a `UserWarning` and are deduplicated; paths not observed in the panel emit a `UserWarning` and are omitted from `path_effects`. Paths appear in `results.path_effects` in the user-specified order, modulo deduplication and unobserved-path filtering. Composes with non-binary D and all downstream `by_path` surfaces (bootstrap, per-path placebos, per-path joint sup-t bands, `controls`, `trends_linear`, `trends_nonparam`): it is a mechanical filter on observed paths via the same `_enumerate_treatment_paths` call site, with no methodology change. **Python-only API extension; no R equivalent:** R's `did_multiplegt_dyn(..., by_path=k)` only accepts a positive int (top-k) or `-1` (all paths). The `by_path` precondition gate at `chaisemartin_dhaultfoeuille.py:1118` (`drop_larger_lower` / `L_max` / `heterogeneity` / `design2` / `honest_did` / `survey_design` mutex) and the 11 `self.by_path is not None` activation branches in `fit()` were rerouted to fire under either selector. Validation + behavior + cross-feature regressions at `tests/test_chaisemartin_dhaultfoeuille.py::TestPathsOfInterest`.
 - **HAD `practitioner_next_steps()` handler + `llms-full.txt` reference section** (Phase 5). Adds `_handle_had` and `_handle_had_event_study` to `diff_diff/practitioner.py::_HANDLERS`, routing both `HeterogeneousAdoptionDiDResults` (single-period) and `HeterogeneousAdoptionDiDEventStudyResults` (event-study) through HAD-specific Baker et al. (2025) step guidance: `did_had_pretest_workflow` (step 3: paper Section 4.2 step-2 closure on the event-study path), an estimand-difference routing nudge to `ContinuousDiD` (step 4: fires when the user wants per-dose ATT(d) / ACRT(d) curves rather than HAD's WAS estimand and has never-treated controls; framed around estimand difference, NOT around the existence of untreated units, since HAD remains valid with a small never-treated share per REGISTRY § HeterogeneousAdoptionDiD edge cases and explicitly retains never-treated units on the staggered event-study path per paper Appendix B.2 / `had.py:1325`), `results.bandwidth_diagnostics` inspection on continuous designs and simultaneous (sup-t) `cband_*` reading on weighted event-study fits (step 6), per-horizon WAS event-study disaggregation (step 7), and the explicit design-auto-detection / last-cohort-only-WAS framing (step 8). Symmetric pair: `_handle_continuous` gains a Step-4 nudge to `HeterogeneousAdoptionDiD` for ContinuousDiD users on no-untreated panels (this direction is correct because ContinuousDiD's identification requires never-treated controls). Extends `_check_nan_att` with an ndarray branch via lazy `numpy` import for HAD's per-horizon `att` array; uses `np.all(np.isnan(arr))` semantics so partial-NaN arrays (legitimate event-study output under degenerate horizon-specific designs) do not over-fire the warning. Scalar path is bit-exact preserved across all 12 untouched handlers. Adds full HAD section + `HeterogeneousAdoptionDiDResults` / `HeterogeneousAdoptionDiDEventStudyResults` blocks + `## HAD Pretests` index covering all 7 pretest entry points + Choosing-an-Estimator row to `diff_diff/guides/llms-full.txt` (the bundled-in-wheel agent reference); the documented constructor + `fit()` signatures match the real `HeterogeneousAdoptionDiD.__init__` / `.fit` API exactly (verified by `inspect.signature`-based regression tests). Tightens the existing `Continuous treatment intensity` Choosing row to surface ATT(d) vs WAS as the estimand differentiator. `docs/doc-deps.yaml` updated to remove the `llms-full.txt` deferral note on `had.py` and add `llms-full.txt` entries to `had.py`, `had_pretests.py`, and `practitioner.py` blocks. Patch-level (additive on stable surfaces). 26 new tests (16 in `tests/test_practitioner.py::TestHADDispatch` + 9 in `tests/test_guides.py::TestLLMsFullHADCoverage` + 1 fixture-minimality regression locking the "handlers are STRING-ONLY at runtime" stability invariant). Closes the Phase 5 "agent surfaces" gap; T21 pretest tutorial and T22 weighted/survey tutorial remain queued as separate notebook PRs.
 
 ## [3.3.2] - 2026-04-26
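
The `D >= 10` deviation described in the `by_path` entry above comes down to how a baseline is read off a path. The following is a minimal illustrative sketch, not the library's or the R package's code: `encode_path`, `first_char_baseline`, `tuple_baseline`, and `baselines_agree` are hypothetical names, and the comma-separated encoding is assumed from the `path = "12,12,..."` example in the entry.

```python
from typing import Tuple


def encode_path(path: Tuple[int, ...]) -> str:
    """Comma-separated path string, e.g. (12, 12, 13) -> '12,12,13' (assumed encoding)."""
    return ",".join(str(d) for d in path)


def first_char_baseline(path_str: str) -> str:
    """Mimics substr(path, 1, 1): only the first character survives."""
    return path_str[:1]


def tuple_baseline(path: Tuple[int, ...]) -> int:
    """Tuple-key approach: the baseline is the first element, whatever its width."""
    return path[0]


def baselines_agree(path: Tuple[int, ...]) -> bool:
    """True when the first-character shortcut recovers the real baseline."""
    return first_char_baseline(encode_path(path)) == str(tuple_baseline(path))


# Single-digit baselines: both derivations agree.
assert baselines_agree((0, 2, 2, 2))
# Two-digit baseline: the first character is "1", but the baseline is 12.
assert not baselines_agree((12, 12, 13, 13))
```

The two derivations agree whenever the baseline state is a single decimal digit, which is why a parity scenario restricted to `{0, 1, 2}` is unaffected.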

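The `paths_of_interest` validation rules listed in the CHANGELOG entry (integer-only states, `bool` rejected, NumPy integers canonicalized, duplicates warned and dropped, uniform length, order preserved) can be sketched as below. This is a hedged illustration under stated assumptions, not the shipped validator: `canonicalize_paths` is a hypothetical helper, the error messages are invented, and the fit-time check against `L_max + 1` and the unobserved-path filtering are left out. It relies on the standard-library `numbers.Integral` ABC, which NumPy integer scalars register with, so the sketch needs no NumPy import.

```python
import numbers
import warnings
from typing import List, Sequence, Tuple


def canonicalize_paths(paths: Sequence[Sequence[int]]) -> List[Tuple[int, ...]]:
    """Hypothetical sketch of the documented paths_of_interest rules."""
    canonical: List[Tuple[int, ...]] = []
    seen = set()
    for raw in paths:
        if not isinstance(raw, (list, tuple)):
            raise ValueError(f"each path must be a list or tuple, got {type(raw).__name__}")
        clean = []
        for state in raw:
            # bool is a subclass of int, so reject it before the integer check.
            # NumPy booleans are not numbers.Integral, so they fail the next check.
            if isinstance(state, bool) or not isinstance(state, numbers.Integral):
                raise ValueError(f"treatment states must be integers, got {state!r}")
            clean.append(int(state))  # canonicalize to a plain Python int
        key = tuple(clean)
        if key in seen:
            warnings.warn(f"duplicate path {key} dropped", UserWarning)
            continue
        seen.add(key)
        canonical.append(key)  # first occurrence keeps its user-specified position
    if len({len(p) for p in canonical}) > 1:
        raise ValueError("all paths must have the same length")
    return canonical
```

Canonicalizing to plain `int` before building the tuple keeps the keys' element types uniform, in line with the tuple-key consistency the entry mentions.
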
diff --git a/benchmarks/R/generate_dcdh_dynr_test_values.R b/benchmarks/R/generate_dcdh_dynr_test_values.R
@@ -927,6 +927,97 @@ scenarios$multi_path_reversible_by_path_trends_nonparam <- list(
   results = extract_dcdh_by_path(res18, n_effects = 3, n_placebos = 1)
 )
 
+# Scenario 19: by_path + non-binary integer treatment (D in {0, 1, 2}).
+# Phase 3 Wave 3 #8 lift. Custom inline DGP (mirror Scenario 17 structure)
+# with 3 single-baseline non-binary paths: low-dose sustained
+# (0, 1, 1, 1), high-dose sustained (0, 2, 2, 2), and ramp-up
+# (0, 1, 2, 2). All F_g >= 4 (defensive: avoids any pre-window boundary
+# edge cases under future trends_lin combinations and matches Scenario 17).
+# 78 switchers + 20 never-treated (D=0) + 20 always-treated (D=2) controls.
+# n_periods=13, L_max=3.
+#
+# R's substr(path, 1, 1) baseline-derivation in did_multiplegt_by_path
+# is correct for D in {0..9} (single-digit decimal); we stay in {0, 1, 2}
+# so no R bug interferes. Python's tuple-key matching is correct
+# regardless of D range.
+cat("  Scenario 19: multi_path_reversible_by_path_non_binary\n")
+{
+  set.seed(119)
+  n_periods19 <- 13
+  L_max19 <- 3
+  target_paths19 <- list(
+    c(0L, 1L, 1L, 1L),  # path 1, low-dose sustained (rank 1)
+    c(0L, 2L, 2L, 2L),  # path 2, high-dose sustained (rank 2)
+    c(0L, 1L, 2L, 2L)   # path 3, ramp-up            (rank 3)
+  )
+  fg_path_counts19 <- list(
+    list(F_g = 4L, path_idx = 1L, count = 18L),
+    list(F_g = 5L, path_idx = 1L, count = 14L),
+    list(F_g = 6L, path_idx = 2L, count = 14L),
+    list(F_g = 7L, path_idx = 2L, count = 12L),
+    list(F_g = 8L, path_idx = 3L, count = 12L),
+    list(F_g = 9L, path_idx = 3L, count = 8L)
+  )
+  n_switchers19 <- sum(sapply(fg_path_counts19, function(x) x$count))
+  stopifnot(n_switchers19 == 78L)
+  D19 <- matrix(0L, nrow = n_switchers19, ncol = n_periods19)
+  g19 <- 1L
+  for (entry in fg_path_counts19) {
+    F_g <- entry$F_g
+    target <- target_paths19[[entry$path_idx]]
+    n_here <- entry$count
+    for (k in seq_len(n_here)) {
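+      if (F_g >= 3L) D19[g19, 1:(F_g - 2L)] <- 0L  # pre-baseline periods stay untreated
+      for (j in 0:L_max19) D19[g19, F_g - 1L + j] <- target[j + 1L]  # baseline at F_g - 1, then the L_max + 1 path states
+      if (F_g + L_max19 <= n_periods19) {
+        D19[g19, (F_g + L_max19):n_periods19] <- target[L_max19 + 1L]  # hold the final state to the panel end
+      }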
+      g19 <- g19 + 1L
+    }
+  }
+  # Append 20 never-treated (D=0) and 20 always-treated (D=2) controls
+  D19 <- rbind(
+    D19,
+    matrix(0L, nrow = 20L, ncol = n_periods19),
+    matrix(2L, nrow = 20L, ncol = n_periods19)
+  )
+  n_total19 <- nrow(D19)
+  set.seed(119L)  # re-seed right before the draws; nothing above consumes RNG state
+  group_fe19 <- rnorm(n_total19, 0, 2.0)
+  noise19 <- matrix(rnorm(n_total19 * n_periods19, 0, 0.5),
+                    nrow = n_total19, ncol = n_periods19)
+  period_arr19 <- 0:(n_periods19 - 1L)
+  Y19 <- 10.0 +
+    matrix(group_fe19, nrow = n_total19, ncol = n_periods19) +
+    matrix(0.1 * period_arr19, nrow = n_total19, ncol = n_periods19, byrow = TRUE) +
+    1.5 * D19 +
+    noise19
+  d19 <- data.frame(
+    group = rep(seq_len(n_total19) - 1L, each = n_periods19),
+    period = rep(period_arr19, n_total19),
+    treatment = as.vector(t(D19)),
+    outcome = as.vector(t(Y19))
+  )
+  res19 <- did_multiplegt_dyn(
+    df = d19, outcome = "outcome", group = "group", time = "period",
+    treatment = "treatment", effects = 3, placebo = 1, by_path = 3,
+    ci_level = 95
+  )
+  scenarios$multi_path_reversible_by_path_non_binary <- list(
+    data = list(
+      group = as.numeric(d19$group),
+      period = as.numeric(d19$period),
+      treatment = as.numeric(d19$treatment),
+      outcome = as.numeric(d19$outcome)
+    ),
+    params = list(pattern = "single_baseline_multi_path_non_binary",
+                  n_switcher_groups = 78L, n_realized_groups = 118L,
+                  n_periods = 13L, seed = 119L, effects = 3, placebo = 1,
+                  by_path = 3, ci_level = 95),
+    results = extract_dcdh_by_path(res19, n_effects = 3, n_placebos = 1)
+  )
+}
+
 # ---------------------------------------------------------------------------
 # Write output
 # ---------------------------------------------------------------------------
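
For readers following the Python side, the Scenario 19 treatment matrix can be mirrored in plain Python. This is a sketch under assumptions, not part of the benchmark: `switcher_row` and `build_treatment_matrix` are hypothetical names, indices are shifted from R's 1-based columns to 0-based lists, and the outcome draws are omitted because R's and Python's random number generators would not reproduce the same values.

```python
N_PERIODS = 13
L_MAX = 3
TARGET_PATHS = [
    (0, 1, 1, 1),  # low-dose sustained
    (0, 2, 2, 2),  # high-dose sustained
    (0, 1, 2, 2),  # ramp-up
]
# (F_g, index into TARGET_PATHS, number of switcher groups), as in the R scenario.
FG_PATH_COUNTS = [(4, 0, 18), (5, 0, 14), (6, 1, 14), (7, 1, 12), (8, 2, 12), (9, 2, 8)]


def switcher_row(f_g, path):
    """Treatment sequence for one switcher whose first switch is at 1-based period f_g."""
    row = [0] * N_PERIODS  # pre-baseline periods stay untreated
    for j, state in enumerate(path):
        row[f_g - 2 + j] = state  # 1-based column F_g - 1 + j is 0-based F_g - 2 + j
    for t in range(f_g + L_MAX - 1, N_PERIODS):
        row[t] = path[-1]  # hold the final state to the panel end
    return row


def build_treatment_matrix():
    """78 switchers, then 20 never-treated (D=0) and 20 always-treated (D=2) controls."""
    rows = [
        switcher_row(f_g, TARGET_PATHS[idx])
        for f_g, idx, count in FG_PATH_COUNTS
        for _ in range(count)
    ]
    rows += [[0] * N_PERIODS for _ in range(20)]
    rows += [[2] * N_PERIODS for _ in range(20)]
    return rows


D = build_treatment_matrix()
assert len(D) == 118  # matches n_realized_groups in the scenario params
```

Building each row from the `(F_g, path)` pair makes the intended path ranking easy to check by hand: 32 groups follow the low-dose path, 26 the high-dose path, and 20 the ramp-up path.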