Avoid mutating sklearn estimators in DiD experiments by drbenvincent · Pull Request #693 · pymc-labs/CausalPy

drbenvincent · 2026-02-02T20:40:39Z

Summary

Avoid mutating user-supplied sklearn estimators by cloning when fit_intercept=False is required
Emit a UserWarning when the clone occurs so users are aware and can fix their code
Centralize fit_intercept handling in BaseExperiment._ensure_sklearn_fit_intercept_false()
Add integration coverage to assert original estimators are unchanged and the warning fires
Fix codecov/patch failure by ensuring full branch coverage of the new method

Details

DiD experiments require fit_intercept=False because the design matrix already contains an explicit intercept column (~ 1 + ...). Previously, the experiment constructor mutated the user's model in-place (model.fit_intercept = False), which was a hidden side effect. Now the method clones the estimator via sklearn.base.clone(), sets the parameter on the copy, and issues a warning explaining why.
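
The clone-and-warn flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in base.py: the function name `ensure_fit_intercept_false`, the warning text, and the exact order of checks are assumptions made for the example. Only `sklearn.base.clone()`, `set_params()`, and `warnings.warn()` are real library calls.

```python
import warnings

from sklearn.base import RegressorMixin, clone
from sklearn.linear_model import LinearRegression


def ensure_fit_intercept_false(model):
    """Hypothetical stand-in for BaseExperiment._ensure_sklearn_fit_intercept_false()."""
    # Non-sklearn models (for example PyMC models) pass through untouched.
    if not isinstance(model, RegressorMixin):
        return model
    # Nothing to do if the parameter is absent or already False.
    if not getattr(model, "fit_intercept", False):
        return model
    # Illustrative message; the real warning text may differ.
    warnings.warn(
        "The design matrix already contains an intercept column, so a clone "
        "of the model with fit_intercept=False will be used.",
        UserWarning,
        stacklevel=2,
    )
    # clone() builds a new, unfitted estimator with the same parameters,
    # so the user's object is never modified.
    cloned = clone(model)
    cloned.set_params(fit_intercept=False)
    return cloned


user_model = LinearRegression()  # fit_intercept defaults to True
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    safe_model = ensure_fit_intercept_false(user_model)

print(user_model.fit_intercept, safe_model.fit_intercept)
```

Returning the (possibly cloned) model rather than mutating it in place keeps the side effect visible at the call site: the experiment works with `safe_model`, and `user_model` stays exactly as the user configured it.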

Coverage fix

The _ensure_sklearn_fit_intercept_false() method in base.py has a defensive isinstance guard that returns early for non-sklearn models (line 70). Originally, the method was called from inside the elif isinstance(self.model, RegressorMixin): blocks in both diff_in_diff.py and staggered_did.py, which made the guard's early return unreachable: the caller had already confirmed the model was an sklearn regressor, so the non-sklearn branch could never execute. The uncovered branch caused the codecov/patch check to fail.

The fix moves the _ensure_sklearn_fit_intercept_false() call to before the model-type dispatch (the if PyMCModel / elif RegressorMixin block) in both experiment classes. Since the method already has a built-in guard for non-sklearn models, calling it unconditionally is safe and semantically cleaner. Now any PyMC integration test naturally exercises the early-return path, providing full branch coverage without artificial unit tests.
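
A rough sketch of why the reordering makes the guard reachable. Everything here is illustrative: `FakePyMCModel`, `run_experiment`, and the `calls` list are hypothetical stand-ins rather than CausalPy code, and the cloning step is omitted so that only the control flow is shown.

```python
from sklearn.base import RegressorMixin
from sklearn.linear_model import LinearRegression


class FakePyMCModel:
    """Hypothetical stand-in for causalpy's PyMCModel (not the real class)."""


calls = []  # records which branch of the helper ran


def ensure_fit_intercept_false(model):
    # Defensive guard: non-sklearn models are returned unchanged.
    if not isinstance(model, RegressorMixin):
        calls.append("early-return")
        return model
    calls.append("sklearn-path")
    return model  # clone-and-warn logic omitted in this sketch


def run_experiment(model):
    # The helper is called unconditionally, before the type dispatch,
    # so a PyMC-style model now reaches the early-return branch.
    model = ensure_fit_intercept_false(model)
    if isinstance(model, FakePyMCModel):
        return "pymc"
    elif isinstance(model, RegressorMixin):
        return "sklearn"
    raise ValueError("Model type not recognized")


results = [
    run_experiment(FakePyMCModel()),
    run_experiment(LinearRegression(fit_intercept=False)),
]
print(calls, results)
```

Had the helper been called only inside the `elif` branch, "early-return" would never appear in `calls`, which is the unreachable branch that the coverage tool flagged.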

A new integration test test_did_sklearn_fit_intercept_false was also added to cover the "no warning needed" path (when the model already has fit_intercept=False).
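
One way to assert the "no warning needed" path is to escalate warnings to errors for the duration of the call. The sketch below uses a compact hypothetical helper, `prepare`, in place of the real method, and it assumes the original object is returned unchanged when fit_intercept is already False. The actual test_did_sklearn_fit_intercept_false may be written differently.

```python
import warnings

from sklearn.base import clone
from sklearn.linear_model import LinearRegression


def prepare(model):
    """Hypothetical helper: clone and warn only when fit_intercept is truthy."""
    if getattr(model, "fit_intercept", False):
        warnings.warn("using a clone with fit_intercept=False", UserWarning, stacklevel=2)
        return clone(model).set_params(fit_intercept=False)
    return model


def check_no_warning_path():
    model = LinearRegression(fit_intercept=False)
    with warnings.catch_warnings():
        # Any warning raised inside this block becomes an exception,
        # so the check fails loudly if the helper warns unexpectedly.
        warnings.simplefilter("error")
        prepared = prepare(model)
    assert prepared.fit_intercept is False
    # True when no clone was made (assumed behaviour for this path).
    return prepared is model


same_object = check_no_warning_path()
print(same_object)
```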

Testing

MPLCONFIGDIR=/tmp/mplconfig XDG_CACHE_HOME=/tmp conda run -n CausalPy python -m pytest -o addopts= causalpy/tests/test_integration_skl_examples.py::test_did causalpy/tests/test_integration_skl_examples.py::test_did_sklearn_fit_intercept_false causalpy/tests/test_staggered_did.py::test_staggered_did_sklearn causalpy/tests/test_staggered_did.py::test_staggered_did_sklearn_model_without_fit_intercept

Issues

Closes #664 (DifferenceInDifferences mutates fit_intercept on user models)

read-the-docs-community · 2026-02-02T20:45:14Z

Documentation build overview

📚 causalpy | 🛠️ Build #32278659 | 📁 Comparing 5945c82 against latest (4769e80)

🔍 Preview build

4 files changed

± 404.html
± _modules/causalpy/experiments/base.html
± _modules/causalpy/experiments/diff_in_diff.html
± _modules/causalpy/experiments/staggered_did.html

codecov · 2026-02-09T13:13:29Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.34%. Comparing base (4769e80) to head (5945c82).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #693      +/-   ##
==========================================
+ Coverage   94.32%   94.34%   +0.02%     
==========================================
  Files          78       78              
  Lines       12159    12186      +27     
  Branches      713      713              
==========================================
+ Hits        11469    11497      +28     
  Misses        497      497              
+ Partials      193      192       -1
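
The figures in the table can be cross-checked with a few lines of arithmetic. The sketch assumes that coverage is computed as hits divided by total lines and that the percentage is truncated (not rounded) to two decimals. That rule is inferred from the numbers above, not taken from Codecov's documentation.

```python
# Values copied from the coverage diff above.
base = {"lines": 12159, "hits": 11469, "misses": 497, "partials": 193}
head = {"lines": 12186, "hits": 11497, "misses": 497, "partials": 192}


def truncated_percent(report):
    # Integer arithmetic avoids floating-point surprises:
    # compute hundredths of a percent, truncated toward zero.
    hundredths = report["hits"] * 10000 // report["lines"]
    return f"{hundredths // 100}.{hundredths % 100:02d}"


for report in (base, head):
    # Every line is counted as exactly one of hit, miss, or partial.
    assert report["hits"] + report["misses"] + report["partials"] == report["lines"]

base_pct = truncated_percent(base)
head_pct = truncated_percent(head)
print(base_pct, head_pct)
```

Under those assumptions, the +27 lines, +28 hits, and -1 partial are mutually consistent: 27 new lines are covered, and one previously partial line becomes fully covered.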

…tion tests Move the _ensure_sklearn_fit_intercept_false() call from inside the `elif isinstance(self.model, RegressorMixin)` block to before the model-type dispatch in both DiD and StaggeredDiD. This makes the isinstance early-return guard (line 70 in base.py) naturally reachable by existing PyMC integration tests, fixing the codecov/patch failure. Also add test_did_sklearn_fit_intercept_false integration test covering the "no warning needed" path when fit_intercept is already False. Co-authored-by: Cursor <cursoragent@cursor.com>

drbenvincent added the bug Something isn't working label Feb 2, 2026

drbenvincent marked this pull request as draft February 2, 2026 21:29

drbenvincent marked this pull request as ready for review February 9, 2026 13:06

drbenvincent mentioned this pull request Feb 21, 2026

Centralize fit_intercept=False handling in ScikitLearnAdaptor #726

Closed

drbenvincent force-pushed the codex/issue-664-fit-intercept branch from 83e476a to d6a9a00 Compare March 4, 2026 10:46

drbenvincent and others added 5 commits April 15, 2026 19:51

Avoid mutating sklearn models in DiD experiments

3c3c6da

slight change in approach

dfc321b

Format BaseExperiment helper after rebase resolution.

795bcbc

Include the ruff-format whitespace fix so the PR's prek check passes remotely. Made-with: Cursor

Simplify sklearn fit_intercept clone path in BaseExperiment.

5945c82

Use the intended warning-and-clone flow to keep diff coverage above the Codecov patch threshold. Made-with: Cursor

drbenvincent force-pushed the codex/issue-664-fit-intercept branch from 97042ed to 5945c82 Compare April 15, 2026 18:52

drbenvincent requested review from cetagostini and juanitorduz April 15, 2026 19:02

drbenvincent wants to merge 5 commits into main from codex/issue-664-fit-intercept
