Placebo tests and extended plotting functionality #88
Placebo tests and extended plotting functionality #88n-delaney wants to merge 57 commits intoebenmichael:masterfrom
Conversation
Rmd file and data for replication of Abadie et al. (2010)
-updates functions in augsynth.R -adds permutation.R (also contains some plotting functions) -updates package documentation
Updates the github directory in tobacco replication vignette
Replaces convention of inf = F with inf_type = 'none'.
- Separates out inference from estimation. Sets augsynth(inf = "none") by default, which should be backwards compatible. - Adds plot_type as an argument for plot.augsynth (allows user to specify the "name" of the desired outputted plot). - Adds the option to generate a donor table (synthetic control weights and RMSPEs for any inference type) - Ensures that all existing tests execute successfully.
Updates to abadie et al 2010 replication notebook
Adds warnings to multi-outcome and multisynth models if a user attempts to pass in `inf_type` at the time of model fitting.
Minor edits to tobacco replication notebook
…synth class methods for accessing data characteristics.
- converts the tobacco replication notebook to a vignette - removes cache folders and plot folder from the Abadie et al. (2010) tobacco replication notebook
…ome of the MDES calculation stuff from the permutation package to clarify inference side.
ebenmichael
left a comment
There was a problem hiding this comment.
Overall I think there needs to be a structural change to keep inference calculations in summary.augsynth. The new add_inference function seems to be necessary only to add inference output to the augsynth function, but that should be restricted to the summary.augsynth function. I've called out parts of the code where this can be changed, but mostly it means that all of the new plotting work should happen in the plot.summary.augsynth function, and then the augsynth object never needs any inference added to it.
|
|
||
|
|
||
|
|
||
| test_that( "MDES_table corresponds to default treatment table", { |
There was a problem hiding this comment.
Can this test be broken up into subtests, since it looks like there are a few distinct things.
There was a problem hiding this comment.
Another good test would be that the placebo gaps agree with manually changing the treatment variable in the data and re-running augsynth.
… into tobacco example more.
- adds a test for update_augsynth() ensuring that non-binding update criteria (ex. RMPSE multiple of infinity) return the same donor weights as the original model. - removes superfluous datasets froom the tobacco_replication.Rmd file - updates a test in test-donor_control.R to resolve an error from a variable that is called but not defined (or used as part of the test, as far as I can tell)
…to unite checking of that setting.
ebenmichael
left a comment
There was a problem hiding this comment.
Thanks for all of these changes. I left reviews throughout. Once those are done (and the merge conflict is fixed) we can fold this in.
|
|
||
| # These should be the same? Or close to the same? | ||
| # (They were not under ridge, probably due to the cross-validation procedure.) | ||
| gaps7 <- augsynth:::get_placebo_gaps(syn7) |
There was a problem hiding this comment.
Can you test this with ridge with a fixed lambda parameter? If they should be same then the test should pass.
There was a problem hiding this comment.
@n-delaney @lmiratrix the question marks here make me nervous
|
Trying to put in my comments |
ebenmichael
left a comment
There was a problem hiding this comment.
An overarching request: Please make sure that the top-level calls to the summary and plot functions don't change their default behavior and that the exisiting vignette still works as it should (it doesn't give the same results as-is). If the vignette needs to be updated (e.g. to incorporate the new plot_type argument, please do that).
| plot.augsynth <- function(augsynth, | ||
| cv = FALSE, | ||
| plot_type = 'estimate', | ||
| inf_type = NULL, ...) { | ||
|
|
||
| plot_augsynth_results( augsynth=augsynth, cv=cv, plot_type=plot_type, inf_type=inf_type, ... ) |
There was a problem hiding this comment.
My understanding is that the cv option no longer does anything, and instead that is subsumed into plot_type correct? Can you update that throughout the code so that it isn't passing in cv if that isn't doing anything?
There was a problem hiding this comment.
I went back through an ensured that the existing plot() calls will generate the same plots as previously. The vignette runs and delivers the expected plots without any changes the the code blocks.
That being said, the vignette doesn't currently showcase the plot_type argument (or specify the four plot types — "estimate", "outcomes", "cv", and "permutation") that you can get out of it. I will leave it up to you if and how you want to showcase these features.
One thing to note—in order to make plot.augsynth() and plot.summary.augsynth() backwards compatible, they will return a cross-validation plot if the user sets cv=TRUE regardless of plot_type setting.
| #' @param ... Optional arguments | ||
| plot_augsynth_results <- function( augsynth, | ||
| plot_type = 'estimate', | ||
| inf_type = NULL, ...) { |
There was a problem hiding this comment.
The default behavior of the package is to do conformal inference. Can you make whatever changes are needed so that this continues to be the case? Right now if you follow the vignette it doesn't produce the prediction intervals.
There was a problem hiding this comment.
Also, the plots used to have the inf argument that governed whether uncertainty intervals were included. That should stay in to not break any existing code. It also simplifies the plot_type argument because then there doesn't need to be the 'estimate only' option.
There was a problem hiding this comment.
I have reincorporated inf, which now controls whether confidence intervals appear using plot_type = "estimate". I have also stripped down the plot types to four distinct arguments (i.e. I removed a differentiation between plot_type = "estimate" (with CI) and plot_type = "estimate only" (without CI), since this is now covered by inf.
| aes(x = !!as.name(augsynth$time_var), y = !!as.name(measure), | ||
| color = trt_status, linetype = !!as.name(augsynth$unit_var))) + | ||
| ggplot2::geom_line() + | ||
| ggplot2::geom_vline(lty = 2, xintercept = treat_year) + |
There was a problem hiding this comment.
For some reason the vertical line doesn't appear when you get here by running plot rather than permutation_plot.
e.g. the following creates two slightly different plots
syn <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas,
progfunc = "None", scm = T)
# no vertical line at treatment time
plot(syn, plot_type = "placebo")
# yes vertical line at treatment time
permutation_plot(syn)
There was a problem hiding this comment.
Fixed! This was an issue with plotting from a summary or augsynth object
| } else if (plot_type == 'outcomes') { | ||
| p <- augsynth_outcomes_plot(augsynth, measure = 'synth') | ||
| } else if (plot_type == 'outcomes raw average') { | ||
| p <- augsynth_outcomes_plot(augsynth, measure = c('synth', 'average')) |
There was a problem hiding this comment.
These should have the option to add the uncertainty intervals (i.e. with inf = TRUE following the original logic of the code)
There was a problem hiding this comment.
Done! Please see comment above on a related note
| p <- p + ggplot2::geom_ribbon(ggplot2::aes(ymin = Estimate - 2 * Std.Error, | ||
| ymax = Estimate + 2 * Std.Error), | ||
| alpha = 0.2) |
There was a problem hiding this comment.
Something is happening here where if you just run plot(syn) for a augsynth object syn, the default behavior doesn't include uncertainty quantification but this line of geom_ribbon is running with a warning about all missing inputs. This might go away if the default behavior is to plot conformal inference intervals, but still it comes up using plot(syn, inf_type = NULL)
There was a problem hiding this comment.
I noticed that the geom_ribbon() warning was appearing on both the main and pr_merge branches. I think it is coming from "NA" values for confidence intervals in pre-treatment periods. I have resolved this in the pr_merge branch.
|
|
||
| #' Calculate MDES | ||
| #' | ||
| #' Use our method for Calculating SEs, p-values, and MDEs by looking |
There was a problem hiding this comment.
Ok can you just strip this out here then? Alternatively, you can out in a citation etc. Something more than "our" method, since I'd like for it to be clear what different functions are doing.
| units$weights[ units$ever_Tx == 1 ] = 0 | ||
|
|
||
| # confirm that we have placebos for every entity over every observed time period | ||
| stopifnot("Placebos do not cover every entity over every observed time period." = (ncol(augsynth$data$X) + ncol(augsynth$data$y)) * nrow(augsynth$data$y) == nrow(lest)) |
There was a problem hiding this comment.
@lmiratrix @n-delaney does this ever happen or were you adding in a hypothetical check?
|
|
||
|
|
||
| # Impact is difference between observed and imputed control-side outcome | ||
| lest$impact = lest$Yobs - lest$Yhat |
There was a problem hiding this comment.
Please remove it if it isn't needed
|
|
||
| # These should be the same? Or close to the same? | ||
| # (They were not under ridge, probably due to the cross-validation procedure.) | ||
| gaps7 <- augsynth:::get_placebo_gaps(syn7) |
There was a problem hiding this comment.
@n-delaney @lmiratrix the question marks here make me nervous
- Simplify plot.augsynth() / plot.summary.augsynth() behavior around inference and plot types (should ensure backwards compatibility)
- Remove plot_type options:
* "estimate only" → use inf = FALSE to suppress CIs when using `plot_type` = "estimate" (ATT)
* "outcomes raw average" → outcomes plot now always includes treated, synthetic, and donor average
- Make cv = TRUE override other plotting options (backwards compatibility)
- Preserve inf as the switch controlling CI display (even when inf_type is supplied)
- Update summary plotting:
* summary.augsynth plots inherit stored inference and do not recompute at plot time
* placebo plots require permutation-based inference for summary objects
* plot.augsynth() defaults to permutation inference for placebo plots unless
inf_type = "permutation_rstat" is explicitly requested
- Update permutation_plot() to inherit inference type from object
- Fix CV plot warnings (geom_point annotations)
- Update (revert) vignette to reflect new plotting behavior
- Update plotting tests to match new API
Notes:
- plot(desyn) shows slightly different confidence intervals relative to vignette from main branch; possibly due to inference changes, not plotting?
- Some tests (test_multisynth_covariates.R, test_outcome_models.R) currently fail due to numerical tolerance;
temporarily disabled with skip("temporarily disabled")
Merge branch 'pr_merge' of https://github.com/n-delaney/augsynth_permutation into pr_merge # Conflicts: # tobacco_replication/tobacco_exploration.R
The proposed changes:
All of the changes to the augsynth package itself take place in augsynth.R and a new file, permutation.R.
The
tobacco_replicationdirectory contains a vignette demonstrating the new functionality and replicating the results of Abadie et al. (2010).