diff --git a/docs/methodology/papers/abadie-2021-review.md b/docs/methodology/papers/abadie-2021-review.md new file mode 100644 index 00000000..a043da52 --- /dev/null +++ b/docs/methodology/papers/abadie-2021-review.md @@ -0,0 +1,135 @@ +# Paper Review: Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects + +**Authors:** Alberto Abadie +**Citation:** Abadie, A. (2021). "Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects." *Journal of Economic Literature*, 59(2), 391–425. +**PDF reviewed:** https://doi.org/10.1257/jel.20191450 (published JEL version) +**Review date:** 2026-05-29 + +> Scope note: this is a **practical-guide / review article**. It recaps the synthetic-control estimator (attributed to Abadie & Gardeazabal 2003 and ADH 2010/2015) and contributes a synthesis on **feasibility, data requirements, contextual requirements, and inference**, plus a survey of extensions. Where it surveys other methods (Chernozhukov-Wüthrich-Zhu conformal inference; Arkhangelsky et al. synthetic DiD; Abadie-L'Hour / Ben-Michael et al. penalized & bias-corrected SC; Doudchenko-Imbens; Athey et al. matrix completion), those are **citations** — captured here only as Abadie frames them. The dedicated CWZ 2021 review is authoritative for conformal inference; the others are out of scope for this initiative. Nothing here is sourced from outside this paper. + +--- + +## Methodology Registry Entry + +*Formatted to match docs/methodology/REGISTRY.md. This is the richest source for the `## SyntheticControl` **assumption / warning** and **edge-case** sections.* + +## SyntheticControl + +**Primary source (this document):** Abadie, A. (2021). "Using Synthetic Controls…" *JEL*, 59(2), 391–425. https://doi.org/10.1257/jel.20191450 + +**Key implementation requirements:** + +*Notation (Section 3.1):* +- `J+1` units, `j=1` treated, donors `j=2,…,J+1`; `T` periods, first `T0` pre-intervention. 
`Y_jt` observed; `Ŷ^N_jt` synthetic prediction of the untreated potential outcome. `X_1` `(k×1)` treated-unit predictors (may include pre-period outcomes); `X_0` `(k×J)` donor predictors. `Z_j` observed covariates; `μ_j` unobserved factor loadings. + +*Target and estimator (Equations 1–3, 7–8):* + + (1) τ_{1t} = Y^I_{1t} − Y^N_{1t} (t > T0) + (2) Ŷ^N_{1t} = Σ_{j=2}^{J+1} w_j · Y_jt + (3)/(8) τ̂_{1t} = Y_{1t} − Σ_{j=2}^{J+1} w_j*·Y_jt + + (7) W* = argmin_W ( Σ_{h=1}^{k} v_h·(X_{h1} − Σ_{j} w_j·X_{hj})^2 )^{1/2} + s.t. w_j ≥ 0, Σ w_j = 1 ("constrained quadratic optimization") + +Footnote 8: assumptions are on `Y^N` only; since `Y_{1t}=Y^I_{1t}` is observed for `t>T0`, **no assumptions on the process generating `Y^I` are needed**. Equation (1) lets the effect vary freely over time. Special cases: equal weights `w_j=1/J` (4), population weights (5), single nearest neighbor `w_m=1` (6). + +*The justifying model and the identifying condition (Section 3.3):* + + (10) Y^N_{jt} = δ_t + θ_t·Z_j + λ_t·μ_j + ε_jt (linear factor / interactive-FE model) + +- **Generalizes DiD/TWFE:** restricting `λ_t = λ` (time-invariant) recovers parallel trends; the factor model relaxes this by letting loadings on `μ_j` vary in time (Bai 2009 cited). +- **Identifying condition:** if `X_1 = X_0 W*` (the synthetic control reproduces the treated unit's predictors **including pre-period outcomes**), then `τ̂_{1t}` is unbiased under (10). `μ_1` is unobserved and cannot be matched directly; a good pre-period-outcome match approximates it **only when the transitory-shock scale is small or `T0` is large**. A small `T0` with enough shock variation can produce a spurious pre-period match → **overfitting / bias**. +- **Bias bound (cited to ADH 2010):** bias is bounded by a function **inversely proportional to `T0`**, *provided the pre-period fit is good*. 
"**A large `T0` cannot drive down the bias if the fit is bad.**" The bound **increases with `J`** (donor-pool size) and with the **number of unobserved factors** (components of `μ_j`). + +*Feasibility / convex hull (Sections 3.3, 5):* +- In practice `X_1 = X_0 W*` is replaced by `X_1 ≈ X_0 W*`; **there are no ex-ante guarantees** on the size of `X_1 − X_0 W*`. When it is large, ADH 2010 recommend **against** using synthetic controls (potential for substantial bias). +- The treated unit's predictor point `(X_{11},…,X_{k1})` must fall **close to the convex hull** of the donors' points. If the treated unit is **"extreme"** in some predictor (or in pre-period outcomes), no weighted average reproduces it → "the conventional synthetic control estimator should not be used in that case." +- The simplex constraint **prevents extrapolation** but **not interpolation bias**: averaging away large discrepancies between dissimilar donors biases the estimate → **restrict the donor pool to similar units**. + +*`V` (predictor-importance) selection (Section 3.2; this paper formalizes the options):* +- **(a) Inverse-variance:** set `v_h = 1/Var(X_{h·})` (rescales each predictor row to unit variance). +- **(b) Nested MSPE minimization (AG 2003 / ADH 2010):** choose `V` so `W(V)` minimizes pre-period outcome MSPE `Σ_{t∈𝒯0} (Y_{1t} − Σ_j w_j(V)·Y_jt)²` over a set `𝒯0 ⊆ {1,…,T0}`. +- **(c) Out-of-sample cross-validation (ADH 2015), formalized 4-step (Equation 9):** split pre-period into training `1..t0` and validation `t0+1..T0` (concretely `t0 = T0/2`); compute `W̃(V)` on training data; pick `V*` minimizing validation MSPE (9); recompute `W* = W(V*)` using the validation-window predictors. +- **Footnote 7 (non-uniqueness):** CV weights need not be unique; can add a ridge-type penalty `γ·Σ_h v_h²` (`γ>0`) favoring dense weights. Demonstrate robustness to the `V` choice (Klößner et al. 2018 cited). 
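A minimal sketch of option (a) combined with the inner weight solve of Equation (7), assuming NumPy is available. The function names (`project_to_simplex`, `inverse_variance_v`, `solve_weights`), the projected-gradient solver, the toy data, and the choice to take each predictor's variance across all `J+1` units with `ddof=1` are illustrative assumptions; the paper does not prescribe a solver, and this is not the `Synth` implementation.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)


def inverse_variance_v(X0, X1):
    """Option (a): v_h = 1 / Var(X_h.), so each predictor row has unit variance.

    Assumption: the variance is taken over the treated unit and all donors.
    """
    X = np.column_stack([X1, X0])
    return 1.0 / X.var(axis=1, ddof=1)


def solve_weights(X0, X1, v, n_iter=2000):
    """Minimize sum_h v_h (X_h1 - sum_j w_j X_hj)^2 over the simplex.

    Minimizing the squared objective gives the same argmin as its square
    root in Equation (7). Projected gradient descent is an assumed solver.
    """
    A = np.sqrt(v)[:, None] * X0          # rescaled donor predictors (k x J)
    b = np.sqrt(v) * X1                   # rescaled treated predictors (k,)
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    w = np.full(X0.shape[1], 1.0 / X0.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ w - b)
        w = project_to_simplex(w - step * grad)
    return w


# Toy data: 3 predictors on very different scales, 3 donors.
X0 = np.array([[10.0, 1.0, 1.0],
               [0.2, 2.0, 0.2],
               [0.05, 0.05, 0.5]])
true_w = np.array([0.5, 0.3, 0.2])
X1 = X0 @ true_w                          # treated unit lies inside the hull

v = inverse_variance_v(X0, X1)
w = solve_weights(X0, X1, v)
print(np.round(w, 4))
```

Because the toy treated unit is an exact convex combination of the donors, the recovered weights should match `true_w`; with real data `X_1 ≈ X_0 W*` holds only approximately and the residual `X_1 − X_0 W*` should be inspected.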
+ +*Predictor / variable selection (Section 3.4):* +- Predictors typically combine **pre-period outcomes** (crucial for matching `μ_j`; arise organically under a VAR DGP) **and** other covariates `Z_j`. Covariates omitted from `Z_j` are "mechanically absorbed into `μ_j`," increasing the bias bound — so **include real covariates**, don't rely on lagged outcomes alone. +- Flexibility: need not use every pre-period outcome; a **summary** (e.g., a pre-period mean) can suffice when outcomes co-move, and **increases weight sparsity** (number of nonzero `w_j` is controlled by the number of predictors). +- **Post-intervention outcomes are NOT used** to compute weights → weights are a **design-phase** object (safeguard against specification search / p-hacking; can be pre-registered). + +*Standard errors / inference (Sections 3.5, 8):* +- **No SEs in the classical sense.** Inference is **permutation / placebo-based** (design-based, conditioning on the sample), **not** sampling-based. Rationale: small / single treated unit, no randomization, sample often = population. +- **RMSPE-ratio permutation test (Equations 11–12):** + + (11) R_j(t1,t2) = ( (1/(t2−t1+1)) · Σ_{t=t1}^{t2} (Y_jt − Ŷ^N_jt)^2 )^{1/2} (RMSPE for unit j) + (12) r_j = R_j(T0+1, T) / R_j(1, T0) (post/pre ratio) + + `Ŷ^N_jt` is the synthetic control built treating unit `j` as treated (other `J` units as donors). p-value: + + p = (1/(J+1)) · Σ_{j=1}^{J+1} 𝟙₊(r_j − r_1) (fraction of units with ratio ≥ the treated unit's r_1) + + Alternative: use the distribution of post-period `R_j(T0+1,T)` after discarding placebos with pre-period `R_j(1,T0)` ≫ `R_1(1,T0)`. +- **Confidence intervals by test inversion** (Firpo & Possebom 2018 cited) — invert the permutation test over hypothesized effect values. +- **One-sided tests** via positive/negative parts `(Y_jt − Ŷ^N_jt)^±` of the gap → power gain (treated-unit-contaminated placebos tend to produce opposite-sign effects). 
+- **Visualize** the permutation distribution of `r_j` or of placebo gaps `Y_jt − Ŷ^N_jt` (conveys magnitude, not just a p-value). +- **Surveyed alternatives (citations — see dedicated reviews):** Chernozhukov-Wüthrich-Zhu (2021) **conformal inference** (time-permutation of constrained-LS residuals under the null, valid under residual **exchangeability**, weights re-estimated under the null using all periods); CWZ (2019b) bias-corrected CIs (asymptotically pivotal t-stat + cross-fitting, large `T0` and `T−T0`); Cattaneo-Feng-Titiunik **predictive intervals** (estimation + irreducible-error uncertainty); Hahn-Shi / Andrews (2003) **end-of-sample instability** test. + +*Edge cases / contextual requirements (Section 5 — the failure modes):* +- **Effect size vs. volatility:** small effects are masked by volatile outcomes; high *unit-specific* volatility raises overfitting risk → consider de-noising/filtering (only unit-specific noise hurts; common-factor volatility is differenced out by the SC). +- **No suitable comparison group:** exclude donors that (i) adopted a similar intervention, or (ii) suffered large idiosyncratic shocks not shared by the treated unit; restrict to comparable units (interpolation-bias control). +- **Anticipation:** if agents react before formal implementation, **backdate** the intervention. Backdating does **not** mechanically bias the estimator because (1)/(3) allow time-varying effects (unlike constant-effect panel models). +- **Interference / spillovers (SUTVA, Rubin 1980):** enforce in design (drop possibly-affected donors) or reason about the **sign of the bias** (e.g., negative spillover onto contributing donors → estimate is a *lower bound*). Sparsity + transparency of weights makes this feasible. +- **Outcome transformations & a differencing pitfall:** level mismatch can be handled via differences, growth rates, or **demeaning** `Ȳ_jt = Y_jt − (1/T0)Σ_{h≤T0} Y_jh` (≡ Doudchenko-Imbens constant shift). 
**But** differencing inflates the noise variance when `ε_jt` is roughly independent in time → higher overfitting/bias; the differenced model retains the factor structure `ΔY^N_jt = Δδ_t + Δθ_t Z_j + Δλ_t μ_j + Δε_jt`. +- **Short pre-period:** spurious (near-)perfect fit → unreliable counterfactual; mitigate with powerful non-outcome predictors (reduce residual variance). +- **Structural breaks:** a long `T0` risks violating constant-factor-loadings; up-weight (`v_h`) the most recent predictors to alleviate. +- **Time horizon:** effects may emerge slowly → need enough post-periods, or surrogate/leading indicators. + +*Sparsity (Section 4):* synthetic-control weights are **sparse** — when `X_1` is outside the donor convex hull and donors are in "general position," the solution is **unique with ≤ `k` nonzero weights** (projection of `X_1` onto the hull). Sparsity here is for **interpretability** (the identity/magnitude of nonzero weights matters), unlike lasso where sparsity is an anti-overfitting device. With many treated units inside the hull, weights may be non-unique (penalized SC restores uniqueness). + +**Reference implementation(s):** +- Authors' `Synth` package for **R, MATLAB, and Stata** (Section 3.2 footnote; documented in Abadie, Diamond & Hainmueller 2011, *J. Stat. Software* 42(13)). + +**Requirements checklist (guidance this paper adds beyond 2010/2015):** +- [ ] Convex-hull / "extreme treated unit" guard → warn / refuse when pre-period fit is poor or the treated unit is extreme. +- [ ] `V`-selection: inverse-variance, nested-MSPE, and CV (with a documented `t0=T0/2`-style default + optional ridge `γΣv_h²` for non-uniqueness). +- [ ] Encourage covariates in addition to lagged outcomes; allow pre-period-outcome summaries (sparsity). +- [ ] Permutation inference: RMSPE-ratio p-value `(#{r_j≥r_1})/(J+1)`; one-sided variants; CI by test inversion; placebo-distribution visualization. 
+- [ ] Weights computed from **pre-intervention data only** (design-phase guarantee). +- [ ] Diagnostics: in-time placebo / backdating, leave-one-out, donor-pool & predictor robustness. +- [ ] Warnings for the failure modes (volatility, contamination, anticipation, interference, differencing, short pre-period, structural breaks). + +--- + +## Implementation Notes + +### Data Structure Requirements +- Aggregate panel (outcome + predictors) for the treated unit and a curated donor pool; **large pre-intervention window**; enough post-periods for the effect to manifest; balanced panel; single (or few) treated units with block timing. + +### Computational Considerations +- Inner weight solve = constrained quadratic optimization over the simplex (Section 3.2 names it as such). +- `V` selection adds an outer loop (nested-MSPE or CV-validation evaluation). Permutation inference re-runs estimation `J` times (one pseudo-treated donor each). + +### Tuning Parameters + +| Parameter | Type | Default guidance (this paper) | Selection Method | +|-----------|------|-------------------------------|------------------| +| `V` (predictor importance) | nonneg vector | data-driven | inverse-variance; nested pre-period MSPE; or CV (`t0=T0/2`); optional ridge `γΣv_h²` for non-uniqueness | +| Predictors `X` | matrix | lagged outcomes + covariates | include real covariates; outcome summaries increase sparsity; data-driven via train/validation | +| Donor pool | set | curated, similar units | exclude treated-like / shocked / dissimilar units; limit size (overfitting) | +| Pre/post window | indices | as long a pre-window as structurally stable | backdate under anticipation; up-weight recent predictors under break risk | + +### Relation to Existing diff-diff Estimators +- Same `SyntheticControl` estimator as the 2010/2015 reviews. 
This paper is the source for the **assumptions/warnings** and **edge-case** REGISTRY content and for the **formalized CV `V`-selection** (`t0=T0/2`) and the **CI-by-test-inversion / one-sided** inference refinements (relevant to PR-2/PR-3). +- It positions **synthetic DiD (Arkhangelsky et al.)** — already implemented as `SyntheticDiD` — as "an SC that additionally weights pre-intervention time periods," confirming classic SCM is the unit-weights-only special case. +- It positions **conformal inference (CWZ)** as the sampling-based complement to permutation inference — the basis for PR-3 (authoritative details in the CWZ review). + +--- + +## Gaps and Uncertainties + +- **No new estimator/algorithm numerics.** The inner solver, `V`-search routine, and starting values are not specified (referenced to AG 2003 / ADH 2010 and the `Synth` software). The CV `t0=T0/2` split is explicitly "heuristic." +- **CV-weight non-uniqueness** is acknowledged (footnote 7) with a ridge remedy `γΣv_h²` but no default `γ`; an implementation must pick a deterministic tie-break. +- **Surveyed inference methods are citation-level here.** The conformal recipe (CWZ), predictive intervals (Cattaneo et al.), and bias-corrected CIs (CWZ 2019b) are summarized but their exact algorithms/assumptions must come from the primary papers (CWZ 2021 is reviewed separately; the others are out of scope). +- **Multiple treated units, penalized SC, bias correction, matrix completion** (Section 8) are surveyed (Eqs. 13–18 transcribed as Abadie presents them) but are **deferred** (augmented SC) or out of scope; not part of the classic-SCM implementation. +- **Effect-size/volatility de-noising** (singular-value thresholding, Amjad-Shah-Shen) is mentioned as mitigation but not prescribed — a judgment call left to the analyst. 
+- **"Extreme treated unit" / convex-hull check** is qualitative ("falls close to the convex hull") — a concrete numerical hull-distance or fit threshold for a warning must be chosen at implementation. diff --git a/docs/methodology/papers/abadie-diamond-hainmueller-2010-review.md b/docs/methodology/papers/abadie-diamond-hainmueller-2010-review.md new file mode 100644 index 00000000..52fb8563 --- /dev/null +++ b/docs/methodology/papers/abadie-diamond-hainmueller-2010-review.md @@ -0,0 +1,169 @@ +# Paper Review: Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program + +**Authors:** Alberto Abadie, Alexis Diamond, Jens Hainmueller +**Citation:** Abadie, A., Diamond, A., & Hainmueller, J. (2010). "Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program." *Journal of the American Statistical Association*, 105(490), 493–505. +**PDF reviewed:** https://doi.org/10.1198/jasa.2009.ap08746 (published JASA version) +**Review date:** 2026-05-29 + +> Scope note: this review captures **only** what is in Abadie, Diamond & Hainmueller (2010). The method originates in Abadie & Gardeazabal (2003) — cited here for the V-selection numerics (App. B of that paper) — and the leave-one-out / detailed in-time-placebo diagnostics are developed in ADH (2015); both are reviewed separately. Nothing here is sourced from outside this paper. + +--- + +## Methodology Registry Entry + +*Formatted to match docs/methodology/REGISTRY.md structure. Copy the `## SyntheticControl` section into the "Counterfactual / Synthetic Estimators" category.* + +## SyntheticControl + +**Primary source:** Abadie, A., Diamond, A., & Hainmueller, J. (2010). "Synthetic Control Methods for Comparative Case Studies." *JASA*, 105(490), 493–505. 
https://doi.org/10.1198/jasa.2009.ap08746 + +**Key implementation requirements:** + +*Assumption checks / warnings:* +- **One treated unit, block assignment.** Region `i = 1` is uninterruptedly exposed to the intervention after period `T0` (`1 ≤ T0 < T`); regions `i = 2, …, J+1` are the never-exposed "donor pool" (Section 2.2). No staggered adoption. +- **No interference / no spillovers (SUTVA across units).** Untreated regions' outcomes are unaffected by the treated unit's intervention (Section 2.2, citing Rosenbaum 2007). Section 3 discusses violations: e.g., cigarette smuggling into California or tobacco-industry ad-fund diversion to control states would contaminate the donor pool (journal p. 501). +- **No anticipation.** `Y_it^I = Y_it^N` for all `t ≤ T0`. If the outcome reacts before formal implementation (anticipation), `T0` must be redefined to the first period the outcome may respond (journal p. 494). +- **Good pre-treatment fit is required, not assumed.** Equation (2) can hold exactly only if the treated unit's pre-period characteristics lie in the **convex hull** of the donors'. "In some instances, the fit may be poor and then we would not recommend using a synthetic control" (journal p. 495). The method *forces the researcher to demonstrate* fit (Table 1 predictor balance; Figure 2 pre-period trajectory). +- **Interpolation bias.** Even with good fit, bias can be large if the linear factor model does not hold across regions with very different characteristics; mitigate by restricting the donor pool to similar units, or by adding penalty terms to the weight objective (journal pp. 495–496). +- **Donor-pool curation.** Exclude units exposed to the same/similar intervention or to large confounding shocks (in the application: states with their own tobacco programs or large cigarette-tax hikes, and DC, are dropped; journal pp. 498–499). + +*Notation (Section 2.2):* +- `J+1` regions, `i = 1` treated; `t = 1, …, T`; `T0` = number of pre-intervention periods. 
+- `Y_it^N` = outcome without intervention; `Y_it^I` = outcome with intervention; observed `Y_it`. +- `Z_i` = `(r×1)` observed covariates **not affected by** the intervention; `δ_t` common factor; `θ_t` `(1×r)`; `λ_t` `(1×F)` unobserved common factors; `μ_i` `(F×1)` factor loadings; `ε_it` transitory shocks (mean zero). + +*Observed outcome and target (Section 2.2, journal p. 495):* + + Y_it = Y_it^N + α_it · D_it, D_it = 1{ i = 1 and t > T0 } + + α_1t = Y_1t^I − Y_1t^N = Y_1t − Y_1t^N, for t > T0 ← target (per-period treatment effect on the treated unit) + +`Y_1t` is observed, so estimating `α_1t` reduces to imputing the counterfactual `Y_1t^N`. + +*Model justifying the estimator (Equation 1):* + + Y_it^N = δ_t + θ_t·Z_i + λ_t·μ_i + ε_it + +This **generalizes the two-way fixed-effects / DiD model**: if `λ_t` is constant across `t`, Equation (1) collapses to the standard DiD (unobserved confounder effects constant in time, removed by differencing). The factor model lets the effect of unobserved confounders `μ_i` vary over time (journal p. 495). + +*Synthetic control and the weight vector `W`:* + + W = (w_2, …, w_{J+1})', w_j ≥ 0, Σ_{j=2}^{J+1} w_j = 1 (simplex constraint) + + synthetic-control outcome at t: Σ_{j=2}^{J+1} w_j · Y_jt + +*Identifying weights (Equation 2):* there exist `(w_2*, …, w_{J+1}*)` such that + + Σ_{j=2}^{J+1} w_j*·Y_jt = Y_1t for t = 1, …, T0, and Σ_{j=2}^{J+1} w_j*·Z_j = Z_1 + +*Estimator (as implemented):* + + α̂_1t = Y_1t − Σ_{j=2}^{J+1} w_j*·Y_jt, for t ∈ {T0+1, …, T} + +*Bias control (Equation 3, proved in Appendix B):* when `Σ_{t=1}^{T0} λ_t'λ_t` is nonsingular, + + Y_1t^N − Σ_j w_j*·Y_jt = Σ_j w_j* Σ_{s=1}^{T0} λ_t (Σ_{n=1}^{T0} λ_n'λ_n)^{-1} λ_s'(ε_js − ε_1s) − Σ_j w_j*(ε_jt − ε_1t) + +Appendix B (journal pp. 503–505) bounds `E|R_1t|` via Cauchy–Schwarz + Hölder + Rosenthal inequalities and shows **the bias → 0 as the number of pre-treatment periods `T0` grows** relative to the scale of transitory shocks. 
Practical implication: a long, well-fit pre-period is the key requirement. + +*Why fitting pre-period outcomes suffices (Equation 4):* if `Σ w_j* Z_j = Z_1` **and** `Σ w_j* μ_j = μ_1`, the estimator is unbiased. `μ_j` is unobserved, but fitting `Z_1` and a long set of pre-period outcomes `Y_11, …, Y_1T0` implies `Σ w_j* μ_j ≈ μ_1`, so (4) holds approximately (journal p. 495). + +*Single-pretreatment-period case (Equations 5–6):* under an autoregressive model with time-varying coefficients (Eq. 5), if weights satisfy `Σ w_j* Y_jT0 = Y_1T0` and `Σ w_j* Z_jT0 = Z_1T0` (Eq. 6), the estimator is unbiased even with one pre-period (Appendix B). + +*Weight estimation — the nested ("V-matrix") optimization (Section 2.3, journal p. 496):* + +Predictor vectors stack covariates and `M` linear combinations of pre-period outcomes. With `K_m = (k_1, …, k_{T0})'` defining `Ȳ_i^{K_m} = Σ_{s=1}^{T0} k_s·Y_is`: + + X_1 = (Z_1', Ȳ_1^{K_1}, …, Ȳ_1^{K_M})' (k×1), k = r + M + X_0 = (k×J) matrix, jth column (Z_j', Ȳ_j^{K_1}, …, Ȳ_j^{K_M})' + +**Inner problem (W given V):** + + W*(V) = argmin_W (X_1 − X_0·W)' V (X_1 − X_0·W) + s.t. w_j ≥ 0 (j = 2,…,J+1), Σ w_j = 1 + +where `V` is a `(k×k)` symmetric positive-semidefinite matrix weighting the predictors. The discrepancy norm is `‖X_1 − X_0 W‖_V = sqrt((X_1−X_0W)'V(X_1−X_0W))`. + +**Outer problem (choosing V):** "Although our inferential procedures are valid for any choice of `V`, the choice of `V` influences the mean square error." `V` may be chosen subjectively or **data-driven**. The paper's data-driven choice (journal p. 496): +- *(method used in the application)* Choose `V` among **positive-definite diagonal** matrices so that the synthetic control minimizes the **mean squared prediction error of the outcome over the pre-intervention periods** — i.e. `V* = argmin_V Σ_{t≤T0} (Y_1t − Σ_j w_j*(V)·Y_jt)²`. The numerical details are referenced to Abadie & Gardeazabal (2003, App. B). 
+- *(alternative)* **Cross-validation:** split the pre-period into a training period and a validation period; compute `W*(V)` on training data, then choose `V` to minimize the MSPE produced by `W*(V)` over the validation period. + +One obvious predictor choice is to use **all** pre-period outcomes `Y_i1, …, Y_iT0` as the `Ȳ_i^{K_m}` (journal p. 496). + +*Standard errors / inference (Section 2.4 + Section 3.4):* +- **No analytical standard error.** Large-sample inference is "not well suited" to comparative case studies with few units. The paper proposes **exact, permutation-style ("placebo") inference** valid regardless of the number of comparison units, periods, or aggregation level (journal pp. 496–497). +- **In-space placebo (permutation) test (journal pp. 501–503):** iteratively reassign the intervention to *each* donor unit, re-estimate the synthetic control, and obtain that unit's post-period gap. This yields a distribution of placebo gaps; the treated unit's effect is "significant" if its gap is unusually large relative to the placebo distribution. +- **RMSPE-ratio test statistic (preferred; journal p. 503):** for each unit compute + + ratio = (post-period MSPE) / (pre-period MSPE), + where MSPE over a window = average of squared gaps (Y_unit,t − synthetic_unit,t)² over that window. + + Rank the treated unit's ratio among all `J+1` units, with rank 1 assigned to the largest ratio (the MSPE ratio and its square root, the RMSPE ratio, give the same ranking). The exact permutation p-value is `rank / (J+1)`. For California the post-period MSPE is about 130 times the pre-period MSPE, the largest ratio of all 39 units, giving **p = 1/39 = 0.026** (the only formal "significance" number in the paper). The ratio normalizes by pre-period fit, which **obviates choosing a pre-fit cutoff** for excluding ill-fitting placebos. +- **Pre-fit filtering (robustness display, not the primary test; journal p. 502):** alternative placebo plots discard donors whose pre-period MSPE exceeds 20× / 5× / 2× the treated unit's (Figures 5–7). The RMSPE-ratio test makes this filtering unnecessary. 
+- **In-time placebo:** mentioned as a related falsification idea — set the intervention date at random in the pre-period (citing Bertrand-Duflo-Mullainathan 2004; Heckman-Hotz 1989; journal p. 497) — but **no detailed procedure is given in this paper** (see the ADH 2015 review). + +*Edge cases:* +- Treated unit's pre-period vector far from the donor convex hull → poor fit → **do not use SCM** (journal p. 495). +- Highly nonlinear outcome–predictor relationship with wide predictor support → severe interpolation bias → restrict donor pool, or add penalty terms to `‖X_1 − X_0W‖` (journal p. 496). +- A predictor with near-zero `V` diagonal element has little predictive power for the pre-period outcome (in the application, log GDP per capita got a very small weight; journal p. 500). +- Placebo unit with poor pre-period fit produces a large post-period gap for the wrong reason → handle via the **RMSPE ratio** (normalizes by pre-fit) rather than raw gap (journal p. 502). +- Donor that itself experienced a similar intervention/shock → exclude from donor pool (journal pp. 498–499). + +*Algorithm (reconstructed from Sections 2.3–2.4 and Section 3):* +1. Build the donor pool (curate out contaminated/treated-like units) and the predictor set: covariates `Z` plus `M` linear combinations of pre-period outcomes (commonly some pre-period outcome averages and/or all pre-period outcomes). +2. Form `X_1` (treated predictors) and `X_0` (donor predictors). +3. **Inner:** for a candidate `V`, solve `W*(V) = argmin (X_1 − X_0W)'V(X_1 − X_0W)` over the unit simplex. +4. **Outer:** choose diagonal PSD `V*` minimizing pre-period outcome MSPE of `W*(V)` (or via train/validation cross-validation). +5. Counterfactual `Ŷ_1t^N = Σ_j w_j*(V*)·Y_jt`; effect path `α̂_1t = Y_1t − Ŷ_1t^N` for `t > T0`. +6. 
**Inference:** repeat steps 2–5 treating each donor as the pseudo-treated unit; compute each unit's post/pre MSPE ratio; the treated unit's permutation p-value is its rank among all `J+1` ratios divided by `J+1`. + +**Reference implementation(s):** +- The authors' `Synth` package for **MATLAB, R, and Stata** (companion software, journal pp. 493–494). (R: `Synth::synth()`.) + +**Requirements checklist:** +- [ ] Weights on the unit simplex (`w_j ≥ 0`, `Σ w_j = 1`); one treated unit, block assignment. +- [ ] Predictor matrix `X_1`/`X_0` = covariates `Z` + `M` linear combinations of pre-period outcomes (support "all pre-period outcomes" as a choice). +- [ ] Inner solve `W*(V)` = simplex-constrained weighted least squares with predictor-importance matrix `V` (diagonal PSD). +- [ ] Outer `V` selection: pre-period-MSPE minimization (default) and/or train/validation cross-validation; allow user-supplied `V`. +- [ ] Effect = gap path `Y_1t − Σ w_j* Y_jt` for post periods; report pre-period RMSPE (fit diagnostic) and predictor-balance table. +- [ ] In-space placebo permutation inference + post/pre RMSPE-ratio p-value (`rank/(J+1)`); pre-fit-filtered placebo plots as robustness. +- [ ] No analytical SE — inference is permutation/placebo only. + +--- + +## Implementation Notes + +### Data Structure Requirements +- Balanced panel: outcome `Y_it` for all units `i = 1, …, J+1` over all periods `t = 1, …, T`; exactly **one** treated unit with **block** (absorbing, common-date) treatment after `T0`. +- Time-invariant covariates `Z_i` (not affected by the intervention) and the pre-period outcome series (for the `Ȳ^{K}` predictors). +- Donor pool explicitly curated (analyst-supplied exclusions). + +### Computational Considerations +- Two-level optimization: an inner simplex-constrained quadratic program `W*(V)` nested inside an outer search over diagonal `V`. 
The outer objective is non-smooth in `V` (the inner argmin has kinks where the simplex active set changes); the paper references AG (2003, App. B) for numerics and does not specify the optimizer. +- Inference cost: the in-space placebo loop re-runs the full nested estimation once per donor (`J` extra fits). +- Aggregate-level data suffice; no micro-data needed (a stated advantage, journal p. 497). + +### Tuning Parameters + +| Parameter | Type | Default | Selection Method | +|-----------|------|---------|-----------------| +| Donor pool | set of units | all eligible controls | Analyst curation (exclude treated-like / shocked units) | +| Predictors `X` (covariates `Z` + `Ȳ^{K_m}`) | matrix | covariates + pre-period outcome summaries; or all pre-period outcomes | Analyst choice of predictive variables | +| `V` (predictor-importance matrix) | diagonal PSD `k×k` | data-driven | Minimize pre-period outcome MSPE of `W*(V)`; or train/validation cross-validation; or user-supplied | +| `T0` (pre/post split) | period index | intervention date | Set to first period outcome may react (anticipation guard) | + +### Relation to Existing diff-diff Estimators +- **`SyntheticDiD` (Arkhangelsky et al. 2021)** is the closest existing estimator: it uses unit *and* time weights with ridge regularization and a double-difference estimator; classic SCM uses **only donor (unit) weights** and a level-matching estimator (no time weights, no ridge). Equation (1) here shows classic SCM **generalizes DiD** (recovered when `λ_t` is constant). +- The inner simplex solve is the same shape as the Frank-Wolfe weight problem already in `diff_diff/utils.py` (`_sc_weight_fw`) once `V^½` is folded into the predictor matrix — but classic SCM adds the **outer `V` search**, which SyntheticDiD has no analog for. 
+- The placebo/permutation inference resembles SyntheticDiD's `variance_method="placebo"` in spirit, but the **post/pre RMSPE-ratio statistic** and the `rank/(J+1)` p-value are specific to this paper. + +--- + +## Gaps and Uncertainties + +- **V-optimization numerics are not in this paper.** Section 2.3 (journal p. 496) describes the outer objective (minimize pre-period outcome MSPE over diagonal PSD `V`) and a CV alternative, but defers the numerical details to **Abadie & Gardeazabal (2003), Appendix B** and the `Synth` software. The exact optimizer, starting values, and any normalization of `V` must be pinned from the `Synth` source / AG 2003 at implementation time, not from this paper. +- **Outer-objective norm.** The inner discrepancy uses `‖·‖_V`; the *outer* `V`-selection minimizes the plain (unweighted) pre-period outcome MSPE. The paper is explicit that inferential validity holds for *any* `V`, so the outer choice is an efficiency device, not an identification requirement (journal p. 496). +- **p-value granularity.** The permutation p-value is `rank/(J+1)`; with a small donor pool the smallest attainable p-value is `1/(J+1)` (here `1/39 = 0.026`). No confidence intervals are produced (a separate inference layer — conformal — is reviewed via CWZ 2021). +- **In-time placebos** are mentioned (journal p. 497) but not proceduralized here; the leave-one-out donor-robustness diagnostic is **absent** from this paper (both belong to the ADH 2015 review). +- **Cross-validation `V`** is described but **not** the method used in the Prop 99 application (which minimized pre-period MSPE directly); the paper does not give a default train/validation split. +- **Penalty-augmented weights** ("`‖X_1 − X_0W‖` plus penalty terms") are mentioned for interpolation-bias control (journal p. 496) but not formalized into a specific penalty (this anticipates later penalized-SC work, out of scope here). 
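The `rank/(J+1)` p-value and its granularity floor of `1/(J+1)`, noted in the list above, can be illustrated with a short sketch. It assumes NumPy and assumes the placebo gaps (each unit's outcome minus its own synthetic control) have already been computed; the function name `mspe_ratio_pvalue` and the toy gap matrix are hypothetical.

```python
import numpy as np


def mspe_ratio_pvalue(gaps, T0):
    """Post/pre MSPE ratios and the permutation p-value rank / (J + 1).

    gaps: array of shape (T, J + 1); column 0 is the treated unit and each
    column holds Y_jt minus the synthetic control built for unit j.
    """
    pre_mspe = np.mean(gaps[:T0] ** 2, axis=0)
    post_mspe = np.mean(gaps[T0:] ** 2, axis=0)
    ratios = post_mspe / pre_mspe
    # Rank 1 means the treated unit has the largest ratio.
    rank = int(np.sum(ratios >= ratios[0]))
    return ratios, rank / ratios.size


# Toy gaps: T = 6 periods, T0 = 4, one treated unit plus J = 3 donors.
gaps = np.array([
    [0.1, 0.5, -0.5, 0.5],
    [-0.1, -0.5, 0.5, -0.5],
    [0.1, 0.5, -0.5, 0.5],
    [-0.1, -0.5, 0.5, -0.5],
    [2.0, 0.5, -0.5, 0.5],    # post-period rows start here
    [2.0, -0.5, 0.5, -0.5],
])
ratios, p_value = mspe_ratio_pvalue(gaps, T0=4)
print(p_value)  # 0.25, the smallest value attainable with J + 1 = 4 units
```

With only four units the treated unit cannot do better than `p = 0.25` even though its ratio dwarfs every placebo's, which is the granularity limit in practice.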
diff --git a/docs/methodology/papers/abadie-diamond-hainmueller-2015-review.md b/docs/methodology/papers/abadie-diamond-hainmueller-2015-review.md new file mode 100644 index 00000000..12127999 --- /dev/null +++ b/docs/methodology/papers/abadie-diamond-hainmueller-2015-review.md @@ -0,0 +1,126 @@ +# Paper Review: Comparative Politics and the Synthetic Control Method + +**Authors:** Alberto Abadie, Alexis Diamond, Jens Hainmueller +**Citation:** Abadie, A., Diamond, A., & Hainmueller, J. (2015). "Comparative Politics and the Synthetic Control Method." *American Journal of Political Science*, 59(2), 495–510. +**PDF reviewed:** https://doi.org/10.1111/ajps.12116 (published AJPS version) +**Review date:** 2026-05-29 + +> Scope note: this review captures only ADH (2015). The synthetic-control *estimator* itself (weights, `V`-matrix) is stated here but **attributed by the paper to Abadie & Gardeazabal (2003) and ADH (2010)**. This paper's own contributions are (a) the **diagnostics / robustness layer** — in-time placebos, leave-one-out donor removal, and the post/pre-RMSPE-ratio permutation test; (b) **out-of-sample cross-validation** for choosing the predictor-importance weights `V`; and (c) the **regression-vs-synthetic-control extrapolation** result. Results the paper merely *cites* to 2003/2010 are flagged as such; nothing here is sourced from outside this paper. + +--- + +## Methodology Registry Entry + +*Formatted to match docs/methodology/REGISTRY.md. This complements the ADH-2010 entry for `## SyntheticControl`; here the focus is the diagnostics layer, CV-based `V` selection, and the extrapolation result.* + +## SyntheticControl + +**Primary source (this document):** Abadie, A., Diamond, A., & Hainmueller, J. (2015). "Comparative Politics and the Synthetic Control Method." *AJPS*, 59(2), 495–510. https://doi.org/10.1111/ajps.12116 + +**Key implementation requirements:** + +*Notation (Section "Synthetic Control Method", journal pp. 
497–498):* +- `J+1` units; `j=1` treated; donors `j=2,…,J+1` (the "donor pool", `J` units). Balanced panel `t=1,…,T`; `T0` pre-periods, `T1` post-periods, `T=T0+T1`. No effect in `1,…,T0`. +- `Y_1` = `(T1×1)` post-period outcomes of the treated unit; `Y_0` = `(T1×J)` post-period outcomes of donors. +- `X_1` = `(k×1)` pre-intervention characteristics of the treated unit (**may include pre-intervention outcome values**); `X_0` = `(k×J)` for donors; row `m` = predictor `m`. Predictors are "not affected by the intervention." + +*Weights and the simplex constraint (journal p. 497):* + + W = (w_2, …, w_{J+1})', 0 ≤ w_j ≤ 1, Σ_{j=2}^{J+1} w_j = 1 + +*Weight optimization (Equation 1; attributed to AG 2003 / ADH 2010):* + + W* = argmin_W Σ_{m=1}^{k} v_m · (X_{1m} − X_{0m}·W)^2 s.t. simplex + +where `v_m ≥ 0` reflects the predictive importance of predictor `m`. Seminorm form (footnote 5): `‖u‖ = sqrt(u'Vu)` for PSD `V`; with `V` diagonal `= diag(v_1,…,v_k)`, minimizing `‖X_1 − X_0 W‖` is equivalent to minimizing Equation (1) (same minimizer). `W*` is **invariant to the scale of `(v_1,…,v_k)`**, so `V`'s diagonal can be normalized to sum to one. + +*Estimator (journal p. 498):* + + τ̂_{1t} = Y_{1t} − Σ_{j=2}^{J+1} w_j*·Y_{jt} (post periods); vector form Y_1 − Y_0·W* + +*`V` selection by out-of-sample cross-validation (this paper's method; journal pp. 501–502):* +1. Split the pre-period into a **training** period and a **validation** period (application: training 1971–1980, validation 1981–1990). +2. For each candidate `V`, compute weights `W̃(V)` using **training-period** predictor data. +3. Choose `V*` to minimize the **RMSPE over the validation period**, equivalently the validation sum of squared prediction errors `Σ_{t∈valid} (Y_{1t} − Σ_j w̃_j(V)·Y_{jt})²` (the square root and the averaging factor are monotone, so the minimizer is unchanged). +4. Re-estimate `W* = W(V*)` using predictor data from the **last part of the pre-period** (validation-window predictors).
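
The four cross-validation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the `Synth` implementation: the paper does not specify a solver, so the inner simplex problem is solved here by plain projected gradient descent, and the outer search simply scans a user-supplied list of candidate diagonals. All function names (`project_simplex`, `sc_weights`, `cv_select_v`) are hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def sc_weights(x1, x0, v, n_iter=2000):
    """Minimise sum_m v_m (x1_m - x0_m W)^2 over the simplex.

    x1 : (k,) treated predictors, x0 : (k, J) donor predictors,
    v  : (k,) non-negative predictor weights (diagonal of V).
    Projected gradient is an assumed stand-in for the unspecified solver.
    """
    a = np.sqrt(v)[:, None] * x0            # fold V^(1/2) into the predictors
    b = np.sqrt(v) * x1
    step = 1.0 / (np.linalg.norm(a, 2) ** 2 + 1e-12)   # 1 / Lipschitz constant
    w = np.full(x0.shape[1], 1.0 / x0.shape[1])        # start at equal weights
    for _ in range(n_iter):
        w = project_simplex(w - step * (a.T @ (a @ w - b)))
    return w

def cv_select_v(x1_train, x0_train, y1_valid, y0_valid, v_candidates):
    """Steps 2-3: fit on training predictors, score on validation outcomes.

    y1_valid : (T_valid,), y0_valid : (T_valid, J).
    Returns (validation MSPE, normalised V diagonal, training-period weights).
    """
    best = None
    for v in v_candidates:
        v = np.asarray(v, dtype=float)
        v = v / v.sum()                      # W* is invariant to the scale of V
        w = sc_weights(x1_train, x0_train, v)
        mspe = float(np.mean((y1_valid - y0_valid @ w) ** 2))
        if best is None or mspe < best[0]:
            best = (mspe, v, w)
    return best
```

Step 4 would then call `sc_weights` once more with the selected diagonal and the validation-window predictors. A grid of candidates is only one possible outer search; a continuous optimizer over `V` would slot into `cv_select_v` in its place.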
+- **Footnote 17 (deviation note):** AG 2003 / ADH 2010 instead choose `V` so the synthetic control best fits the **pre-intervention outcome path**; for the German example this produces "almost identical" results to the CV method used here. + +*Standard errors / inference (journal pp. 499–505):* +- **No standard errors, confidence intervals, or posterior distributions** (explicit, journal p. 500). Inference is restricted to "whether the estimated effect of the actual intervention is large relative to the distribution of placebo effects." +- **In-space placebo / permutation test:** apply SCM treating *each* donor as the pseudo-treated unit; build the distribution of placebo effects. p-value = **fraction of units whose (placebo) effect ≥ the treated unit's effect**; reduces to classical randomization inference under randomization (Rosenbaum 2005). +- **RMSPE-ratio test statistic (preferred):** with pre-period RMSPE (footnote 16) + + RMSPE = ( (1/T0) · Σ_{t=1}^{T0} ( Y_{1t} − Σ_{j=2}^{J+1} w_j*·Y_{jt} )^2 )^{1/2} + + compute `ratio = post-period RMSPE / pre-period RMSPE` for every unit; rank the treated unit's ratio. The ratio "avoid[s] having to discard countries with pre-period values that cannot be approximated" (footnote 19, crediting ADH 2010). Application: West Germany's ratio is the largest of 17 units → permutation p-value `1/17 ≈ 0.059`. +- **In-time placebo (this paper's diagnostic):** reassign the intervention to an earlier date in the pre-period, re-estimate the synthetic control with the **same CV technique and predictors lagged accordingly**, and check whether a spurious effect appears. The application reassigns reunification to **1975** (~15 yrs before the actual 1990 date — **Figure 4 is titled "Placebo Reunification 1975"**) and finds no perceptible placebo gap; **footnote 18** reports the same exercise reassigned to **1970 and 1980** ("similar to the results for 1975"). 
+- **Leave-one-out / iterative donor removal (this paper's diagnostic):** re-estimate the synthetic control **omitting, one at a time, each donor that received positive weight**; overlay the leave-one-out counterfactual trajectories to gauge how much results depend on any single donor. + +*Regression vs. synthetic control — extrapolation (journal pp. 498–499, 503):* +The regression-based counterfactual `B̂'X_1` with `B̂=(X_0 X_0')^{-1}X_0 Y_0'` equals `Y_0·W^{reg}` with + + W^{reg} = X_0'(X_0 X_0')^{-1} X_1 + +If a constant is included, `ι'W^{reg}=1` — i.e., regression is *also* a weighting estimator summing to one, but with **unrestricted weights** (can be negative or >1), so it **extrapolates** outside the donor convex hull. Detectable by computing `W^{reg}` and observing weights outside `[0,1]`. (In the application, regression assigned negative weights to Greece/Italy/Portugal/Spain.) + +*Edge cases / practical guidance:* +- **Convex hull / no extrapolation:** the simplex constraint keeps the synthetic control inside the donors' convex hull (no model-dependent extrapolation; King & Zeng 2006 cited, journal p. 496). +- **Interpolation bias + non-uniqueness (footnote 10, journal p. 499):** even with no extrapolation, interpolation bias can be severe if donors have very different characteristics → **restrict the donor pool to similar units**; the `‖X_1−X_0W‖` objective can be **augmented with penalty terms** on discrepancies, which *also* help **select among multiple solutions** when `X_1` lies inside the convex hull (the objective then has non-unique minimizers). +- **Donor-pool curation (journal p. 500):** (1) exclude units affected by the same/similar intervention; (2) exclude units with large idiosyncratic shocks not shared by the treated unit; (3) restrict to units with characteristics similar to the treated unit (interpolation-bias control). +- **Overfitting (journal p. 
500):** a large donor pool can artificially match the treated unit by combining idiosyncratic variation → restrict donor-pool size; motivates the CV `V`-selection. +- **Sparsity vs. fit (journal pp. 506–507):** synthetic controls are typically sparse; reducing to `l` contributing units (`l=4,3,2`) degrades fit "moderately"; `l=1` (single match) is much worse and "comes close to a difference-in-differences design" (footnote 23, citing ADH 2010 for the SC↔DiD relationship). +- **No interference among contributing donors:** spillovers onto donors *with positive weight* bias the estimate; spillovers onto zero-weight donors do not affect estimates (journal p. 504). + +*Algorithm (in-time placebo + leave-one-out + RMSPE-ratio, reconstructed):* +1. Estimate the baseline synthetic control (weights `W*`, `V` via CV); record pre/post RMSPE and the gap path. +2. **In-space placebo:** for each donor `j`, treat `j` as pseudo-treated (donors = all other `J` units), re-estimate, record `r_j = postRMSPE_j/preRMSPE_j`. p-value = `(#{j ∈ {1,…,J+1}: r_j ≥ r_1})/(J+1)`; the count runs over all `J+1` units including the treated unit itself, so the smallest attainable value is `1/(J+1)`. +3. **In-time placebo:** re-estimate with the intervention date moved into the pre-period (lag predictors accordingly); confirm no spurious gap. +4. **Leave-one-out:** for each donor with `w_j*>0`, drop it, re-estimate, overlay trajectories. + +**Reference implementation(s):** +- Authors' `Synth` package for **R, MATLAB, and Stata** (footnote 7). R: `Synth` (CRAN), documented in Abadie, Diamond & Hainmueller (2011), *J. Stat. Software* 42(13). Stata: `ssc install synth`. + +**Requirements checklist (this paper's additions):** +- [ ] Out-of-sample cross-validation option for `V` (training/validation split), in addition to the pre-period-outcome-fit method. +- [ ] In-time placebo (date reassignment with predictor lagging). +- [ ] Leave-one-out donor robustness (drop each positively-weighted donor). +- [ ] Post/pre-RMSPE-ratio permutation p-value `(#{r_j ≥ r_1})/(J+1)`.
+- [ ] Regression-weight (`W^{reg}`) extrapolation diagnostic (flag weights outside `[0,1]`). +- [ ] Donor-pool curation + size limit (overfitting guard); optional penalty terms for interpolation bias / tie-breaking. + +--- + +## Implementation Notes + +### Data Structure Requirements +- Balanced panel, single treated unit, block treatment after `T0`; donor pool curated per the three rules above. Predictors `X` may mix covariates and pre-period outcomes (or pre-period outcome summaries). + +### Computational Considerations +- The inner weight solve (Equation 1) is **constrained quadratic optimization** over the simplex (the paper calls it that; it does not specify the solver here). +- CV `V`-selection adds an outer search over `V` evaluated on a validation window. +- Inference loops: in-space placebo re-fits once per donor (`J` fits); leave-one-out re-fits once per positively-weighted donor; in-time placebo is a handful of re-fits at alternative dates. Sparse-SC subset search (`l Scope note: this paper provides an **inference layer** (valid p-values and confidence intervals) for synthetic-control and related counterfactual estimators — it is the basis for the planned conformal-inference path (PR-3) on top of the classic `SyntheticControl` estimator. It is **not** a new point estimator. Everything below is sourced from this paper; the canonical SC estimator it nests is the constrained-least-squares form (its own §2.3), which differs from the classic ADH V-matrix estimator in ways flagged under Gaps. + +--- + +## Methodology Registry Entry + +*Formatted to match docs/methodology/REGISTRY.md. This documents the conformal-inference layer for `## SyntheticControl`.* + +## SyntheticControl — Conformal Inference (Chernozhukov-Wüthrich-Zhu) + +**Primary source:** Chernozhukov, V., Wüthrich, K., & Zhu, Y. (2021). "An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls." *JASA*, 116(536), 1849–1864. 
https://doi.org/10.1080/01621459.2021.1920957 + +**Key implementation requirements:** + +*Setting & notation (Section 1, 2.1):* +- One treated unit `j=1`, observed for `T0` pre-intervention periods and `T_* = T − T0` post periods (`T_*` typically **short** relative to `T0`). `J ≥ 1` control units `j=2,…,J+1`, observed all `T` periods; optional covariates `X_jt`. (Multiple treated units → Appendix A.2.) +- Potential outcomes `Y^I_{1t}`, `Y^N_{1t}`; policy effect `θ_t = Y^I_{1t} − Y^N_{1t}`. Control outcomes equal their no-intervention values: `Y_jt = Y^N_jt`, `j≥2`. + +*Counterfactual Model (Assumption 1, "CMF" — the fundamental identifying assumption):* + + Y^N_{1t} = P^N_t + u_t + Y^I_{1t} = P^N_t + θ_t + u_t E(u_t) = 0, t = 1,…,T + +where `{P^N_t}` is a **mean-unbiased proxy** for the counterfactual (`E[P^N_t] = E[Y^N_{1t}]`), and `{u_t}` is a centered stationary stochastic process **whose distribution is invariant under the intervention**. Observed: `Y_{1t} = Y^N_{1t} + D_t(Y^I_{1t} − Y^N_{1t})`, `D_t = 1{t>T0}`. No restriction on the dependence between `{P^N_t}` and `{u_t}`. + +*Hypotheses (Section 2.2):* the **sharp null** over the post-period trajectory `θ = (θ_{T0+1},…,θ_T)'`: + + H0: θ = θ0 = (θ0_{T0+1}, …, θ0_T)' (eq. 1) + +It fully pins the counterfactual: under H0, `Y^N_{1t} = Y_{1t} − θ0_t` for `t>T0`. (Per-period `H0: θ_t = θ0_t` is used for CIs; average-effect nulls in Appendix A.1.) + +*Algorithm (the conformal test):* +1. **Build data under the null** `Z(θ0)`: subtract `θ0_t` from the **post-period** treated outcomes (pre-period unchanged); keep controls and covariates. + + Z_t = (Y^N_{1t}, Y_{2t},…,Y_{J+1,t}, X'_{·t})' t ≤ T0 + Z_t = (Y_{1t} − θ0_t, Y_{2t},…,Y_{J+1,t}, X'_{·t})' t > T0 + +2. **Estimate the proxy `P̂^N_t` UNDER THE NULL on ALL `T` periods** (not pre-period only) using any nested estimator (SC, constrained Lasso, DiD, factor/MC, …). *Estimating under the null is essential for good small-sample size and for exact validity.* +3. 
**Residuals:** `û_t = Y^N_{1t} − P̂^N_t`, `t = 1,…,T` (`Y^N_{1t}` here means the null-imputed value). +4. **Test statistic** (high → reject): + + S_q(û) = ( (1/√T_*) · Σ_{t=T0+1}^{T} |û_t|^q )^{1/q} + + `q=1` (S1) is the application default (robust to heavy tails); `q=2` (S2) for permanent effects; `q=∞` for large temporary effects. For an **average** effect, `S(û) = T_*^{-1/2} |Σ_{t>T0} û_t|` (Remark 1). +5. **Permutation p-value (eq. 2):** + + p̂ = 1 − F̂(S(û)), F̂(x) = (1/|Π|) · Σ_{π∈Π} 1{ S(û_π) < x } + + where `û_π = (û_{π(1)},…,û_{π(T)})'`. *Footnote 7:* if the proxy estimator is **invariant to time permutations of the data** (true for SC, constrained Lasso, DiD, factor/PCA, matrix completion — NOT for AR/time-series proxies), then permuting residuals ≡ permuting data, so the proxy is fit **once** and only residuals are permuted. + +*Permutation schemes (`Π`):* (always includes the identity) +- **i.i.d. permutations `Π_all`** — all `T!` permutations of `{1,…,T}`; use when `{u_t}` is i.i.d. (Assumption 2.1). Gives precise p-values / low significance levels; sample randomly (e.g., 10,000 draws) if `T!` is large. +- **Moving-block permutations `Π_→`** — `T` **cyclic shifts** (wrap-around), indexed `j=0,…,T−1`: + + π_j(i) = i + j if i + j ≤ T + π_j(i) = i + j − T otherwise + + use when `{u_t}` is stationary, strongly mixing (Assumption 2.2; ARMA/GARCH). `|Π_→| = T`. +- (i.i.d. block permutations `Π_mb` over a partition into blocks — footnote 6; secondary.) + +*Confidence intervals — Algorithm 1 (pointwise, by test inversion):* + + (i) choose a fine grid Θ̃_t = {θ̃0_{1t},…,θ̃0_{Gt}} of candidate values + (ii) for each θ̃0_t: build Z under H0: θ_t = θ̃0_t, recompute p̂(θ̃0_t) via eq. (2) + (iii) C_{1−α}(t) = { θ̃0_t ∈ Θ̃_t : p̂(θ̃0_t) > α } + +**Cost:** one proxy re-fit per grid value (each `θ̃0_t` defines a different `Z`). One-sided variants and average/aggregate effects (collapse to non-overlapping `T_*`-blocks, effective sample `T/T_*`) in Appendix A.1. 
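
The conformal test and its inversion can be sketched end to end as below. Two simplifications are assumptions of this sketch rather than statements about the paper: the proxy is a DiD-style equal-weight average of the controls plus an intercept (a time-permutation-invariant stand-in for the SC proxy, so it is fit once), and the confidence set is shown only for a single post period (`T_* = 1`), where the trajectory null and the per-period null coincide. All function names are hypothetical.

```python
import numpy as np

def did_proxy_residuals(y1_null, controls):
    """Residuals from a DiD-style proxy fit on ALL T periods under the null.

    Proxy: P_t = mu + mean_j Y_jt, with mu the average gap over all periods.
    """
    control_mean = controls.mean(axis=1)            # (T,)
    mu = np.mean(y1_null - control_mean)
    return y1_null - (mu + control_mean)

def s_q(u_post, q=1):
    """S_q statistic on the post-period block of residuals."""
    u_abs = np.abs(u_post)
    if np.isinf(q):
        return float(u_abs.max())
    return float((np.sum(u_abs ** q) / np.sqrt(len(u_abs))) ** (1.0 / q))

def conformal_pvalue(y1, controls, theta0, t0, q=1):
    """Moving-block (cyclic-shift) permutation p-value for H0: theta = theta0.

    y1 : (T,) treated outcomes, controls : (T, J), t0 : number of pre-periods,
    theta0 : scalar or (T - t0,) hypothesised post-period effects.
    """
    y1_null = np.asarray(y1, dtype=float).copy()
    y1_null[t0:] -= theta0                          # impose the null post-period
    u = did_proxy_residuals(y1_null, np.asarray(controls, dtype=float))
    s_obs = s_q(u[t0:], q)
    # Shift j places u[(i + j) mod T] at position i; j = 0 is the identity.
    s_perm = np.array([s_q(np.roll(u, -j)[t0:], q) for j in range(len(u))])
    return 1.0 - float(np.mean(s_perm < s_obs))

def pointwise_ci(y1, controls, t0, grid, alpha=0.1, q=1):
    """Test inversion over a grid, shown for the T_* = 1 case only."""
    keep = [th for th in grid if conformal_pvalue(y1, controls, th, t0, q) > alpha]
    return np.array(keep)
```

Because each grid value changes the null-imputed data, `pointwise_ci` re-fits the proxy once per candidate, which is the cost noted above. With `T` cyclic shifts the p-value is a multiple of `1/T`, so `alpha` below `1/T` can never reject.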
+ +*Synthetic-control proxy (Section 2.3; eqs. 3–4, 13):* + + P^N_t = Σ_{j=2}^{J+1} w_j Y^N_jt, w ≥ 0, Σ_{j=2}^{J+1} w_j = 1 (3) + (SC) E( u_t Y^N_jt ) = 0, j = 2,…,J+1 (identification) + ŵ = argmin_w Σ_{t=1}^{T} ( Y^N_{1t} − Σ_{j} w_j Y^N_jt )^2 s.t. simplex (4) + +Eq. (4)'s objective sums over **all `t = 1,…,T`**, and **footnote 9** states it explicitly: *"unlike Doudchenko and Imbens (2016), we estimate `w` under the null hypothesis based on all the data"* — i.e. §2.3's canonical SC estimator is fit on the null-imputed `Z(θ0)` over **all `T` periods**, *not* pre-period-only (the classic-ADH convention). Covariates may be folded in (the ADH 2010/2015 versions), and the method also works with modified SC (e.g. augmented SC, Ben-Michael et al. 2018). **Constrained Lasso** (eqs. 5–6, 14) generalizes with an optional intercept `μ` and `‖w‖_1 ≤ K` (natural `K=1`); it is essentially **tuning-free** and **nests DiD** (`w_j = 1/J`) **and canonical SC** (`μ=0, w≥0`). General penalized form (§2.3.3) allows Lasso/Elastic-Net penalties toward a focal `w0`. + +*Validity (two routes; finite-sample bounds → exact as `T0→∞`):* +- **Route (i) — consistency:** Assumption 3 (proxy MSE and pointwise error `→0` under the null) + Assumption 2 (`{u_t}` i.i.d. → `Π_all`, or stationary strongly-mixing → `Π_→`). **Theorem 1:** `|P(p̂ ≤ α) − α| ≤ C(δ̃_T + δ_T + √δ_T + γ_T)`, `δ̃_T = (T_*/T0)^{1/4} log T` (with `T_*` fixed). **Lemma 1** verifies Assumption 3 for constrained-LS/SC/Lasso, allowing `J` large (`log J = o(T^c)`) and requiring **no sparsity**. +- **Route (ii) — stability (misspecification-robust):** Assumption 4 (estimator stable under perturbing a few observations; `β̂` need NOT converge) + Assumption 5 (β-mixing data). **Theorem 2** bounds size with `Π_→`. Verified for constrained Lasso and Ridge (Appendices E–F). 
+- **Exact finite-sample validity under exchangeability** (Appendix D): imposing the null ⇒ permutation-invariant estimator ⇒ exchangeable residuals ⇒ exact size, **model-free** (e.g., DiD differencing makes residuals exchangeable even when data are not). + +*Edge cases / conditions:* +- `T0` must be **large** (drives exactness); `T_*` may be small/fixed. Imposing the null is what rescues small-`T0` size (empirically excellent at `T0=19`). +- **Remark 2:** conditional heteroscedasticity in `{u_t}` is allowed; **unconditional** heteroscedasticity in `{u_t}` is **not** — apply an extra filter to get standardized residuals if suspected. +- If the **shock distribution changes** under the intervention (Assumption 1 invariance fails), the test becomes a test of "no impact whatsoever" (Appendix B, structural-break interpretation); or treat `θ_t` as random → valid **prediction sets** (Appendix C). +- For **time-series (AR) proxies**, residual permutation ≠ data permutation: estimate the AR parameters on residuals and permute the **innovations** (Lemmas 5–7). + +*Placebo diagnostic (Appendix A.3):* an **in-time placebo** — test `H0: θ_{T0−τ+1}=⋯=θ_{T0}=0` on **pre-period data only**, treating the last `τ` pre-periods as a pseudo-post-period; rejection undermines credibility. Useful to compare credibility across DiD vs SC vs constrained Lasso. Cannot test Assumption 1's shock-invariance. + +**Reference implementation(s):** +- Authors state "all computations were performed in R"; **no named package/repo** is cited in the article body. (Abadie 2021 refers to this as CWZ with available software; verify the package name separately if needed.) + +**Requirements checklist (conformal layer for `SyntheticControl`):** +- [ ] Build `Z(θ0)` (subtract post-period `θ0`); estimate proxy **under the null on all `T` periods**. +- [ ] Residuals + `S_q` statistic (`q=1` default; expose `q∈{1,2,∞}`). +- [ ] Permutation p-value (eq. 
2); both `Π_all` (i.i.d.; random sampling fallback) and `Π_→` (moving-block cyclic shifts) schemes. +- [ ] Pointwise CIs via Algorithm 1 (grid + test inversion); one-sided + average-effect (block-collapse) variants. +- [ ] Reuse residual-permutation shortcut only for permutation-invariant proxies (SC/Lasso/DiD); AR path needs innovation permutation. +- [ ] Document `T0`-large / `T_*`-small requirement and the unconditional-heteroscedasticity caveat. + +--- + +## Implementation Notes + +### Data Structure Requirements +- Single treated unit, block timing, balanced time series with **large `T0`**, short `T_*`; control outcomes (and covariates) over all `T`. Multiple treated units handled by per-unit application or cross-unit averaging (A.2). + +### Computational Considerations +- **Fit-once** then permute residuals for SC/Lasso/DiD (footnote 7) → the p-value for a single null is cheap (`|Π|` statistic evaluations, no re-fit). `Π_→` has only `T` elements; `Π_all` is sampled (e.g., 10k draws). +- **CIs are the cost driver:** Algorithm 1 re-estimates the proxy once per grid value per period (each `θ̃0_t` changes `Z`). Warm-starting across the grid and bounding the grid are natural optimizations (this matches the plan's PR-3 "test-inversion cost" risk). + +### Tuning Parameters + +| Parameter | Type | Default | Selection | +|-----------|------|---------|-----------| +| `q` (statistic norm) | int/∞ | `1` (S1) | `1`/`2` permanent effects, `∞` large temporary | +| Permutation set `Π` | scheme | `Π_→` (moving block) if serial dependence; `Π_all` if i.i.d. | by error-dependence assumption | +| `K` (constrained Lasso ℓ1 bound) | float | `1` | tuning-free at `K=1`; SC ⇔ `K=1,w≥0,μ=0` | +| CI grid `Θ̃_t`, `α` | grid, level | fine grid | resolution vs cost | +| `R`, `k` (Theorem 2) | — | — | **theory only — NOT exposed** | + +### Relation to Existing diff-diff Estimators +- This is the **PR-3 inference layer** for `SyntheticControl`. 
It yields what ADH placebo inference cannot: **valid p-values for the effect trajectory and pointwise confidence intervals**. +- Its **constrained-LS SC** (no `V`-matrix, weights on all periods under the null) differs from the classic ADH V-matrix estimator. The conformal *machinery* (build `Z(θ0)` → fit proxy under null → residuals → `S_q` → permute → invert) is estimator-agnostic; layering it onto the classic ADH SC is a design choice (see Gaps). +- The moving-block permutation + test-inversion CI is **greenfield** in `diff_diff` (no existing conformal code); the `1/(n+1)`-style discreteness of permutation p-values resembles `diagnostics.py`'s permutation p-value floor, but the residual-permutation/test-inversion mechanics are new. +- Constrained Lasso **nests DiD and SC**, mirroring the factor-model→DiD reduction noted in the ADH reviews. + +--- + +## Gaps and Uncertainties + +- **Which proxy to pair with the conformal layer.** CWZ's exact/robust guarantees are derived for its **constrained-LS SC weights estimated under the null on all `T` periods**. The classic ADH `SyntheticControl` (PR-1) uses **V-matrix weights on pre-period predictors only**. Applying CWZ conformal inference to the *classic* estimator means either (a) re-estimating ADH weights under the null on all periods per grid value (faithful to CWZ, but changes the estimator's weight semantics and is costly), or (b) treating ADH weights as the proxy and accepting that the exactness theory (Lemma 1) was proven for the constrained-LS form. **This is the central PR-3 design decision** and must be resolved against this paper + the planned estimator API. +- **No named software.** The article cites only "computations in R"; the implementing package name/repo is not in the body (Abadie 2021 references CWZ's software as available — locate separately if a reference implementation is wanted for parity). 
+- **Permutation-invariance requirement.** The fit-once shortcut (footnote 7) holds only for time-permutation-invariant proxies. If `SyntheticControl` ever incorporates time-ordered components, the residual-permutation equivalence breaks (use the AR-style innovation permutation, Lemmas 5–7). +- **Unconditional heteroscedasticity / non-stationarity of `{u_t}`** invalidates the basic procedure (Remark 2) — needs a standardizing filter; no default filter is prescribed. +- **CI grid specification** (`Θ̃_t` range/resolution) and the choice between `S1`/`S2`/`S∞` and `Π_all`/`Π_→` are left to the analyst; defaults must be chosen (application used `S1` + both schemes; `Π_all` sampled at 10k). +- **Appendix-level material** (proofs in Appendix H; stability sufficient conditions in E–F; simulation design in G) was summarized, not transcribed; consult the supplement if the exactness proof or stability conditions are needed verbatim during PR-3.
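
For the `Π_all` scheme sampled at 10k draws, a minimal sketch is given below. It assumes the residuals `u` were already obtained from a time-permutation-invariant proxy fitted under the null, so only residuals are permuted; the function name `sampled_iid_pvalue`, its defaults, and the fixed seed are hypothetical choices made for illustration.

```python
import numpy as np

def sampled_iid_pvalue(u, t0, q=1, n_draws=10_000, seed=0):
    """Permutation p-value using randomly sampled i.i.d. permutations.

    u  : (T,) residuals from a proxy fitted under the null (assumed given).
    t0 : number of pre-intervention periods; the last T - t0 slots are 'post'.
    """
    rng = np.random.default_rng(seed)
    u = np.asarray(u, dtype=float)
    t_star = len(u) - t0

    def stat(res):
        # S_q evaluated on whatever residuals sit in the post-period slots.
        return (np.sum(np.abs(res[t0:]) ** q) / np.sqrt(t_star)) ** (1.0 / q)

    s_obs = stat(u)
    s_perm = [s_obs]                      # the identity is always included
    for _ in range(n_draws):
        s_perm.append(stat(u[rng.permutation(len(u))]))
    return 1.0 - float(np.mean(np.array(s_perm) < s_obs))
```

Sampling trades the exact `T!` enumeration for Monte Carlo error in the p-value; fixing the seed keeps results reproducible across runs, which matters when the p-value feeds a test-inversion grid.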