Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
16 commits
Select commit Hold shift + click to select a range
767c8e9
Add WooldridgeDiD (ETWFE) estimator from PR #216
igerber Apr 4, 2026
cfe0e5c
Address AI review findings: fix P0 control group, P1 rank deficiency,…
igerber Apr 4, 2026
6446bac
Address CI review: fix never_treated, add identification checks, Pois…
igerber Apr 4, 2026
168c2d3
Address remaining CI review findings: vcov on reduced design, identif…
igerber Apr 4, 2026
2f9667c
Fix bootstrap clustering, delta-method reduced space, remaining doc c…
igerber Apr 4, 2026
940739d
Add regression tests for bootstrap clustering and nonlinear rank defi…
igerber Apr 4, 2026
01dc4ea
Fix anticipation propagation in identification checks and control met…
igerber Apr 4, 2026
8a70087
Fix Poisson dropped-cell bug, document anticipation aggregation seman…
igerber Apr 4, 2026
7fa18fe
Add cohort time-invariance check, anticipation metadata in results
igerber Apr 4, 2026
56aad1f
Add nonlinear outcome validation, Poisson rank_deficient_action valid…
igerber Apr 4, 2026
f25ea09
Implement interacted covariate ETWFE, validate set_params
igerber Apr 4, 2026
84fc6ed
Reject covariates for nonlinear ETWFE, fix set_params validation
igerber Apr 4, 2026
93f4d21
Restore covariate support for nonlinear ETWFE with correct ASF
igerber Apr 4, 2026
465d323
Fix OLS unbalanced demeaning, nonlinear never_treated identification,…
igerber Apr 4, 2026
c847eab
Merge origin/main, resolve CHANGELOG conflict
igerber Apr 4, 2026
2102b3d
Implement full Wooldridge covariate basis (D_g × X, f_t × X)
igerber Apr 4, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Added
- **WooldridgeDiD (ETWFE)** estimator — Extended Two-Way Fixed Effects from Wooldridge (2025, 2023). Supports OLS, logit, and Poisson QMLE paths with ASF-based ATT and delta-method SEs. Four aggregation types (simple, group, calendar, event) matching Stata `jwdid_estat`. Alias: `ETWFE`. (PR #216, thanks @wenddymacro)
- **Survey real-data validation** (Phase 9) — 15 cross-validation tests against R's `survey` package using three real federal survey datasets:
- **API** (R `survey` package): TSL variance with strata, FPC, subpopulations, covariates, and Fay's BRR replicates
- **NHANES** (CDC/NCHS): TSL variance with strata + PSU + nest=TRUE, validating the ACA young adult coverage provision DiD
Expand Down
3 changes: 2 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,7 @@ Detailed guide: [`docs/llms-practitioner.txt`](docs/llms-practitioner.txt)
- **Wild cluster bootstrap**: Valid inference with few clusters (<50) using Rademacher, Webb, or Mammen weights
- **Panel data support**: Two-way fixed effects estimator for panel designs
- **Multi-period analysis**: Event-study style DiD with period-specific treatment effects
- **Staggered adoption**: Callaway-Sant'Anna (2021), Sun-Abraham (2021), Borusyak-Jaravel-Spiess (2024) imputation, Two-Stage DiD (Gardner 2022), Stacked DiD (Wing, Freedman & Hollingsworth 2024), and Efficient DiD (Chen, Sant'Anna & Xie 2025) estimators for heterogeneous treatment timing
- **Staggered adoption**: Callaway-Sant'Anna (2021), Sun-Abraham (2021), Borusyak-Jaravel-Spiess (2024) imputation, Two-Stage DiD (Gardner 2022), Stacked DiD (Wing, Freedman & Hollingsworth 2024), Efficient DiD (Chen, Sant'Anna & Xie 2025), and Wooldridge ETWFE (2021/2023) estimators for heterogeneous treatment timing
- **Triple Difference (DDD)**: Ortiz-Villavicencio & Sant'Anna (2025) estimators with proper covariate handling
- **Synthetic DiD**: Combined DiD with synthetic control for improved robustness
- **Triply Robust Panel (TROP)**: Factor-adjusted DiD with synthetic weights (Athey et al. 2025)
Expand Down Expand Up @@ -117,6 +117,7 @@ All estimators have short aliases for convenience:
| `Stacked` | `StackedDiD` | Stacked DiD |
| `Bacon` | `BaconDecomposition` | Goodman-Bacon decomposition |
| `EDiD` | `EfficientDiD` | Efficient DiD |
| `ETWFE` | `WooldridgeDiD` | Wooldridge ETWFE (2021/2023) |

`TROP` already uses its short canonical name and needs no alias.

Expand Down
6 changes: 1 addition & 5 deletions ROADMAP.md
Original file line number Diff line number Diff line change
Expand Up @@ -83,11 +83,7 @@ Implements local projections for dynamic treatment effects. Doesn't require spec

### Nonlinear DiD

For outcomes where linear models are inappropriate (binary, count, bounded).

- Logit/probit DiD for binary outcomes
- Poisson DiD for count outcomes
- Proper handling of incidence rate ratios and odds ratios
Implemented in `WooldridgeDiD` (alias `ETWFE`) — OLS, Poisson QMLE, and logit paths with ASF-based ATT. See [Tutorial 16](docs/tutorials/16_wooldridge_etwfe.ipynb).

**Reference**: [Wooldridge (2023)](https://academic.oup.com/ectj/article/26/3/C31/7250479). *The Econometrics Journal*.

Expand Down
4 changes: 4 additions & 0 deletions TODO.md
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,10 @@ Deferred items from PR reviews that were not addressed before merge.
| StaggeredTripleDifference: per-cohort group-effect SEs include WIF (conservative vs R's wif=NULL). Documented in REGISTRY. Could override mixin for exact R match. | `staggered_triple_diff.py` | #245 | Low |
| HonestDiD Delta^RM: uses naive FLCI instead of paper's ARP conditional/hybrid confidence sets (Sections 3.2.1-3.2.2). ARP infrastructure exists but moment inequality transformation needs calibration. CIs are conservative (wider, valid coverage). | `honest_did.py` | #248 | Medium |
| Replicate weight tests use Fay-like BRR perturbations (0.5/1.5), not true half-sample BRR. Add true BRR regressions per estimator family. Existing `test_survey_phase6.py` covers true BRR at the helper level. | `tests/test_replicate_weight_expansion.py` | #253 | Low |
| WooldridgeDiD: QMLE sandwich uses `aweight` cluster-robust adjustment `(G/(G-1))*(n-1)/(n-k)` vs Stata's `G/(G-1)` only. Conservative (inflates SEs). Add `qmle` weight type if Stata golden values confirm material difference. | `wooldridge.py`, `linalg.py` | #216 | Medium |
| WooldridgeDiD: aggregation weights use cell-level n_{g,t} counts. Paper (W2025 Eqs. 7.2-7.4) defines cohort-share weights. Add optional `weights="cohort_share"` parameter to `aggregate()`. | `wooldridge_results.py` | #216 | Medium |
| WooldridgeDiD: canonical link requirement (W2023 Prop 3.1) not enforced — no warning if user applies wrong method to outcome type. Estimator is consistent regardless, but equivalence with imputation breaks. | `wooldridge.py` | #216 | Low |
| WooldridgeDiD: Stata `jwdid` golden value tests — add R/Stata reference script and `TestReferenceValues` class. | `tests/test_wooldridge.py` | #216 | Medium |

#### Performance

Expand Down
149 changes: 149 additions & 0 deletions benchmarks/python/benchmark_wooldridge.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,149 @@
#!/usr/bin/env python3
"""
Benchmark: WooldridgeDiD (ETWFE) Estimator (diff-diff WooldridgeDiD).

Validates OLS ETWFE ATT(g,t) against Callaway-Sant'Anna on mpdta data
(Proposition 3.1 equivalence), and measures estimation timing.

Usage:
python benchmark_wooldridge.py --data path/to/mpdta.csv --output path/to/results.json
"""

import argparse
import json
import os
import sys
from pathlib import Path

# IMPORTANT: Parse --backend and set environment variable BEFORE importing diff_diff
def _get_backend_from_args():
"""Parse --backend argument without importing diff_diff."""
parser = argparse.ArgumentParser(add_help=False)
parser.add_argument("--backend", default="auto", choices=["auto", "python", "rust"])
args, _ = parser.parse_known_args()
return args.backend

_requested_backend = _get_backend_from_args()
if _requested_backend in ("python", "rust"):
os.environ["DIFF_DIFF_BACKEND"] = _requested_backend

# NOW import diff_diff and other dependencies (will see the env var)
import pandas as pd

# Add parent to path for imports
sys.path.insert(0, str(Path(__file__).parent.parent.parent))

from diff_diff import WooldridgeDiD, HAS_RUST_BACKEND
from benchmarks.python.utils import Timer


def parse_args():
parser = argparse.ArgumentParser(
description="Benchmark WooldridgeDiD (ETWFE) estimator"
)
parser.add_argument("--data", required=True, help="Path to input CSV data (mpdta format)")
parser.add_argument("--output", required=True, help="Path to output JSON results")
parser.add_argument(
"--backend", default="auto", choices=["auto", "python", "rust"],
help="Backend to use: auto (default), python (pure Python), rust (Rust backend)"
)
return parser.parse_args()


def get_actual_backend() -> str:
"""Return the actual backend being used based on HAS_RUST_BACKEND."""
return "rust" if HAS_RUST_BACKEND else "python"


def main():
args = parse_args()

actual_backend = get_actual_backend()
print(f"Using backend: {actual_backend}")

print(f"Loading data from: {args.data}")
df = pd.read_csv(args.data)

# Run OLS ETWFE estimation
print("Running WooldridgeDiD (OLS ETWFE) estimation...")
est = WooldridgeDiD(method="ols", control_group="not_yet_treated")

with Timer() as estimation_timer:
results = est.fit(
df,
outcome="lemp",
unit="countyreal",
time="year",
cohort="first_treat",
)

estimation_time = estimation_timer.elapsed

# Compute event study aggregation
results.aggregate("event")
total_time = estimation_timer.elapsed

# Store data info
n_units = len(df["countyreal"].unique())
n_periods = len(df["year"].unique())
n_obs = len(df)

# Format ATT(g,t) effects
gt_effects_out = []
for (g, t), cell in sorted(results.group_time_effects.items()):
gt_effects_out.append({
"cohort": int(g),
"time": int(t),
"att": float(cell["att"]),
"se": float(cell["se"]),
})

# Format event study effects
es_effects = []
if results.event_study_effects:
for rel_t, effect_data in sorted(results.event_study_effects.items()):
es_effects.append({
"event_time": int(rel_t),
"att": float(effect_data["att"]),
"se": float(effect_data["se"]),
})

output = {
"estimator": "diff_diff.WooldridgeDiD",
"method": "ols",
"control_group": "not_yet_treated",
"backend": actual_backend,
# Overall ATT
"overall_att": float(results.overall_att),
"overall_se": float(results.overall_se),
# Group-time ATT(g,t)
"group_time_effects": gt_effects_out,
# Event study
"event_study": es_effects,
# Timing
"timing": {
"estimation_seconds": estimation_time,
"total_seconds": total_time,
},
# Metadata
"metadata": {
"n_units": n_units,
"n_periods": n_periods,
"n_obs": n_obs,
"n_cohorts": len(results.groups),
},
}

print(f"Writing results to: {args.output}")
output_path = Path(args.output)
output_path.parent.mkdir(parents=True, exist_ok=True)
with open(output_path, "w") as f:
json.dump(output, f, indent=2)

print(f"Overall ATT: {results.overall_att:.6f} (SE: {results.overall_se:.6f})")
print(f"Completed in {total_time:.3f} seconds")
return output


if __name__ == "__main__":
main()
7 changes: 7 additions & 0 deletions diff_diff/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -164,6 +164,8 @@
TROPResults,
trop,
)
from diff_diff.wooldridge import WooldridgeDiD
from diff_diff.wooldridge_results import WooldridgeDiDResults
from diff_diff.utils import (
WildBootstrapResults,
check_parallel_trends,
Expand Down Expand Up @@ -210,6 +212,7 @@
Stacked = StackedDiD
Bacon = BaconDecomposition
EDiD = EfficientDiD
ETWFE = WooldridgeDiD

__version__ = "2.8.4"
__all__ = [
Expand Down Expand Up @@ -276,6 +279,10 @@
"EfficientDiDResults",
"EDiDBootstrapResults",
"EDiD",
# WooldridgeDiD (ETWFE)
"WooldridgeDiD",
"WooldridgeDiDResults",
"ETWFE",
# Visualization
"plot_bacon",
"plot_event_study",
Expand Down
Loading
Loading