Skip to content

Fix AGI-targeted geography assignment#682

Merged
baogorek merged 1 commit intomainfrom
codex/fix-agi-geo-followup
Apr 3, 2026
Merged

Fix AGI-targeted geography assignment#682
baogorek merged 1 commit intomainfrom
codex/fix-agi-geo-followup

Conversation

@MaxGhenis
Copy link
Copy Markdown
Contributor

Summary

  • fix AGI-weighted block probabilities so district totals match AGI target shares instead of population-share times AGI target
  • load district AGI targets through UnifiedMatrixBuilder using the caller-provided calibration database
  • add focused tests for the AGI probability math and calibration wiring

Testing

  • uv run pytest policyengine_us_data/tests/test_calibration/test_clone_and_assign.py policyengine_us_data/tests/test_calibration/test_unified_matrix_builder.py policyengine_us_data/tests/test_calibration/test_unified_calibration.py -q
  • uv run ruff format --check policyengine_us_data/calibration/clone_and_assign.py policyengine_us_data/calibration/unified_calibration.py policyengine_us_data/calibration/unified_matrix_builder.py policyengine_us_data/tests/test_calibration/test_clone_and_assign.py policyengine_us_data/tests/test_calibration/test_unified_calibration.py policyengine_us_data/tests/test_calibration/test_unified_matrix_builder.py

Follow-up to #671.

@baogorek
Copy link
Copy Markdown
Collaborator

baogorek commented Apr 2, 2026

Review (on behalf of Ben)

Both bug fixes look correct:

Math fix in _build_agi_block_probs: The old code (pop_probs * agi_weights / sum) conflated population distribution with AGI targets — a district with 90% of blocks but a low AGI target would still dominate. Dividing by within-district population mass first makes district totals proportional to just the AGI targets while preserving within-district block shares. The new test case makes the expected behavior clear.

db_path shadowing fix: Line 944 on main hardcodes db_path = str(STORAGE_FOLDER / "calibration" / "policy_data.db"), which shadows the db_path parameter. Moving UnifiedMatrixBuilder construction earlier and routing through get_district_agi_targets() fixes this cleanly.

Questions / minor notes

  1. Silent filter changes in get_district_agi_targets vs the old raw SQL. The old query returned all reform IDs, all domain scopes, and didn't do best-period selection (the dict just kept the last row arbitrarily). The new method via _query_targets adds three filters that weren't there before: baseline only (reform_id == 0), unconditional only (empty domain_variable), and most-recent-period selection. All three seem like improvements, but the PR description doesn't mention them — were these intentional?

  2. builder.dataset_path mutation (line ~1027 in the new code): the builder is constructed without dataset_path, then it's set as an attribute later. This works fine today since __init__ just stores it, but it's a minor fragility — if construction-time validation is ever added, this pattern would break. Not a blocker, just flagging it.

@baogorek
Copy link
Copy Markdown
Collaborator

baogorek commented Apr 2, 2026

Heads up — CI is failing with:

ValueError: Cannot downsample cps_2021: found 9 dataset variables missing from the current country package (count_under_18, count_under_6, hourly_wage, is_union_member_or_covered, other_type_retirement_account_distributions, ...).

I looked into it but couldn't figure out the root cause. Wanted to flag it so you're aware.

@baogorek baogorek force-pushed the codex/fix-agi-geo-followup branch from deef418 to dfe2b48 Compare April 3, 2026 12:06
Copy link
Copy Markdown
Collaborator

@baogorek baogorek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@baogorek baogorek merged commit 3a9cb55 into main Apr 3, 2026
9 checks passed
@baogorek baogorek deleted the codex/fix-agi-geo-followup branch April 3, 2026 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants