You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Math fix in _build_agi_block_probs: The old code (pop_probs * agi_weights / sum) conflated population distribution with AGI targets — a district with 90% of blocks but a low AGI target would still dominate. Dividing by within-district population mass first makes district totals proportional to just the AGI targets while preserving within-district block shares. The new test case makes the expected behavior clear.
db_path shadowing fix: Line 944 on main hardcodes db_path = str(STORAGE_FOLDER / "calibration" / "policy_data.db"), which shadows the db_path parameter. Moving UnifiedMatrixBuilder construction earlier and routing through get_district_agi_targets() fixes this cleanly.
Questions / minor notes
Silent filter changes in get_district_agi_targets vs the old raw SQL. The old query returned all reform IDs, all domain scopes, and didn't do best-period selection (the dict just kept the last row arbitrarily). The new method via _query_targets adds three filters that weren't there before: baseline only (reform_id == 0), unconditional only (empty domain_variable), and most-recent-period selection. All three seem like improvements, but the PR description doesn't mention them — were these intentional?
builder.dataset_path mutation (line ~1027 in the new code): the builder is constructed without dataset_path, then it's set as an attribute later. This works fine today since __init__ just stores it, but it's a minor fragility — if construction-time validation is ever added, this pattern would break. Not a blocker, just flagging it.
ValueError: Cannot downsample cps_2021: found 9 dataset variables missing from the current country package (count_under_18, count_under_6, hourly_wage, is_union_member_or_covered, other_type_retirement_account_distributions, ...).
I looked into it but couldn't figure out the root cause. Wanted to flag it so you're aware.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
UnifiedMatrixBuilderusing the caller-provided calibration databaseTesting
uv run pytest policyengine_us_data/tests/test_calibration/test_clone_and_assign.py policyengine_us_data/tests/test_calibration/test_unified_matrix_builder.py policyengine_us_data/tests/test_calibration/test_unified_calibration.py -quv run ruff format --check policyengine_us_data/calibration/clone_and_assign.py policyengine_us_data/calibration/unified_calibration.py policyengine_us_data/calibration/unified_matrix_builder.py policyengine_us_data/tests/test_calibration/test_clone_and_assign.py policyengine_us_data/tests/test_calibration/test_unified_calibration.py policyengine_us_data/tests/test_calibration/test_unified_matrix_builder.pyFollow-up to #671.