Fix salary sacrifice headcount regression from uprating mismatch by vahid-ahmadi · Pull Request #270 · PolicyEngine/policyengine-uk-data

vahid-ahmadi · 2026-02-17T14:53:46Z

Problem

PR #268 introduced a salary sacrifice headcount regression in v1.37.0. The reported numbers shifted from correct (above-cap 3.37M, below-cap 1.12M) to wrong (above-cap 1.89M, below-cap 6.5M).

Root cause: PR #268 added two things simultaneously:

Stage 2 imputation that creates ~5,500 SS records at exactly 2,000 (the cap boundary)
Headcount calibration targets (3.3M above-cap, 4.3M below-cap)

Two bugs:

Uprating mismatch: The calibrator classified records as above or below the 2,000 cap at uprated 2025 prices, but the saved h5 stores 2023 prices, and 8.55% inflation flips records near the boundary (a record stored at 2,000 becomes 2,171 at 2025 prices). PE does not uprate SS when loading, so end-users see the 2023-price classification, which disagrees with what the calibrator optimised for.
All Stage 2 records below-cap: np.minimum(employee_pension, 2000.0) capped every new SS record at 2,000, putting them all at or below the cap. At 2023 prices only ~1,308 QRF records were above-cap, not enough for the calibrator to reach the 3.3M above-cap target.
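
The two bugs can be reproduced on toy data. The following is a minimal sketch, not the repository's code: the arrays and variable names are hypothetical, and only the 2,000 cap, the 8.55% factor and the np.minimum call come from the description above.

```python
import numpy as np

CAP = 2_000.0             # SS cap threshold from the PR description
UPRATING_FACTOR = 1.0855  # 8.55% between 2023 and 2025 prices, per the PR description

# Hypothetical SS amounts as stored in the h5 file (2023 prices).
ss_2023 = np.array([1_500.0, 1_900.0, 2_000.0, 2_500.0])

# Bug 1: the calibrator thresholded uprated (2025-price) amounts...
above_cap_calibrator = ss_2023 * UPRATING_FACTOR > CAP
# ...while end-users threshold the stored 2023-price amounts.
above_cap_end_user = ss_2023 > CAP
# Records whose classification differs between the two views.
flipped = above_cap_calibrator != above_cap_end_user

# Bug 2: capping the donor's pension at the threshold leaves no record above it.
employee_pension = np.array([800.0, 2_000.0, 3_500.0, 6_000.0])
stage2_capped = np.minimum(employee_pension, CAP)
n_above_after_cap = int((stage2_capped > CAP).sum())

print(flipped)
print(n_above_after_cap)
```

In this toy data the records stored at 1,900 and 2,000 are the ones that flip, and the capped Stage 2 amounts contribute nothing to the above-cap count.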

Fix

loss.py: Deflate SS amounts to 2023-24 base-year prices before applying the 2,000 threshold, so the calibrator uses the same above/below classification that end-users will see
salary_sacrifice.py: Stage 2 now moves the full employee pension amount to SS (instead of capping at 2,000), so donors with pension > 2,000 become above-cap records and the rest below-cap. Target set to ~70% of 7.7M total headcount so the calibrator upweights gently
test_salary_sacrifice_headcount.py: Remove xfail markers so the tests provide actual signal
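
As a rough illustration of the two changes, the sketch below deflates amounts to base-year prices before thresholding and lets Stage 2 keep the full pension amount. The function name, the sample arrays and the choice to count only records with positive SS are assumptions made for this example; the actual code in loss.py and salary_sacrifice.py may differ.

```python
import numpy as np

CAP = 2_000.0  # SS cap threshold from the PR description


def headcounts_at_base_prices(ss_uprated, weights, uprating_factor):
    """Return (above_cap, below_cap) weighted headcounts.

    Hypothetical helper: amounts are deflated to base-year prices first,
    so the split matches what a user sees in the stored dataset.
    """
    ss_base = np.asarray(ss_uprated, dtype=float) / uprating_factor
    weights = np.asarray(weights, dtype=float)
    has_ss = ss_base > 0  # assumption: only records with positive SS are counted
    above = has_ss & (ss_base > CAP)
    below = has_ss & ~above
    return float(weights[above].sum()), float(weights[below].sum())


# Stage 2 after the fix: the full employee pension amount moves to SS,
# so donors with pension > 2,000 naturally become above-cap records.
employee_pension = np.array([800.0, 1_900.0, 3_500.0, 6_000.0])  # 2023 prices
stage2_ss = employee_pension.copy()

weights = np.array([1_000.0, 2_000.0, 1_500.0, 500.0])  # hypothetical weights
factor = 1.0855  # 8.55%, per the PR description

# The calibrator works with uprated values; deflating restores the 2023 split.
above, below = headcounts_at_base_prices(stage2_ss * factor, weights, factor)
print(above, below)
```

Dividing uprated values back down can leave floating-point residue for a record sitting exactly at the cap, so evaluating the variable directly at base-year prices, where that is possible, is the safer route.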

Test plan

CI builds dataset and calibration log shows correct above/below split
Salary sacrifice headcount tests pass (no longer xfail)
Verify on HuggingFace that loaded dataset produces ~3.3M above-cap and ~4.3M below-cap

PR #268 added Stage 2 imputation (records at exactly 2k) and headcount calibration targets. The calibrator classified above/below 2k at uprated 2025 prices, but the saved h5 stores 2023 prices where the classification differs (8.55% inflation flips boundary records). Now evaluates SS at base-year prices before thresholding so calibration matches end-user values. Also reduces Stage 2 target to 3M so the calibrator upweights gently rather than overshooting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…cords

The previous Stage 2 capped all new SS records at 2k, putting them all below-cap. At 2023 prices only ~1,308 QRF records were above-cap, not enough for the calibrator to reach 3.3M. Now donors keep their full employee pension amount so the natural above/below split provides records in both categories.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

vahid-ahmadi and others added 2 commits February 17, 2026 14:53

vahid-ahmadi merged commit c65b096 into main Feb 17, 2026
3 checks passed

vahid-ahmadi merged 2 commits into main from fix/salary-sacrifice-regression
