Zero out fuel spending for non-fuel households by MaxGhenis · Pull Request #247 · PolicyEngine/policyengine-uk-data

MaxGhenis · 2025-12-03T23:16:45Z

Summary

Households with has_fuel_consumption=0 (non-vehicle owners or EV owners) now have petrol_spending and diesel_spending set to zero after imputation
This prevents the QRF from assigning fuel spending based on other predictors to households that shouldn't have any fuel consumption
Should reduce the fuel duty over-estimation seen in calibration (£49.5B estimated vs. the £24.4B target)

Context

After PR #244 added has_fuel_consumption as a predictor, fuel duties were over-estimated by ~100%. This happened because:

The QRF could still assign fuel spending to non-fuel households based on other predictors
The calibration couldn't reduce this because doing so conflicts with other targets

Test plan

CI passes
Fuel duty estimate should be closer to £24.4B OBR target
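
The zeroing behaviour itself could also be checked directly. The following is a minimal sketch, not code from this PR: the helper name `fuel_spending_violations` is hypothetical, and it assumes the flag and the spending columns are available as equal-length numpy arrays in a dict-like household table.

```python
import numpy as np


def fuel_spending_violations(
    household,
    flag="has_fuel_consumption",
    columns=("petrol_spending", "diesel_spending"),
):
    """Count non-fuel households that still carry non-zero fuel spending.

    Hypothetical helper: assumes `household` maps column names to
    equal-length numpy arrays (an assumption, not confirmed by the PR).
    """
    no_fuel = np.asarray(household[flag]) == 0
    return {
        col: int(np.count_nonzero(np.asarray(household[col])[no_fuel]))
        for col in columns
    }


# Toy data for illustration only (not FRS or LCFS values).
household = {
    "has_fuel_consumption": np.array([1, 0, 1, 0]),
    "petrol_spending": np.array([900.0, 0.0, 0.0, 0.0]),
    "diesel_spending": np.array([0.0, 0.0, 1200.0, 0.0]),
}

violations = fuel_spending_violations(household)
assert all(count == 0 for count in violations.values())
```

A check like this only verifies the invariant introduced here; it says nothing about whether the aggregate fuel duty estimate lands near the OBR target.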

🤖 Generated with Claude Code

nikhilwoodruff · 2025-12-07T09:13:21Z

@PolicyEngine can you PTAL at this and see if you can get it to a point we can merge

policyengine

This PR correctly addresses the fuel duty over-estimation issue by zeroing out fuel spending for non-fuel households after QRF imputation. The logic is sound and follows the existing pattern in the codebase.

What's good:

The fix is minimal and targeted - just 6 lines of code
The logic correctly uses the already-computed has_fuel_consumption variable
Documentation is updated to explain the new step
The approach makes sense: QRF is a continuous model, so post-hoc zeroing for categorical exclusions is reasonable

Minor consideration:

The code assumes has_fuel_consumption is still in scope at line 347. Looking at the context, it's computed around lines 336-340, before imputation, so this should work fine.

The fix should significantly reduce the fuel duty over-estimation from £49.5B towards the £24.4B target, since ~30% of households (non-vehicle owners + EV owners) will no longer incorrectly contribute to fuel spending.

policyengine · 2025-12-07T09:15:10Z

policyengine_uk_data/datasets/imputations/consumption.py

    for column in output_df.columns:
        dataset.household[column] = output_df[column].values

+    # Zero out fuel spending for households without fuel consumption


Quick verification: has_fuel_consumption is computed earlier in this function (around lines 336-340) and is a numpy array that persists in scope. The == 0 comparison is correct for boolean-like integers. This looks good.
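
To illustrate the point about the `== 0` comparison, here is a small sketch with toy arrays (the variable names are illustrative and not taken from `consumption.py`). It shows that the comparison yields the same mask whether the flag is stored as integers, booleans, or floats, and that a missing flag (NaN) is not selected by the mask.

```python
import numpy as np

# The same flag stored under three dtypes (toy values).
flags_int = np.array([1, 0, 1, 0])
flags_bool = flags_int.astype(bool)
flags_float = flags_int.astype(float)

# Elementwise comparison gives an identical boolean mask in each case.
mask_int = flags_int == 0
mask_bool = flags_bool == 0
mask_float = flags_float == 0

# NaN never compares equal to 0, so households with a missing flag
# would keep whatever fuel spending the imputation assigned them.
flags_missing = np.array([1.0, np.nan, 0.0])
mask_missing = flags_missing == 0
```

If the flag can contain missing values, that last case may be worth handling explicitly before masking.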

policyengine · 2025-12-07T09:15:10Z

policyengine_uk_data/datasets/imputations/consumption.py

        dataset.household[column] = output_df[column].values

+    # Zero out fuel spending for households without fuel consumption
+    # This ensures only ICE vehicle owners contribute to fuel duty


This direct assignment to a subset of dataset.household[...] works because the household dict stores numpy arrays. The boolean mask indexing correctly updates only the non-fuel households in place.
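
A short sketch of the in-place behaviour described above, using a plain dict of numpy arrays as a stand-in for `dataset.household` (an assumption based on this comment; the real container may behave differently):

```python
import numpy as np

# Stand-in for dataset.household: a dict of numpy arrays (assumed).
household = {"petrol_spending": np.array([500.0, 300.0, 0.0, 800.0])}
no_fuel = np.array([False, True, False, True])

# Reading through a boolean mask returns a copy, so editing the
# result leaves the stored array untouched.
selected = household["petrol_spending"][no_fuel]
selected[:] = 0
snapshot = household["petrol_spending"].copy()

# Assigning through the mask writes into the stored array itself.
household["petrol_spending"][no_fuel] = 0
```

The distinction matters only if the container returns a fresh copy on lookup. In that case the masked assignment would be lost and the column would need to be written back explicitly.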

policyengine · 2025-12-07T09:15:10Z

docs/imputations.md

@@ -106,6 +106,8 @@ LCFS 2-week diaries undercount fuel purchasers (58%) compared to actual vehicle

 4. **At FRS imputation time**: Compute `has_fuel_consumption` directly from `num_vehicles` (already calibrated to NTS targets)



Good addition - the documentation clearly explains what happens and why.
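
For readers unfamiliar with step 4 in the documentation, deriving the flag from vehicle counts might look roughly like the sketch below. This is an assumption for illustration only: the exact rule in `consumption.py` is not shown in this PR view, and the adjustment that excludes EV owners is deliberately omitted here.

```python
import numpy as np

# Toy vehicle counts per household (not FRS data).
num_vehicles = np.array([0, 1, 2, 0, 1])

# Simplified, assumed rule: any vehicle implies fuel consumption.
# The real pipeline also treats EV owners as non-fuel households,
# which this sketch does not model.
has_fuel_consumption = (num_vehicles > 0).astype(float)
```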

policyengine bot approved these changes Dec 7, 2025

nikhilwoodruff merged commit 827d9de into main Dec 7, 2025
4 checks passed
