@anth-volk
Contributor

Fixes #222

Summary

  • Add support for filtering US simulations by Census place (city/town)
  • Enable city-level impact analysis for 333 places with population > 100,000

Changes

policyengine/utils/data/datasets.py

  • Added "place" to USRegionType literal type
  • Added "place" to US_REGION_PREFIXES tuple
  • Added parse_us_place_region() helper function to extract state code and place FIPS from region strings
  • Added place handling in _get_default_us_dataset() to load the parent state's dataset

policyengine/simulation.py

  • Added dispatch for "place/" regions in _apply_us_region_to_simulation()
  • Added _filter_us_simulation_by_place() method that filters households by place_fips

Region String Format

  • Format: place/{STATE_ABBREV}-{PLACE_FIPS}
  • Example: place/NJ-57000 for Paterson, NJ
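As an illustration of the region string format, here is a minimal parsing sketch. It is not the actual parse_us_place_region() implementation: the function name, the tuple return shape, and the error messages are assumptions made for this example.

```python
def parse_place_region_sketch(region: str) -> tuple[str, str]:
    """Split a 'place/{STATE_ABBREV}-{PLACE_FIPS}' string into its parts.

    Hypothetical helper for illustration; the real function's signature
    and return type may differ.
    """
    prefix = "place/"
    if not region.startswith(prefix):
        raise ValueError(f"Expected a region starting with {prefix!r}: {region!r}")

    body = region[len(prefix):]
    # partition() returns an empty separator when no dash is present.
    state_code, separator, place_fips = body.partition("-")
    if not separator:
        raise ValueError(f"Missing '-' between state and place FIPS: {region!r}")
    if not state_code:
        raise ValueError(f"Empty state code: {region!r}")
    if not place_fips:
        raise ValueError(f"Empty place FIPS code: {region!r}")
    return state_code, place_fips


print(parse_place_region_sketch("place/NJ-57000"))  # ('NJ', '57000')
```

The place FIPS code is kept as a string so that leading zeros survive.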

How It Works

  1. When a place region is selected, the system loads the parent state's dataset
  2. Households are filtered by matching the place_fips variable in the dataset
  3. The 100k population cutoff ensures sufficient sample sizes (500+ unweighted households)
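The matching in step 2 can be sketched roughly as follows. This uses plain Python lists instead of the simulation's real arrays, and both helper names are hypothetical. The bytes handling is an assumption about how place_fips values may be stored in the dataset.

```python
from typing import Iterable, Union


def normalize_place_fips(value: Union[str, bytes]) -> str:
    """Return a place FIPS code as str, decoding bytes if necessary."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return str(value)


def household_mask_for_place(
    place_fips_values: Iterable[Union[str, bytes]], target_fips: str
) -> list[bool]:
    """Build a boolean mask that is True for households in the target place."""
    return [normalize_place_fips(v) == target_fips for v in place_fips_values]


# Hypothetical household-level values mixing bytes and str encodings.
values = [b"57000", "57000", "51000", b"74000"]
mask = household_mask_for_place(values, "57000")
print(mask)  # [True, True, False, False]
```

A mask with no True entries corresponds to an empty result, for example when the place FIPS does not occur in the parent state's dataset.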

Test plan

  • Test place filtering with NJ dataset for Paterson (place/NJ-57000)
  • Verify household counts match expected sample sizes
  • Test with various state/place combinations

🤖 Generated with Claude Code

- Add "place" to USRegionType and US_REGION_PREFIXES
- Add parse_us_place_region() helper to parse "place/NJ-57000" format
- Add _filter_us_simulation_by_place() method to filter by place_fips
- Load state dataset when place region is specified, then filter

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@anth-volk changed the base branch from main to 0.x February 3, 2026 18:26
anth-volk and others added 8 commits February 3, 2026 21:33
- Add tests for determine_us_region_type with place regions
- Add tests for parse_us_place_region function
- Add tests for _get_default_us_dataset with place regions
- Add tests for _filter_us_simulation_by_place method
- Add mini dataset fixture for integration-style place filtering tests
- Cover edge cases: bytes vs str place_fips, empty results, multiple places

- Create tests/fixtures/country/us_places.py with:
  - create_mock_simulation_with_place_fips() helper
  - create_mock_simulation_with_bytes_place_fips() helper
  - mini_place_dataset pytest fixture
  - mini_place_dataset_with_bytes pytest fixture
- Update tests/fixtures/__init__.py to export new fixtures
- Update tests/conftest.py to register fixtures globally
- Refactor test_us_places.py to import from fixtures

Extract hardcoded values from tests into fixtures:
- Place FIPS constants (NJ_PATERSON_FIPS, etc.)
- Region string constants (NJ_PATERSON_REGION, etc.)
- Test data arrays (MIXED_PLACES_WITH_PATERSON, etc.)
- Expected results constants (EXPECTED_PATERSON_COUNT_IN_MIXED, etc.)
- Pre-configured mock fixtures for common test scenarios

The "city" region type (previously used only for NYC) has been removed.
Place-level filtering using "place/" prefix is now the standard approach
for sub-state geographic filtering.

- Fix place filtering to use map_to="person" for correct entity mapping
- Add validation in parse_us_place_region() for:
  - Empty state codes
  - Empty place FIPS codes
  - Missing dash separator
  - Wrong prefix

- Add changelog entry for place-level impact analysis feature
- Format code with black

The conftest.py was importing SimulationOptions from fixtures at module
level, which triggered loading policyengine_us and policyengine_uk
before any tests ran. This consumed significant memory at test
collection time.

Changed tests/fixtures/simulation.py to use lazy getter functions
instead of module-level variables that call SimulationOptions.model_validate().
Updated tests/test_simulation.py to use the new getter functions.
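
The lazy-getter pattern described above might look roughly like this. FakeOptions is a stand-in for SimulationOptions so the sketch runs without policyengine installed, and the getter name and payload are hypothetical.

```python
from dataclasses import dataclass
from functools import lru_cache

# Records each validation call so the deferred behavior is observable.
BUILD_CALLS: list[str] = []


@dataclass(frozen=True)
class FakeOptions:
    """Stand-in for a model that is expensive to import and validate."""

    country: str
    scope: str

    @classmethod
    def model_validate(cls, raw: dict) -> "FakeOptions":
        BUILD_CALLS.append(raw["country"])
        return cls(**raw)


# Eager style (avoided): a module-level variable would run model_validate()
# at import time, i.e. during test collection.
# US_OPTIONS = FakeOptions.model_validate({"country": "us", "scope": "household"})


@lru_cache(maxsize=None)
def get_us_household_options() -> FakeOptions:
    """Build the options on first call and reuse the cached object afterwards.

    In a real fixture module, the heavy import would also go inside this
    function so that importing the module stays cheap.
    """
    return FakeOptions.model_validate({"country": "us", "scope": "household"})
```

Importing a module written this way does no validation work. The cost is paid only by the first test that calls the getter.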

anth-volk and others added 2 commits February 4, 2026 02:09
The national ECPS_2024 dataset is too large for the CI runner's memory
(7GB). Changed tests to use Delaware state data instead, which tests
the same simulation and comparison pipeline with much smaller memory
footprint.

Added lazy import inside the test function.

@anth-volk marked this pull request as ready for review February 4, 2026 00:21
@anth-volk merged commit 34beb0f into 0.x Feb 4, 2026 (3 checks passed)
@anth-volk deleted the feat/add-places branch February 4, 2026 00:21
Linked issue: Add place-level (city) filtering for US impact analysis