
⚡ Improve Metadata Handling for WSI Readers #1001

Open
shaneahmed wants to merge 15 commits into develop from bug-fix-openslide-read

shaneahmed (Member) commented Feb 20, 2026

Summary

This PR standardises and improves metadata inference across all WSI readers by introducing a unified mechanism for estimating missing objective power and MPP. It updates all major reader implementations (TIFF, DICOM, OpenSlide, JP2, NGFF, fsspec), fixes reader‑selection ordering, and adds extensive tests to validate inference behaviour and warnings. New sample data is included to support expanded DICOM metadata coverage.

🔑 Key Changes

1. Centralised Metadata Inference

  • Introduces WSIReader._estimate_mpp_objective_power() as the shared method for inferring missing objective power and MPP.
  • Removes duplicated inference logic and ensures consistent fallback behaviour across all readers.
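A shared fallback of this kind could be sketched as below. This is not the PR's implementation: the function name `estimate_mpp_objective_power`, the logger name, and the rule of thumb `mpp ≈ 10 / objective_power` are assumptions made purely for illustration.

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger("wsi_metadata_sketch")  # hypothetical logger name

MPP = Tuple[float, float]


def estimate_mpp_objective_power(
    mpp: Optional[MPP],
    objective_power: Optional[float],
) -> Tuple[Optional[MPP], Optional[float]]:
    """Fill in whichever of MPP / objective power is missing.

    Assumes the rule of thumb mpp ~= 10 / objective_power
    (for example 40x ~ 0.25 um/px). This is an illustrative assumption.
    """
    if mpp is None and objective_power is not None:
        # Infer an isotropic MPP from the objective power.
        value = 10.0 / objective_power
        mpp = (value, value)
        logger.warning("MPP missing; inferred %s from objective power.", mpp)
    elif objective_power is None and mpp is not None:
        # Infer objective power from the mean of the X and Y MPP values.
        objective_power = 10.0 / (sum(mpp) / len(mpp))
        logger.warning(
            "Objective power missing; inferred %.2f from MPP.", objective_power
        )
    elif mpp is None and objective_power is None:
        # Nothing to infer from; leave both values unset.
        logger.warning("Neither MPP nor objective power is available.")
    return mpp, objective_power
```

Keeping the fallback in one place means every reader emits the same warning and returns the same shape of result when metadata is incomplete.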

2. Unified Metadata Handling Across Readers

All major WSI readers now use the central inference method:

  • TIFFWSIReader
  • DICOMWSIReader
  • OpenSlideWSIReader
  • JP2WSIReader
  • NGFFWSIReader
  • FsspecJsonWSIReader

This ensures consistent behaviour when metadata is missing or partially defined.

3. Improved Reader Selection Logic

  • Adds try_openslide() and updates selection priority so TIFF files are first attempted via OpenSlide.
  • Fixes misclassification issues where TIFF inputs were incorrectly routed to other readers.
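The selection order described above can be illustrated with a small sketch. Everything here is hypothetical: `select_reader`, `ReaderError`, and the two stand-in factories are not tiatoolbox APIs, and the factories return strings rather than real reader objects.

```python
from pathlib import Path
from typing import Callable, Sequence


class ReaderError(Exception):
    """Raised by a hypothetical reader factory that cannot open a file."""


def select_reader(path: str, factories: Sequence[Callable[[str], object]]) -> object:
    """Return the first reader that opens `path`, trying factories in order."""
    errors = []
    for factory in factories:
        try:
            return factory(path)
        except ReaderError as exc:
            # Remember why this reader failed, then try the next one.
            errors.append(f"{factory.__name__}: {exc}")
    raise ReaderError(f"No reader could open {path}: " + "; ".join(errors))


def open_with_openslide(path: str) -> str:
    # Stand-in: pretend this reader rejects OME-TIFF files.
    if path.lower().endswith((".ome.tif", ".ome.tiff")):
        raise ReaderError("unsupported TIFF flavour")
    return f"openslide:{Path(path).name}"


def open_with_tiff_reader(path: str) -> str:
    # Stand-in: a generic TIFF reader that accepts anything.
    return f"tiff:{Path(path).name}"


# Priority order for .tif/.tiff inputs: OpenSlide first, generic TIFF second.
TIFF_FACTORIES = (open_with_openslide, open_with_tiff_reader)
```

With a first-success-wins chain, ordering matters: an earlier reader that opens a file without raising will be chosen even if a later reader would handle that file better.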

4. Expanded and Strengthened Test Coverage

New and updated tests now cover:

  • Missing or partial OME‑TIFF metadata
  • Missing MPP (X/Y)
  • Missing instrument references
  • Warning behaviour when inference is required
  • DICOM metadata with and without optical path information
  • New dicom-2 sample with known objective/MPP values

Assertions have been updated to reflect the new inference logic.
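As a rough illustration of a warning-behaviour test, the sketch below uses only the standard library. The function `infer_objective_power`, the logger name, and the `10 / mpp` heuristic are hypothetical stand-ins; the PR's own tests may be structured differently.

```python
import logging
import unittest

logger = logging.getLogger("wsi_metadata_sketch")  # hypothetical logger name


def infer_objective_power(mpp_x: float) -> float:
    """Hypothetical stand-in: infer objective power and warn that it was a guess."""
    logger.warning("Objective power missing; inferring from MPP.")
    return 10.0 / mpp_x


class TestInferenceWarning(unittest.TestCase):
    def test_warns_when_inferring(self) -> None:
        # assertLogs fails the test if no WARNING is emitted on this logger.
        with self.assertLogs("wsi_metadata_sketch", level="WARNING") as captured:
            power = infer_objective_power(0.25)
        self.assertEqual(power, 40.0)
        self.assertIn("inferring", captured.output[0])
```

Checking both the returned value and the log record guards against the inference silently changing or the warning being dropped.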

5. Updated Remote Sample Data

  • Replaces CMU-1.dicom.zip with CMU-1-Small-Region.dicom.zip.
  • Adds new dicom-2 sample (JP2K-33003-1.zip) to support metadata‑specific tests.

6. Cleanup and Minor Fixes

  • Corrects import path for TransformedWSIReader.
  • Improves type hints in objective_power2mpp.
  • Normalises ndarray conversion for inferred MPP values.
  • Cleans up mypy issues related to dimension and metadata handling.

This PR resolves Jupyter Notebook 10 – WSI Reading (#998) and KongNet Notebook for MONKEY dataset (#987).

- Try OpenSlide Reader for tiff files first
- Fallback to calculating objective power from mpp
shaneahmed self-assigned this Feb 20, 2026
shaneahmed added this to the Release v2.0.0 milestone Feb 20, 2026
shaneahmed added the bug and dev tools labels Feb 20, 2026
shaneahmed requested a review from measty February 20, 2026 12:36
codecov bot commented Feb 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.53%. Comparing base (fbf8bc6) to head (4e04fd1).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #1001   +/-   ##
========================================
  Coverage    99.53%   99.53%           
========================================
  Files           83       83           
  Lines        11353    11397   +44     
  Branches      1493     1499    +6     
========================================
+ Hits         11300    11344   +44     
  Misses          28       28           
  Partials        25       25           
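The percentages in the table can be reproduced from the raw counts. The sketch below assumes coverage is computed as hits divided by hits plus misses plus partials, which matches the Lines totals shown (11300 + 28 + 25 = 11353 and 11344 + 28 + 25 = 11397).

```python
def coverage_percent(hits: int, misses: int, partials: int) -> float:
    """Coverage as a percentage, assuming lines = hits + misses + partials."""
    lines = hits + misses + partials
    return 100.0 * hits / lines


base = coverage_percent(11300, 28, 25)  # develop (fbf8bc6)
head = coverage_percent(11344, 28, 25)  # this PR (4e04fd1)
print(f"{base:.2f}% -> {head:.2f}%")
```

All 44 added lines are hits, so coverage rises very slightly but rounds to the same 99.53% at two decimal places.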

shaneahmed changed the title from "🐛 Try OpenSlideWSIReader for tiff Files First" to "⚡ Improve Metadata Handling for WSI Readers" Mar 5, 2026
Copilot AI (Contributor) left a comment

Pull request overview

This PR standardizes metadata inference (objective power and MPP) across WSI readers by introducing a shared inference helper and updating multiple reader implementations and tests. It also adjusts reader-selection priority so TIFF inputs are attempted via OpenSlide first.

Changes:

  • Add a centralized WSIReader._estimate_mpp_objective_power() and use it across multiple readers.
  • Update reader selection to try OpenSlide first for .tif/.tiff.
  • Update remote samples and expand tests for DICOM/TIFF metadata inference and warning behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tiatoolbox/wsicore/wsireader.py | Adds OpenSlide-first selection for TIFF and introduces centralized MPP/objective-power inference used across readers. |
| tiatoolbox/utils/misc.py | Broadens typing for objective_power2mpp to accept np.ndarray. |
| tiatoolbox/data/remote_samples.yaml | Updates DICOM sample filename and adds a second DICOM sample entry for new metadata tests. |
| tests/test_wsireader.py | Adds/updates DICOM metadata assertions and adds a new test for objective-power presence/inference. |
| tests/test_tiffreader.py | Updates OME-TIFF metadata tests and adds a warning-logging test for missing metadata. |


measty (Collaborator) left a comment

This introduces a regression on that ome.tiff you sent me a while ago to check multi-channel reader on (20240625_144804_1_08TcnB_Kidney_panel_June_RP_52top51bottom.ome.tiff).

In develop, pyramid is seen:

Image

Whereas opening the same slide in this PR:

Image

No pyramid is seen so it is slow and uses loads of memory, and the slide doesn't display right (seems to be black & white?)

shaneahmed (Member, Author) commented

Is this because of the OpenSlide reader? We can probably remove the OpenSlide-first selection now that metadata is handled better.

shaneahmed (Member, Author) commented

@measty Commit 4c0ba9a resolves this issue. I have tested the WSI Registration notebook and the mIF images; both work fine now. However, it fails on the MONKEY challenge image.


Labels: bug, dev tools

4 participants