Chunkwise image loader #279
… image-reader-chunkwise
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #279 +/- ##
==========================================
+ Coverage 55.16% 55.96% +0.80%
==========================================
Files 26 27 +1
Lines 2844 2898 +54
==========================================
+ Hits 1569 1622 +53
- Misses 1275 1276 +1
melonora
left a comment
Thanks for your contribution! I have 2 minor suggestions. I also saw that you use the width by height convention. Personally, I don't have a strong opinion here, though we could also stick to array api conventions. @LucaMarconato WDYT? Pre-approving for now.
melonora
left a comment
Sorry, had to change this after rethinking memmap. As far as I am aware, memmap does not always work, for example when dealing with compressed TIFFs.
Co-authored-by: Wouter-Michiel Vierdag <w-mv@hotmail.com>
Hi @LucaMarconato - Just saw that you pushed some updates. Could you comment on the current implementation/is this PR still interesting for you?
@lucas-diedrich yes still interested in it, I have time now to go through it. I added my "code review" in terms of "TODO" items for myself. I'll check the code and push some modifications if I see that some minor fixes are needed; I'll comment on any larger changes, but the code looks good. I'll also do a benchmark with
There was a problem with the dimension of the returned data.
@lucas-diedrich I'm done with edits, please double check the changes if you have time. The main comment, added in the notes in In summary, redefining I will now do the benchmark with
Description
This PR addresses the challenge that the currently implemented and planned image loaders require loading imaging data entirely into memory, typically as NumPy arrays. Given the large size of microscopy datasets, this is not always feasible.
To mitigate this issue, and as discussed with @LucaMarconato, this PR aims to introduce a generalizable approach for reading large microscopy files in chunks, enabling efficient handling of data that does not fit into memory.
Some related discussions.
Strategy
In this PR, we focus on .tiff images, as implemented in the _tiff_to_chunks function.
- … (tifffile.memmap)
- … (_compute_chunks)
- … dask.array which is memory-mapped and avoids memory overflow (_read_chunks)
- … dask.array (via dask.array.block)
The strategy is implemented in src/spatialdata_io/readers/generic.py and src/spatialdata_io/readers/_utils/_image.py
Future extensions
The strategy can be implemented for any image type, as long as it is possible to implement
We have implemented similar readers for openslide-compatible whole slide images and the Carl-Zeiss microscopy format.
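To make the chunking step from the Strategy section more concrete, here is a minimal sketch of how an image could be divided into a grid of tiles. The function name compute_chunk_grid, its parameters, and the (y, x, height, width) tuple order are assumptions made for illustration; this is not the PR's _compute_chunks, and the PR itself may use a width by height convention, as noted in the review.

```python
def compute_chunk_grid(height, width, chunk_size):
    """Return a nested list (rows, then columns) of (y, x, h, w) tiles.

    Hypothetical helper for illustration only; not the PR's _compute_chunks.
    Tiles at the bottom and right edges are clipped to the image bounds,
    so the tiles cover every pixel exactly once.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    grid = []
    for y in range(0, height, chunk_size):
        row = []
        for x in range(0, width, chunk_size):
            h = min(chunk_size, height - y)  # clip at the bottom edge
            w = min(chunk_size, width - x)   # clip at the right edge
            row.append((y, x, h, w))
        grid.append(row)
    return grid


# A 5 x 7 image split into tiles of at most 3 x 3 pixels.
grid = compute_chunk_grid(height=5, width=7, chunk_size=3)
for row in grid:
    print(row)
```

The nested rows-of-columns layout is chosen because dask.array.block accepts nested lists of arrays: each tuple could be replaced by a lazily read tile of that position and size, and the resulting nested list assembled into one array.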