Add 3D mask visualization capabilities by d33bs · Pull Request #200 · cytomining/CytoDataFrame

d33bs · 2026-03-08T21:21:17Z

Description

This PR adds the ability for CytoDataFrame to visualize 3d masks.

Closes #185

What kind of change(s) are included?

Documentation (changes docs or other related content)
Bug fix (fixes an issue).
Enhancement (adds functionality).
Breaking change (these changes would cause existing functionality to not work as expected).

Checklist

Please ensure that all boxes are checked before indicating that this pull request is ready for review.

I have read and followed the CONTRIBUTING.md guidelines.
I have searched for existing content to ensure this is not a duplicate.
I have performed a self-review of these additions (including spelling, grammar, and related).
These changes pass all pre-commit checks.
I have added comments to my code to help provide understanding
I have added a test which covers the code changes found within this PR
I have deleted all non-relevant text in this pull request template.

Summary by CodeRabbit

New Features
- Render 3D segmentation masks as configurable overlays in 3D views and static snapshots, with a UI toggle; overlay color/opacity defaults and inline payload support included. Inline size checks now account for label data.
Documentation
- Example updated to show supplying 3D mask context.
- Minor docstring typo fixed.
Tests
- Expanded tests for 3D label overlays, cropping, rendering modes, toggles, and snapshot outputs.
Chores
- Updated pre-commit tool versions and added a pre-build step to create output/masks; coverage badge hook set to manual.

coderabbitai · 2026-03-08T21:21:37Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds discovery, preparation, and rendering of 3D segmentation/label overlays: finds segmentation files, prepares uint8 label volumes (optional bbox crop), propagates label_volume through PyVista/VTK HTML rendering and snapshots, updates example, pre-commit config, test fixtures, and adds tests with fake PyVista.

Changes

Cohort / File(s)	Summary
Pre-commit Configuration `\.pre-commit-config.yaml`	Bumped hook versions and added `manual` stage to the coverage-badge hook.
Core 3D overlay implementation `src/cytodataframe/frame.py`	Added segmentation discovery and overlay helpers (`_find_matching_segmentation_path`, `_prepare_3d_label_overlay`, `_get_3d_label_overlay_from_cell`, `_get_3d_bbox_crop_bounds`), overlay rendering/UI helpers, and extended viewer/snapshot builders to accept and propagate `label_volume` with overlay toggle UI.
3D HTML/VTK pipeline `src/cytodataframe/volume.py`	Extended `build_3d_image_html_view` and VTK/JS pipeline to accept, encode, and render `label_volume` (color/opacity attrs); adjusted inline-size checks to account for label payload.
Examples `docs/src/examples/cytodataframe_at_a_glance.py`	Updated 3D example to optionally pass `data_mask_context_dir` so masks can be provided alongside 3D image volumes.
Test data / scripts `tests/data/CP_tutorial_3D_noise_nuclei_segmentation/...`, `tests/data/cytotable/.../create_image_data.py`	Pipeline files updated to produce segmentation masks (16-bit) and run script ensures `output/masks` dir exists; minor docstring fix.
Tests `tests/test_frame.py`, `tests/test_volume.py`	Added tests for segmentation matching, bbox cropping, uint8 label overlays, propagation into PyVista/VTK HTML snapshots, overlay toggle behavior; introduced fake PyVista fixtures to capture rendering interactions.

Sequence Diagram

sequenceDiagram
    participant User as CytoDataFrame User
    participant Frame as frame.py
    participant Finder as Segmentation Finder
    participant Prep as Overlay Prep
    participant Viewer as PyVista Viewer
    participant Volume as volume.py (VTK/JS)

    User->>Frame: request 3D cell view (row, column, shape)
    Frame->>Finder: _find_matching_segmentation_path(data_value,...)
    Finder-->>Frame: segmentation_path or None
    Frame->>Prep: _prepare_3d_label_overlay(segmentation_path, expected_shape, row)
    Prep-->>Frame: label_volume (uint8) or None
    Frame->>Viewer: _build_pyvista_viewer(volume, label_volume, ...)
    Viewer->>Volume: build_3d_image_html_view(volume, dims, ..., label_volume)
    Volume->>Volume: encode volume + label_volume (base64) + set data-label-* attrs
    Volume->>Viewer: return HTML/JS to render volume + overlay
    Viewer-->>User: rendered 3D view with optional mask overlay and toggle UI

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Enable 3D visualizations with CytoDataFrame #186: Modifies the same 3D visualization pipelines (frame.py and volume.py) to add label/mask overlay rendering and mask-context handling.

Suggested reviewers

jenna-tomkinson

Poem

🐰
I hopped through voxels, bright and neat,
Found masks that made the layers meet,
Green outlines traced each tiny cell,
A toggle click — the view will tell,
Hooray, the 3D sight is sweet!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 21.11% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Add 3D mask visualization capabilities' directly summarizes the main change: introducing support for visualizing 3D segmentation masks in CytoDataFrame.
Linked Issues check	✅ Passed	The PR implements 3D mask/outline visualization support for CytoDataFrame across frame.py, volume.py, and examples, directly addressing the requirement in issue `#185`.
Out of Scope Changes check	✅ Passed	All changes are directly related to 3D mask visualization: core implementation (frame.py, volume.py), documentation/examples, test infrastructure, and dependency updates for tool compatibility.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/cytodataframe/volume.py (1)

114-157: ⚠️ Potential issue | 🟠 Major

Count label_volume in the inline-size limit.

The guard on Line 115 only budgets volume_uint8.nbytes. When label_volume is present, this path embeds a second full volume, so near-threshold 3D cells can still inline far more data than max_inline_volume_bytes is supposed to allow.

Proposed fix

     volume_uint8 = np.array(volume, dtype=np.uint8, copy=True)
-    if volume_uint8.nbytes > max_inline_volume_bytes:
+    label_binary: Optional[np.ndarray] = None
+    if label_volume is not None:
+        label_arr = np.asarray(label_volume)
+        if label_arr.shape == volume_uint8.shape:
+            label_binary = np.where(label_arr > 0, 255, 0).astype(
+                np.uint8, copy=False
+            )
+
+    inline_bytes = volume_uint8.nbytes
+    if label_binary is not None:
+        inline_bytes += label_binary.nbytes
+
+    if inline_bytes > max_inline_volume_bytes:
         return build_3d_image_html_stub(
             data_value=data_value,
             candidate_path=candidate_path,
             display_options=display_options,
             message=(
                 "3D image too large for inline rendering "
-                f"({volume_uint8.nbytes} bytes > {max_inline_volume_bytes} bytes)"
+                f"({inline_bytes} bytes > {max_inline_volume_bytes} bytes)"
             ),
         )
@@
-    if label_volume is not None:
-        label_arr = np.asarray(label_volume)
-        if label_arr.shape == volume_uint8.shape:
-            label_binary = np.where(label_arr > 0, 255, 0).astype(np.uint8, copy=False)
+    if label_binary is not None:
             label_b64 = base64.b64encode(label_binary.tobytes()).decode("utf-8")
             overlay_color = display_options.get("label_overlay_color", (0, 255, 0))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cytodataframe/volume.py` around lines 114 - 157, The inline-size check
only accounts for volume_uint8.nbytes and ignores label_volume, so when a label
is present the combined inline payload can exceed max_inline_volume_bytes;
update the logic around the size guard to also compute the label byte size
(e.g., convert label_volume -> np.asarray -> binary mask np.where(label_arr > 0,
255, 0).astype(np.uint8, copy=False) and get .nbytes) and compare
volume_uint8.nbytes + label_nbytes against max_inline_volume_bytes, returning
build_3d_image_html_stub with an updated message if the combined size is too
large; keep the existing conversion/code paths (label_arr, label_binary,
label_b64, label_color_attr, label_opacity_attr) but move or duplicate the
minimal label-size computation before the size check to enforce the budget.

🧹 Nitpick comments (4)

tests/test_volume.py (1)

71-90: Add a near-threshold label_volume case.

This covers the happy path, but the new overlay path also increases the inline payload substantially. A regression in max_inline_volume_bytes handling would still pass unless this suite includes a stub-fallback case with both volume and label_volume.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/test_volume.py` around lines 71 - 90, Add a second test variant in
tests/test_volume.py that builds both a volume and a label_volume sized near the
max_inline_volume_bytes threshold to exercise the inline-vs-fallback path; call
build_3d_image_html_view with both volume and label_volume (same pattern as
test_build_3d_image_html_view_includes_label_overlay_data) but set the arrays'
sizes so their combined inline payload is just below/at/just above
max_inline_volume_bytes, then assert the function correctly handles the
stub-fallback case by checking for the expected inline label overlay attributes
(data-label-volume, data-label-color, data-label-opacity) when inlined and/or
the fallback indicator (absence of data-label-volume or a reference URL) when it
falls back. Ensure the new test references build_3d_image_html_view and the
max_inline_volume_bytes limit so the regression in max_inline_volume_bytes
handling is covered.

src/cytodataframe/frame.py (3)

2527-2529: Redundant variable assignment.

display_options is already defined at line 2434 with the same expression. This inner assignment shadows the outer variable unnecessarily.

♻️ Proposed fix

                     label_grid.point_data["label_scalars"] = np.asfortranarray(
                         label_xyz
                     ).ravel(order="F")
-                    display_options = (
-                        self._custom_attrs.get("display_options", {}) or {}
-                    )
                     overlay_mode = str(
                         display_options.get("label_overlay_mode", "surface")
                     ).lower()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cytodataframe/frame.py` around lines 2527 - 2529, The inner redundant
assignment to display_options (using self._custom_attrs.get("display_options",
{}) or {}) should be removed to avoid shadowing the outer variable already
defined earlier; locate the duplicate assignment within the same scope (the one
that reads self._custom_attrs.get("display_options", {}) or {}) and delete that
inner line so all uses reference the original display_options defined at line
~2434, leaving any subsequent logic unchanged.

3503-3601: Significant code duplication with _build_pyvista_viewer.

The label overlay rendering logic (lines 3503-3601) is nearly identical to lines 2512-2624 in _build_pyvista_viewer. Both handle shape validation, coordinate transposition, overlay mode selection, and mesh/volume addition identically.

Consider extracting a shared helper method to reduce maintenance burden and ensure consistent behavior:

♻️ Suggested approach

def _add_label_overlay_to_plotter(
    self,
    plotter: Any,
    volume: np.ndarray,
    label_volume: np.ndarray,
    spacing: Tuple[float, float, float],
    base_sample: float,
) -> None:
    """Add label overlay to a PyVista plotter instance."""
    display_options = self._custom_attrs.get("display_options", {}) or {}
    # ... shared overlay rendering logic ...

This would allow both _build_pyvista_viewer and _pyvista_volume_snapshot_html to call this helper, reducing ~100 lines of duplication.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cytodataframe/frame.py` around lines 3503 - 3601, The label overlay
rendering block duplicated between _pyvista_volume_snapshot_html and
_build_pyvista_viewer should be extracted into a single helper (suggest name
_add_label_overlay_to_plotter) that accepts the plotter, volume, label_volume,
spacing, base_sample and display_options (pulled from self._custom_attrs) and
implements the shared logic: shape validation, transposition, creation of
pv.ImageData, setting point_data, handling overlay_mode "surface" vs volume,
color/opacity normalization, adding meshes/volume, and logging; replace the
duplicated blocks in both _pyvista_volume_snapshot_html and
_build_pyvista_viewer with calls to this helper to centralize behavior and
reduce maintenance burden.

2264-2268: Consider using logger.debug instead of logger.info for routine operations.

The logger.info calls at lines 2264 and 2296 will produce output for every successful overlay lookup. In production with many cells, this could be verbose. Reserve logger.info for significant events and use logger.debug for tracing.

♻️ Proposed change

-        logger.info(
+        logger.debug(
             "Found 3D mask/outline for image %s at %s",
             data_value,
             segmentation_path,
         )

-            logger.info(
+            logger.debug(
                 "Prepared 3D mask/outline overlay for image %s with shape %s",
                 data_value,
                 overlay.shape,
             )

Also applies to: 2296-2300

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/cytodataframe/frame.py` around lines 2264 - 2268, Replace the routine
informational logs that announce successful overlay/segmentation lookups from
logger.info to logger.debug to avoid noisy production logs: specifically change
the logger.info call that logs ("Found 3D mask/outline for image %s at %s",
data_value, segmentation_path) and the similar logger.info call around
overlay_path (the second occurrence referenced in the comment) to logger.debug
so these routine per-cell successes are only emitted at debug level.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/src/examples/cytodataframe_at_a_glance.py`:
- Around line 357-364: The inline comment above the CytoDataFrame instantiation
is too long (triggers E501); split or shorten the comment text so no line
exceeds the linter limit—e.g., break "# read 3d images with segmentation masks
and show how the segmentation masks are also 3D." into two or more shorter
comment lines or reword it to be concise; ensure the surrounding code that
constructs CytoDataFrame (data=..., data_context_dir, data_mask_context_dir) and
the use of cp_3d_path remain unchanged.

---

Outside diff comments:
In `@src/cytodataframe/volume.py`:
- Around line 114-157: The inline-size check only accounts for
volume_uint8.nbytes and ignores label_volume, so when a label is present the
combined inline payload can exceed max_inline_volume_bytes; update the logic
around the size guard to also compute the label byte size (e.g., convert
label_volume -> np.asarray -> binary mask np.where(label_arr > 0, 255,
0).astype(np.uint8, copy=False) and get .nbytes) and compare volume_uint8.nbytes
+ label_nbytes against max_inline_volume_bytes, returning
build_3d_image_html_stub with an updated message if the combined size is too
large; keep the existing conversion/code paths (label_arr, label_binary,
label_b64, label_color_attr, label_opacity_attr) but move or duplicate the
minimal label-size computation before the size check to enforce the budget.

---

Nitpick comments:
In `@src/cytodataframe/frame.py`:
- Around line 2527-2529: The inner redundant assignment to display_options
(using self._custom_attrs.get("display_options", {}) or {}) should be removed to
avoid shadowing the outer variable already defined earlier; locate the duplicate
assignment within the same scope (the one that reads
self._custom_attrs.get("display_options", {}) or {}) and delete that inner line
so all uses reference the original display_options defined at line ~2434,
leaving any subsequent logic unchanged.
- Around line 3503-3601: The label overlay rendering block duplicated between
_pyvista_volume_snapshot_html and _build_pyvista_viewer should be extracted into
a single helper (suggest name _add_label_overlay_to_plotter) that accepts the
plotter, volume, label_volume, spacing, base_sample and display_options (pulled
from self._custom_attrs) and implements the shared logic: shape validation,
transposition, creation of pv.ImageData, setting point_data, handling
overlay_mode "surface" vs volume, color/opacity normalization, adding
meshes/volume, and logging; replace the duplicated blocks in both
_pyvista_volume_snapshot_html and _build_pyvista_viewer with calls to this
helper to centralize behavior and reduce maintenance burden.
- Around line 2264-2268: Replace the routine informational logs that announce
successful overlay/segmentation lookups from logger.info to logger.debug to
avoid noisy production logs: specifically change the logger.info call that logs
("Found 3D mask/outline for image %s at %s", data_value, segmentation_path) and
the similar logger.info call around overlay_path (the second occurrence
referenced in the comment) to logger.debug so these routine per-cell successes
are only emitted at debug level.

In `@tests/test_volume.py`:
- Around line 71-90: Add a second test variant in tests/test_volume.py that
builds both a volume and a label_volume sized near the max_inline_volume_bytes
threshold to exercise the inline-vs-fallback path; call build_3d_image_html_view
with both volume and label_volume (same pattern as
test_build_3d_image_html_view_includes_label_overlay_data) but set the arrays'
sizes so their combined inline payload is just below/at/just above
max_inline_volume_bytes, then assert the function correctly handles the
stub-fallback case by checking for the expected inline label overlay attributes
(data-label-volume, data-label-color, data-label-opacity) when inlined and/or
the fallback indicator (absence of data-label-volume or a reference URL) when it
falls back. Ensure the new test references build_3d_image_html_view and the
max_inline_volume_bytes limit so the regression in max_inline_volume_bytes
handling is covered.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d9fddf98-8fe2-4790-b067-f52b1626c4e5

📥 Commits

Reviewing files that changed from the base of the PR and between c364801 and fdec51a.

⛔ Files ignored due to path filters (5)

media/coverage-badge.svg is excluded by !**/*.svg
tests/data/CP_tutorial_3D_noise_nuclei_segmentation/output/masks/nuclei1_out_c00_dr90_imageSegmentationMask.tiff is excluded by !**/*.tiff
tests/data/CP_tutorial_3D_noise_nuclei_segmentation/output/masks/nuclei2_out_c90_dr90_imageSegmentationMask.tiff is excluded by !**/*.tiff
tests/data/CP_tutorial_3D_noise_nuclei_segmentation/output/masks/nuclei3_out_c00_dr10_imageSegmentationMask.tiff is excluded by !**/*.tiff
tests/data/CP_tutorial_3D_noise_nuclei_segmentation/output/masks/nuclei4_out_c90_dr10_imageSegmentationMask.tiff is excluded by !**/*.tiff

📒 Files selected for processing (10)

.pre-commit-config.yaml
docs/src/examples/cytodataframe_at_a_glance.ipynb
docs/src/examples/cytodataframe_at_a_glance.py
src/cytodataframe/frame.py
src/cytodataframe/volume.py
tests/data/CP_tutorial_3D_noise_nuclei_segmentation/3d-nuclei-profiling.cppipe
tests/data/CP_tutorial_3D_noise_nuclei_segmentation/run.sh
tests/data/cytotable/NF1_cellpainting_data_shrunken/create_image_data.py
tests/test_frame.py
tests/test_volume.py

docs/src/examples/cytodataframe_at_a_glance.py

Copilot

Pull request overview

This PR adds the ability for CytoDataFrame to visualize 3D segmentation masks/outlines as overlays on top of 3D volume images. It supports both the VTK.js inline HTML path and the PyVista interactive/snapshot paths, and it includes a refactoring of bounding-box crop logic into a reusable helper method.

Changes:

Adds _find_matching_segmentation_path, _prepare_3d_label_overlay, _get_3d_label_overlay_from_cell, and _get_3d_bbox_crop_bounds methods to CytoDataFrame for 3D mask lookup and rendering.
Extends build_3d_image_html_view and _build_vtk_js_renderer_core in volume.py to embed label overlay data and render it via VTK.js.
Updates the CellProfiler pipeline config and test data to produce dedicated mask output files in a masks/ subdirectory.

Reviewed changes

Copilot reviewed 9 out of 15 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
`src/cytodataframe/frame.py`	Core changes: new 3D label overlay methods, refactored bbox crop helper, and label overlay wired into all 3D rendering paths (VTK HTML, PyVista viewer, PyVista snapshot)
`src/cytodataframe/volume.py`	Adds label overlay data embedding and VTK.js rendering script for the inline HTML 3D view
`tests/test_frame.py`	New tests for label overlay lookup, bbox crop alignment, and label overlay passing to PyVista viewers/snapshots
`tests/test_volume.py`	New test verifying label overlay data attributes in the generated HTML
`tests/data/CP_tutorial_3D_noise_nuclei_segmentation/3d-nuclei-profiling.cppipe`	Updated CellProfiler pipeline to save masks to a dedicated `masks/` folder with consistent naming
`tests/data/CP_tutorial_3D_noise_nuclei_segmentation/run.sh`	Creates the `output/masks/` folder before running the pipeline
`tests/data/CP_tutorial_3D_noise_nuclei_segmentation/output/masks/nuclei2_out_c90_dr90_imageSegmentationMask.tiff`	New binary test mask file produced by the updated pipeline
`tests/data/cytotable/NF1_cellpainting_data_shrunken/create_image_data.py`	Fixed typo: "puproses" → "purposes"
`docs/src/examples/cytodataframe_at_a_glance.py`	Adds a documentation example using `data_mask_context_dir` with 3D data
`.pre-commit-config.yaml`	Updates `codespell` and `ruff` versions; marks coverage badge as a manual stage
`media/coverage-badge.svg`	Updated coverage percentage from 89.97% to 90.10%

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/cytodataframe/frame.py

src/cytodataframe/volume.py

…to 3d-labels

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cytodataframe/frame.py`:
- Around line 1334-1339: The current loop returns the first file_pattern match
from root.rglob("*") as soon as original_pattern matches data_value, which can
return overlays from unrelated images; instead, constrain matching to files that
are tied to the current image identity. Change the search so that after
original_pattern matches data_value you derive an image-specific identifier (for
example via groups from original_pattern on data_value or the source image
Path/stem) and then search only within the same directory/parent tree (do not
use root.rglob("*") globally) or filter candidate files by that same identifier
before returning; update the loop using pattern_map, original_pattern,
data_value, file_pattern and the root.rglob call to enforce the image-level
correspondence rather than returning the first global regex hit.
- Around line 2240-2256: The path-resolution logic that normalizes a "file:"
URI, strips directories, and searches data_context_dir for matches (currently
inline around variables data_value, normalized, candidate_path, context_dir)
should be extracted into a shared helper (e.g., _resolve_volume_candidate or
similar) and used both here and from _get_3d_volume_from_cell(); implement the
helper to accept the raw value and self._custom_attrs (or context_dir) and
return a pathlib.Path candidate (handling "file:" prefix, extracting basename,
and rglob lookup in data_context_dir), then replace the inline block in this
method with a call to that helper and update _get_3d_volume_from_cell() to call
the same helper so overlays for 3D images resolve consistently.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d71c9f38-7bbc-45bb-bfec-47e0e12cd50f

📥 Commits

Reviewing files that changed from the base of the PR and between fdec51a and 461eb7b.

📒 Files selected for processing (6)

docs/src/examples/cytodataframe_at_a_glance.ipynb
docs/src/examples/cytodataframe_at_a_glance.py
src/cytodataframe/frame.py
src/cytodataframe/volume.py
tests/test_frame.py
tests/test_volume.py

src/cytodataframe/frame.py

Co-Authored-By: Jenna Tomkinson <107513215+jenna-tomkinson@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

src/cytodataframe/frame.py (1)
2270-2286: ⚠️ Potential issue | 🟠 Major

Reuse the same candidate-path resolution as _get_3d_volume_from_cell().

This lookup still only handles the raw value plus data_context_dir, so 3D cells resolved via file: URIs, Image_PathName_*, or data_image_paths can load the volume but fail to find the matching label overlay. Please route both code paths through the same path-resolution helper before calling _find_matching_segmentation_path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cytodataframe/frame.py` around lines 2270 - 2286, The current
candidate_path resolution duplicates logic and misses the same URI/name lookups
used by _get_3d_volume_from_cell(), causing label overlays to be missed; replace
the manual normalization block that builds candidate_path (the code using
data_value, data_context_dir and rglob) with a call to the same path-resolution
helper used by _get_3d_volume_from_cell() so that file:, Image_PathName_* and
data_image_paths are all resolved the same way, then pass the resulting
candidate_path into _find_matching_segmentation_path (and any subsequent logic)
instead of the locally computed path.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cytodataframe/frame.py`:
- Around line 1361-1367: The current matching_files comprehension uses substring
checks (any(idf in file.stem ...)) causing false positives; change it to
token/exact matching by normalizing or tokenizing the file stem and matching
whole tokens or using a word-boundary regex: e.g., split file.stem on
non-alphanumeric delimiters (or normalize both idf and stem) and test idf ∈
tokens, or use re.search with word boundaries around re.escape(idf) against
file.stem; update the condition in matching_files (where identifiers, file.stem,
file_pattern, search_root.rglob are used) to perform tokenized or whole-word
regex matching instead of substring membership.

---

Duplicate comments:
In `@src/cytodataframe/frame.py`:
- Around line 2270-2286: The current candidate_path resolution duplicates logic
and misses the same URI/name lookups used by _get_3d_volume_from_cell(), causing
label overlays to be missed; replace the manual normalization block that builds
candidate_path (the code using data_value, data_context_dir and rglob) with a
call to the same path-resolution helper used by _get_3d_volume_from_cell() so
that file:, Image_PathName_* and data_image_paths are all resolved the same way,
then pass the resulting candidate_path into _find_matching_segmentation_path
(and any subsequent logic) instead of the locally computed path.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 94686538-ddc3-4abd-a4a9-d7c13cec753d

📥 Commits

Reviewing files that changed from the base of the PR and between 461eb7b and ea8586c.

📒 Files selected for processing (2)

src/cytodataframe/frame.py
tests/test_frame.py

src/cytodataframe/frame.py

jenna-tomkinson

LGTM! I saw some HTML in here that I am not as familiar with, so you might consider having someone else give those small sections a review. Either way, cool work and I look forward to seeing your responses to some of my questions!

jenna-tomkinson · 2026-03-10T22:51:05Z

docs/src/examples/cytodataframe_at_a_glance.py

+    data_mask_context_dir=str(pathlib.Path(cp_3d_path) / "output/masks"),
+)
+
+cdf[["ImageNumber", "ObjectNumber", "FileName_Nuclei"]][:3]


Did this dataset not have the Image_ prefix for the FileName and PathName columns? This would be interesting as I am assuming this dataset would come from CellProfiler. This question relates to the meeting discussion and handling columns.

jenna-tomkinson · 2026-03-10T22:53:19Z

src/cytodataframe/frame.py

+            pattern_map: Optional regex mapping from segmentation filename
+                patterns to source image patterns.
+            file_dir: Root directory containing mask/outline files.
+            candidate_path: Resolved image candidate path used for stem matching.


What is the candidate path in this case? I tend to always us the regex method to resolve matching the cell to the mask, so I am curious how people would use it. Might consider also including examples in the docstring.

jenna-tomkinson · 2026-03-10T22:56:33Z

src/cytodataframe/frame.py

+                segmentation_path = self._find_matching_segmentation_path(
+                    data_value=data_value,
+                    pattern_map=pattern_map,
+                    file_dir=self._custom_attrs.get("data_mask_context_dir"),
+                    candidate_path=candidate_path,
+                )
+                if segmentation_path is None:
+                    segmentation_path = self._find_matching_segmentation_path(
+                        data_value=data_value,
+                        pattern_map=pattern_map,
+                        file_dir=self._custom_attrs.get("data_outline_context_dir"),
+                        candidate_path=candidate_path,
+                    )


This section was a bit confusing at first to me, they almost looked the same lol Is there a way to enforce DNR here?

jenna-tomkinson · 2026-03-10T22:59:23Z

src/cytodataframe/frame.py

+        x_min_col = _find_col("Minimum_X")
+        x_max_col = _find_col("Maximum_X")
+        y_min_col = _find_col("Minimum_Y")
+        y_max_col = _find_col("Maximum_Y")
+        z_min_col = _find_col("Minimum_Z")
+        z_max_col = _find_col("Maximum_Z")


Is this following CellProfiler convention? Do we expect any need for flexibility here?

jenna-tomkinson · 2026-03-10T23:02:17Z

src/cytodataframe/frame.py

+        table_height = _css_size(
+            kwargs.pop("table_height", kwargs.pop("table_max_height", "700px")),
+            "700px",
+        )
+
        grid = widgets.GridspecLayout(


How was the max height determined here?

jenna-tomkinson · 2026-03-10T23:02:43Z

src/cytodataframe/frame.py

-                width="auto",
+                width="100%",


Why did the default change? Was it better for visualization?

jenna-tomkinson · 2026-03-10T23:03:55Z

src/cytodataframe/volume.py

+    label_b64 = ""
+    label_color_attr = "0,1,0"
+    label_opacity_attr = "0.95"


Recommend adding code comments here to make it clear what these are. (like is the color set for green by default?)

jenna-tomkinson · 2026-03-10T23:04:57Z

src/cytodataframe/volume.py

+        "if(labelRawB64){"
+        "const labelRaw=atob(labelRawB64);"
+        "if(labelRaw.length===bytes.length){"
+        "const labelBytes=new Uint8Array(labelRaw.length);"
+        "for(let i=0;i<labelRaw.length;i+=1){labelBytes[i]=labelRaw.charCodeAt(i);}"
+        "const labelData=vtk.Common.DataModel.vtkImageData.newInstance();"
+        "labelData.setDimensions(dims);"


I am a bit lost here. Is this not Python? Might recommend having someone else look at this if it is HTML or something I am less familar with reviewing.

jenna-tomkinson · 2026-03-10T23:05:43Z

tests/data/CP_tutorial_3D_noise_nuclei_segmentation/3d-nuclei-profiling.cppipe

    Calculate the advanced features?:Yes

-RescaleIntensity:[module_num:12|svn_version:'Unknown'|variable_revision_number:3|show_window:True|notes:[]|batch_state:array([], dtype=uint8)|enabled:True|wants_pause:False]
+RescaleIntensity:[module_num:12|svn_version:'Unknown'|variable_revision_number:3|show_window:True|notes:[]|batch_state:array([], dtype=uint8)|enabled:False|wants_pause:False]


I see some modules have been turned off. Could you share the reasoning for turning them off when producing the features and masks?

jenna-tomkinson · 2026-03-10T23:06:07Z

tests/data/CP_tutorial_3D_noise_nuclei_segmentation/run.sh

+# Ensure dedicated output folder for segmentation masks exists.
+mkdir -p "$CPDOCKER_RUNDIR/output/masks"


great call!

d33bs added 2 commits March 8, 2026 13:35

add 3d segmentation masks for test dataset

bad49c4

add 3d mask visualization capabilities

ff5bc7c

[pre-commit.ci lite] apply automatic fixes

fdec51a

d33bs marked this pull request as ready for review March 8, 2026 22:57

d33bs requested a review from jenna-tomkinson as a code owner March 8, 2026 22:57

d33bs requested a review from Copilot March 8, 2026 22:58

Copilot started reviewing on behalf of d33bs March 8, 2026 22:58 View session

coderabbitai bot reviewed Mar 8, 2026

View reviewed changes

docs/src/examples/cytodataframe_at_a_glance.py Show resolved Hide resolved

Copilot AI reviewed Mar 8, 2026

View reviewed changes

src/cytodataframe/frame.py Outdated Show resolved Hide resolved

src/cytodataframe/frame.py Outdated Show resolved Hide resolved

src/cytodataframe/volume.py Outdated Show resolved Hide resolved

src/cytodataframe/volume.py Outdated Show resolved Hide resolved

d33bs added 5 commits March 9, 2026 05:25

changes from coderabbit review

7aaf06e

Merge branch '3d-labels' of https://github.com/d33bs/CytoDataFrame in…

6322ec1

…to 3d-labels

changes based on copilot review

e63b797

linting

3aa6c80

add checkbox to toggle the mask when present

461eb7b

coderabbitai bot reviewed Mar 9, 2026

View reviewed changes

src/cytodataframe/frame.py Outdated Show resolved Hide resolved

src/cytodataframe/frame.py Outdated Show resolved Hide resolved

add mask text label

ea8586c

Co-Authored-By: Jenna Tomkinson <107513215+jenna-tomkinson@users.noreply.github.com>

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

src/cytodataframe/frame.py Show resolved Hide resolved

d33bs and others added 9 commits March 10, 2026 13:01

notebook update

63ec569

notebook update

7606429

[pre-commit.ci lite] apply automatic fixes

69f1873

fix scrolling within 2d windows

c0803dc

update notebook

0c497b8

linting

ddffd78

readd static snapshots for 2d

500c509

linting

f3835fc

remove empty cell from notebook

333637a

jenna-tomkinson approved these changes Mar 10, 2026

View reviewed changes

		# Ensure dedicated output folder for segmentation masks exists.
		mkdir -p "$CPDOCKER_RUNDIR/output/masks"

Conversation

d33bs commented Mar 8, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

What kind of change(s) are included?

Checklist

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Review ran into problems

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jenna-tomkinson left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

d33bs commented Mar 8, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 8, 2026 •

edited

Loading