join_spatialelement_table crashes when obs index name matches an existing column #1099

@timtreis

Bug description

_inner_join_spatialelement_table and _left_join_spatialelement_table in spatialdata/_core/query/relational_query.py call table.obs.reset_index() (lines 390 and 471) without handling the case where the obs index name already exists as a column. This raises:

ValueError: cannot insert EntityID, already exists

How it manifests

In spatialdata-plot, render_shapes(color=...) calls join_spatialelement_table(..., how="inner"), which hits this crash. Users with MERFISH data are affected because their tables have EntityID as both the obs index name and an obs column, a state that spatialdata's own validation allows.

Reported in scverse/spatialdata-plot#441.
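
Until the join functions handle this case, affected users may be able to sidestep the crash by clearing the obs index name before plotting. The following is a pandas-only sketch, not a confirmed workaround: the DataFrame is hypothetical stand-in data for table.obs, and it assumes nothing downstream relies on the index being named.

```python
import pandas as pd

# Hypothetical stand-in for table.obs in the affected state:
# "EntityID" is both the index name and a column.
obs = pd.DataFrame(
    {"EntityID": [0, 1, 2], "cell_type": ["A", "B", "C"]},
    index=pd.Index([0, 1, 2], name="EntityID"),
)

# Detect the collision between the index name and the columns.
has_collision = obs.index.name is not None and obs.index.name in obs.columns

if has_collision:
    # Clear only the index name; index values and the column stay untouched.
    obs.index.name = None

# reset_index() now succeeds and inserts the old index as a column named "index".
flat = obs.reset_index()
```

With the name cleared, reset_index() falls back to pandas' default column name for an unnamed index, so the existing EntityID column is left alone.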

Minimal reproduction

import pandas as pd
from anndata import AnnData
from spatialdata.models import TableModel

obs = pd.DataFrame({
    "region": pd.Categorical(["shapes"] * 5),
    "EntityID": [0, 1, 2, 3, 4],
    "cell_type": ["A", "B", "C", "A", "B"],
})
table = AnnData(obs=obs)
table = TableModel.parse(table, region="shapes", region_key="region", instance_key="EntityID")

# Simulate the state found in real MERFISH data loaded from disk
table.obs.index = pd.Index([0, 1, 2, 3, 4], name="EntityID")

# This is what join_spatialelement_table does internally; it crashes here:
table.obs.reset_index()
# ValueError: cannot insert EntityID, already exists

Suggested fix

In _inner_join_spatialelement_table (line 390) and _left_join_spatialelement_table (line 471), handle the collision before calling reset_index(). For example, discard the index instead of inserting it as a column when its name already exists as a column:

obs = table.obs
if obs.index.name is not None and obs.index.name in obs.columns:
    obs = obs.reset_index(drop=True)
else:
    obs = obs.reset_index()
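
The branch above can be exercised without spatialdata. The sketch below wraps it in a helper and runs it on both a colliding and a non-colliding frame. The helper name reset_obs_index_safely and the sample data are hypothetical and not part of spatialdata.

```python
import pandas as pd


def reset_obs_index_safely(obs: pd.DataFrame) -> pd.DataFrame:
    """Reset the index, discarding it when its name collides with a column.

    Hypothetical helper for illustration; not part of spatialdata.
    """
    if obs.index.name is not None and obs.index.name in obs.columns:
        # The column of the same name is kept; the index values are discarded.
        return obs.reset_index(drop=True)
    return obs.reset_index()


# Colliding case: index name and column are both "EntityID".
colliding = pd.DataFrame(
    {"EntityID": [0, 1, 2], "cell_type": ["A", "B", "C"]},
    index=pd.Index([0, 1, 2], name="EntityID"),
)
fixed = reset_obs_index_safely(colliding)

# Non-colliding case: the named index becomes a regular column, as before.
plain = pd.DataFrame(
    {"cell_type": ["A", "B"]},
    index=pd.Index([10, 11], name="cell_id"),
)
unchanged = reset_obs_index_safely(plain)
```

Note that drop=True throws away the index values. That loses nothing when the index merely duplicates the column, as in the reproduction, but if the two can differ, renaming the index before resetting may be the safer choice.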
