Feature: Add Polars DataFrame support (#1133)




**Depends on**: #1132 (both can land in one PR)  
**Original request**: #1124

## Summary

Add support for `polars.DataFrame` and `polars.LazyFrame` as inputs to `edges()`, `nodes()`, `plot()`, `hypergraph()`, `materialize_nodes()`, and `get_degrees()`. Polars is optional — no behavior change if not installed.

## Current behavior

```python
import polars as pl
import graphistry

edges = pl.DataFrame({'src': ['a', 'b'], 'dst': ['b', 'c']})
graphistry.edges(edges, 'src', 'dst').plot()  # ❌ Polars frames are not accepted today
```

## Requested behavior

```python
import polars as pl
import graphistry

edges = pl.DataFrame({'src': ['a', 'b', 'c'], 'dst': ['b', 'c', 'a'], 'weight': [1, 2, 3]})
nodes = pl.DataFrame({'id': ['a', 'b', 'c'], 'label': ['Alice', 'Bob', 'Carol']})

g = graphistry.edges(edges, 'src', 'dst').nodes(nodes, 'id')
g.plot()               # ✅
g.materialize_nodes()  # ✅
g.get_degrees()        # ✅
g.hypergraph(edges)    # ✅

# A LazyFrame is collected automatically
graphistry.edges(edges.lazy(), 'src', 'dst').plot()  # ✅
```

## What's not in scope

Compute methods return pandas-backed results, not Polars. Convert back if needed:
```python
pl.from_pandas(g.materialize_nodes()._nodes)
```

`featurize()` and `umap()` are out of scope — separate issue.

## Workaround until fixed

```python
graphistry.edges(pl_df.to_pandas(), 'src', 'dst').plot()
graphistry.edges(pl_lazy.collect().to_pandas(), 'src', 'dst').plot()
```
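
If the same call sites receive a mix of pandas and Polars inputs, the workaround can be wrapped in a small helper. The sketch below is illustrative only: `to_pandas_if_polars` is a hypothetical name, not part of graphistry, and it assumes pandas and pyarrow are installed whenever a Polars frame is passed in.

```python
def to_pandas_if_polars(table):
    """Hypothetical helper: convert Polars inputs to pandas, pass anything else through."""
    try:
        import polars as pl
    except ImportError:
        # Polars is not installed, so `table` cannot be a Polars frame.
        return table
    if isinstance(table, pl.LazyFrame):
        # Materialize the lazy query before converting.
        table = table.collect()
    if isinstance(table, pl.DataFrame):
        # Needs pandas and pyarrow available at runtime.
        return table.to_pandas()
    return table
```

With this in place, `graphistry.edges(to_pandas_if_polars(df), 'src', 'dst')` would work for both kinds of input, and the helper can be deleted once native support lands.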

## Design

Polars is an *input format*, not a compute engine. Like Arrow and Spark, it gets converted at path boundaries — it doesn't stay native throughout. There are two conversion paths, each needing a Polars branch:

**Upload path** (`plot()`): Polars → Arrow via `.to_arrow()`. No pandas intermediate. Efficient and lossless. `LazyFrame` is materialized first via `.collect()`, the same way dask uses `.compute()`.

**Compute/hypergraph paths** (`materialize_nodes`, `hypergraph`, etc.): Polars → pandas via `.to_pandas()`. These paths operate on live pandas/cuDF engines. Once `_table_to_pandas()` handles Polars, the coerce-at-entry fixes from #1132 cover these paths automatically — no extra Polars-specific code needed there.

## Implementation

All changes are in `graphistry/PlotterBase.py` unless noted. Follow the existing `maybe_cudf()` / cudf branch pattern exactly.

**1. `maybe_polars()` lazy import** — add after `maybe_spark()` (~line 142):

```python
@lru_cache(maxsize=1)
def maybe_polars():
    try:
        import polars
        return polars
    except ImportError:
        pass
    except RuntimeError:
        logger.warning('Runtime error importing polars', exc_info=True)
    return None
```

**2. Memoization cache** — add with the other caches (~line 166):

```python
_polars_hash_to_arrow: WeakValueDictionary = WeakValueDictionary()
```

And clear it in `reset_caches()`.
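
As a minimal, self-contained sketch of how this cache is expected to behave, the following uses stand-ins throughout: `CacheDemo` plays the role of `PlotterBase`, and the `WeakValueWrapper` defined here is a simplified assumption modeled on the `.v` access used in step 3, not the library's own class.

```python
import gc
from weakref import WeakValueDictionary


class WeakValueWrapper:
    """Simplified stand-in: holds the cached value in `.v` (assumed shape)."""
    def __init__(self, v):
        self.v = v


class CacheDemo:
    """Stand-in for PlotterBase, showing only the new cache and its reset."""
    _polars_hash_to_arrow: WeakValueDictionary = WeakValueDictionary()

    @staticmethod
    def reset_caches():
        CacheDemo._polars_hash_to_arrow.clear()


# Store a wrapped value under a content hash, then read it back through `.v`.
w = WeakValueWrapper('stand-in for a pyarrow.Table')
CacheDemo._polars_hash_to_arrow['hash-1'] = w
hit = CacheDemo._polars_hash_to_arrow['hash-1'].v

# reset_caches() empties the cache even while `w` is still alive.
CacheDemo.reset_caches()
empty_after_reset = len(CacheDemo._polars_hash_to_arrow) == 0

# Entries also disappear once nothing else holds a strong reference to the wrapper.
CacheDemo._polars_hash_to_arrow['hash-2'] = w
del w
gc.collect()
evicted = 'hash-2' not in CacheDemo._polars_hash_to_arrow
```

The weak eviction at the end is presumably why step 3 also hands the wrapper to `cache_coercion(...)`: something has to keep a strong reference for an entry to survive between calls.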

**3. `_table_to_arrow()` — Polars branch** — add before the final `raise` (~line 2987):

```python
if not (maybe_polars() is None) and isinstance(table, (maybe_polars().DataFrame, maybe_polars().LazyFrame)):
    if isinstance(table, maybe_polars().LazyFrame):
        table = table.collect()
    hashed = None
    if memoize:
        hashed = (
            hashlib.sha256(table.hash_rows().to_numpy().tobytes()).hexdigest()
            + hashlib.sha256(str(table.columns).encode('utf-8')).hexdigest()
        )
        if hashed in PlotterBase._polars_hash_to_arrow:
            return PlotterBase._polars_hash_to_arrow[hashed].v
    # Strip schema metadata: Polars attaches Polars-specific metadata that can cause
    # downstream issues, the same reason the pandas branch calls replace_schema_metadata({})
    out = table.to_arrow().replace_schema_metadata({})
    if memoize and hashed is not None:
        w = WeakValueWrapper(out)
        cache_coercion(hashed, w)
        PlotterBase._polars_hash_to_arrow[hashed] = w
    return out
```

**4. `_table_to_pandas()` — Polars branch** — add before the final `raise` (~line 2834):

```python
if not (maybe_polars() is None) and isinstance(table, (maybe_polars().DataFrame, maybe_polars().LazyFrame)):
    if isinstance(table, maybe_polars().LazyFrame):
        table = table.collect()
    return table.to_pandas()
```

**5. `_plot_dispatch()` type guard** — extend the isinstance chain (~line 2700):

```python
or ( not (maybe_polars() is None) and isinstance(graph, (maybe_polars().DataFrame, maybe_polars().LazyFrame)) )
```

**6. `graphistry/Engine.py`: `resolve_engine()`** — add explicit Polars → PANDAS before the fallthrough (~line 70), matching the Arrow fix from #1132:

```python
if not (maybe_polars() is None) and isinstance(g_or_df, (maybe_polars().DataFrame, maybe_polars().LazyFrame)):
    return Engine.PANDAS
```

(The coerce-at-entry in `materialize_nodes` and `hypergraph` from #1132 then handles the actual conversion.)

## Testing

New file `tests/test_polars.py` with `pytest.importorskip('polars')` at the top:

- `_table_to_arrow(pl.DataFrame(...))` → returns `pa.Table`, schema metadata is empty
- `_table_to_arrow(pl.DataFrame(...).lazy())` → same
- `_table_to_pandas(pl.DataFrame(...))` → returns `pd.DataFrame`
- `edges(pl.DataFrame(...)).nodes(pl.DataFrame(...)).plot()` → mock/skip upload, assert no error before upload step
- `edges(pl.DataFrame(...)).materialize_nodes()` → returns result with pandas `_nodes`
- `hypergraph(pl.DataFrame(...))` → returns valid `Hypergraph`
- Memoization: calling `_table_to_arrow` twice on the same frame returns the same object
- Full suite passes with polars not installed (existing tests unaffected)
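
The first two bullets could be checked along the lines below. This is a sketch under assumptions: `polars_to_arrow` is a hypothetical stand-in for the proposed Polars branch of `_table_to_arrow()` (without memoization), and `deps_available` / `check_arrow_conversion` are illustrative names. The dependency guard is written with `importlib` so the snippet also runs where Polars is absent.

```python
import importlib.util


def deps_available(*names):
    """True only if every named top-level package can be imported."""
    return all(importlib.util.find_spec(n) is not None for n in names)


def polars_to_arrow(table):
    """Hypothetical stand-in for the proposed Polars branch (no memoization)."""
    import polars as pl
    if isinstance(table, pl.LazyFrame):
        table = table.collect()
    return table.to_arrow().replace_schema_metadata({})


def check_arrow_conversion():
    """Return True if the checks pass, or None if polars/pyarrow are missing."""
    if not deps_available('polars', 'pyarrow'):
        return None
    import polars as pl
    import pyarrow as pa

    df = pl.DataFrame({'src': ['a', 'b'], 'dst': ['b', 'c']})
    for table in (df, df.lazy()):  # eager and lazy inputs should behave the same
        out = polars_to_arrow(table)
        assert isinstance(out, pa.Table)
        assert not out.schema.metadata  # treat None and {} both as "empty"
        assert out.column_names == ['src', 'dst']
    return True
```

In `tests/test_polars.py` itself, the guard would be replaced by `pytest.importorskip('polars')` and the checks would call `_table_to_arrow()` directly. Asserting `not out.schema.metadata` avoids depending on whether pyarrow reports stripped metadata as `None` or as an empty mapping.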
