GFQLSchemaError filtering on boolean label__ columns #876

@lmeyerov

Description

Summary

Filtering on boolean label columns in a GFQL chain raises GFQLSchemaError claiming the label column is a string, even though the dataframe shows a bool dtype. This blocks label predicates on relationship endpoints (e.g., MATCH (:A)-[r]->(:B)).

Repro (minimal)

import pandas as pd
from graphistry.compute import n, e_forward
from graphistry.tests.test_compute import CGFull
from tests.cypher_tck.parse_cypher import graph_fixture_from_create

fixture = graph_fixture_from_create(
    """
    CREATE (:A)-[:T1]->(:B),
           (:B)-[:T2]->(:A),
           (:B)-[:T3]->(:B),
           (:A)-[:T4]->(:A)
    """
)

g = CGFull()

nodes_df = pd.DataFrame(fixture.nodes)
labels = nodes_df.get("labels", pd.Series([], dtype=object)).tolist()
all_labels = sorted({label for labs in labels for label in (labs or [])})
for label in all_labels:
    nodes_df[f"label__{label}"] = [label in (labs or []) for labs in labels]

edges_df = pd.DataFrame(fixture.edges)

g = g.nodes(nodes_df, "id").edges(edges_df, "src", "dst", edge="edge_id")

g.gfql([n({"label__A": True}), e_forward(), n({"label__B": True})], engine="pandas")

Actual

GFQLSchemaError: [incompatible-column-type] Type mismatch: column "label__B" is string but filter value is numeric

Note: nodes_df.dtypes reports label__B as bool immediately before the g.gfql(...) call.
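For reference, the dtype claim can be checked without graphistry. This is a minimal pandas-only sketch: the three-row node table is a hypothetical stand-in for fixture.nodes (the real ids and label lists are not shown here), and only the label__ derivation mirrors the repro.

```python
import pandas as pd

# Hypothetical stand-in for fixture.nodes; ids and label lists are assumed.
nodes_df = pd.DataFrame({
    "id": ["n0", "n1", "n2"],
    "labels": [["A"], ["B"], None],
})

# Same derivation as in the repro: one boolean column per distinct label.
labels = nodes_df["labels"].tolist()
all_labels = sorted({label for labs in labels for label in (labs or [])})
for label in all_labels:
    nodes_df[f"label__{label}"] = [label in (labs or []) for labs in labels]

# Collect the dtype of every derived label column.
label_cols = [c for c in nodes_df.columns if c.startswith("label__")]
label_dtypes = {c: str(nodes_df[c].dtype) for c in label_cols}
print(label_dtypes)
```

If the columns come out as bool here, as they should for plain Python booleans, the string typing reported in the error would have to arise somewhere inside the chain rather than in the input frame.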

Expected

The boolean label columns should remain boolean through the chain, and filtering on label__B=True should succeed.
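To make the expected result concrete, here is a rough plain-pandas equivalent of MATCH (:A)-[r]->(:B) on a graph shaped like the repro. It assumes the CREATE statement yields eight nodes and four edges; the integer ids and the edge_id values are made up for illustration and are not the fixture's actual output. It does not use GFQL and only sketches which edge the chain is expected to keep.

```python
import pandas as pd

# Assumed node table: each CREATE pattern contributes two fresh nodes.
nodes_df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4, 5, 6, 7],
    "label__A": [True, False, False, True, False, False, True, True],
    "label__B": [False, True, True, False, True, True, False, False],
})

# Assumed edge table: T1 is A->B, T2 is B->A, T3 is B->B, T4 is A->A.
edges_df = pd.DataFrame({
    "edge_id": ["T1", "T2", "T3", "T4"],
    "src": [0, 2, 4, 6],
    "dst": [1, 3, 5, 7],
})

# Boolean columns can be used directly as row masks.
a_ids = nodes_df.loc[nodes_df["label__A"], "id"]
b_ids = nodes_df.loc[nodes_df["label__B"], "id"]

# Keep edges whose source is an :A node and whose destination is a :B node.
matched = edges_df[edges_df["src"].isin(a_ids) & edges_df["dst"].isin(b_ids)]
print(matched["edge_id"].tolist())
```

Under these assumptions only the T1 edge survives, which is what the GFQL chain in the repro would be expected to return once the type check accepts the boolean filter.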

Environment

  • commit: 7b721c9
  • engine: pandas
  • branch: test/cypher-conformance

Context

This surfaced while translating openCypher TCK scenario Match2 [2]: MATCH (:A)-[r]->(:B) RETURN r.
