⚡️ Speed up function get_zero_positions by 15% #61

Open
codeflash-ai[bot] wants to merge 1 commit into main from
codeflash/optimize-get_zero_positions-mglmz8ns

codeflash-ai bot commented Oct 11, 2025

📄 15% (0.15x) speedup for get_zero_positions in graphrag/index/operations/layout_graph/zero.py

⏱️ Runtime : 4.44 milliseconds -> 3.87 milliseconds (best of 315 runs)
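
The percentages in this report appear to follow `original / optimized - 1`, with negative values printed as "slower". That formula is inferred from the published figures rather than taken from Codeflash documentation, and `speedup` and `describe` are hypothetical helper names. A minimal sketch:

```python
def speedup(original: float, optimized: float) -> float:
    """Relative speedup as a fraction; negative means the optimized code is slower.

    Assumption: the formula is inferred from the numbers in this report.
    """
    return original / optimized - 1.0


def describe(original: float, optimized: float) -> str:
    """Format a runtime pair in the style of the per-test comments."""
    pct = speedup(original, optimized) * 100
    direction = "faster" if pct >= 0 else "slower"
    return f"{abs(pct):.1f}% {direction}"


print(describe(4.44, 3.87))    # headline runtime pair, in milliseconds
print(describe(480.0, 365.0))  # 1000-node default case, in microseconds
print(describe(2.51, 3.19))    # single-node case, in microseconds
```

With the rounded runtimes shown here, the headline pair works out to about 14.7%, which the title rounds to 15%. Per-test figures can differ from the report in the last digit because the report presumably uses unrounded timings.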

📝 Explanation and details

The optimization achieves a roughly 15% speedup (4.44 ms to 3.87 ms) by eliminating redundant operations within the main loop and moving preprocessing outside of it.

Key optimizations:

  1. Pre-compute category and size values: Instead of checking `if node_categories is None` and `if node_sizes is None` inside the loop for every node, the optimized version pre-processes these values once at the beginning. This eliminates 18,000+ conditional checks in the original profiler results.

  2. Batch string conversions: Category values are converted to strings once using list comprehensions (`[str(int(cat)) for cat in node_categories]`) rather than calling `str(int(node_category))` for each node individually.

  3. List comprehension instead of append: The optimized version uses list comprehensions to build the result list directly, which is more efficient than repeatedly calling `append()` on an initially empty list.
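
The three changes above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: the diff is not shown here, so `zero_positions_loop`, `zero_positions_precomputed`, and the stand-in `NodePosition` are hypothetical and reconstructed from the explanation and the generated tests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodePosition:
    """Stand-in for the project's NodePosition (fields taken from the tests)."""
    label: str
    x: int
    y: int
    cluster: str
    size: int
    z: Optional[int] = None


def zero_positions_loop(node_labels, node_categories=None, node_sizes=None, three_d=False):
    """Hypothetical 'before': None checks and append() on every iteration."""
    positions = []
    for index, name in enumerate(node_labels):
        category = 1 if node_categories is None else node_categories[index]
        size = 1 if node_sizes is None else node_sizes[index]
        positions.append(NodePosition(
            label=str(name), x=0, y=0,
            cluster=str(int(category)), size=int(size),
            z=0 if three_d else None,
        ))
    return positions


def zero_positions_precomputed(node_labels, node_categories=None, node_sizes=None, three_d=False):
    """Hypothetical 'after': defaults and conversions resolved once, up front."""
    count = len(node_labels)
    clusters = ["1"] * count if node_categories is None else [str(int(c)) for c in node_categories]
    sizes = [1] * count if node_sizes is None else [int(s) for s in node_sizes]
    z = 0 if three_d else None
    # Index instead of zip() so a too-short list still raises IndexError.
    return [
        NodePosition(label=str(name), x=0, y=0, cluster=clusters[i], size=sizes[i], z=z)
        for i, name in enumerate(node_labels)
    ]


labels = ["A", "B", "C"]
print(zero_positions_loop(labels) == zero_positions_precomputed(labels))
```

One design consequence of this sketch: converting every category and size up front touches entries beyond the last label, so a non-numeric value at an index the loop would never reach raises `ValueError` in the precomputed variant only.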

Performance impact by test case type:

  • Large-scale tests (999+ nodes): Show the best improvements (5-32% faster) because the preprocessing overhead is amortized across many nodes
  • Small-scale tests (1-3 nodes): Show regressions of typically 10-36% because the upfront preprocessing cost is not offset by the reduced per-node work; empty inputs regress the most (58-67% slower)
  • None (default) categories and sizes: Benefit the most at scale (28-32% faster on roughly 1,000 nodes) because the None checks are handled once instead of per iteration; small inputs with None values still regress
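
The size-dependent behavior above can be reproduced with a small timing harness. This is a rough sketch, not Codeflash's benchmarking method: `best_of`, `per_item_checks`, `precomputed`, and `compare` are hypothetical names, the workloads build plain tuples rather than `NodePosition` objects, and absolute numbers will vary by machine.

```python
import timeit


def best_of(func, *args, repeat=5, number=50):
    """Best (minimum) time per call in seconds over `repeat` rounds."""
    rounds = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(rounds) / number


def per_item_checks(labels, sizes=None):
    """Hypothetical workload: None check and append() on every iteration."""
    out = []
    for index, label in enumerate(labels):
        size = 1 if sizes is None else sizes[index]
        out.append((str(label), int(size)))
    return out


def precomputed(labels, sizes=None):
    """Hypothetical workload: resolve defaults once, then build the list."""
    size_values = [1] * len(labels) if sizes is None else [int(s) for s in sizes]
    return [(str(label), size_values[index]) for index, label in enumerate(labels)]


def compare(node_counts=(1, 3, 1000)):
    """Map each node count to (loop_seconds, precomputed_seconds)."""
    results = {}
    for count in node_counts:
        labels = [f"n{i}" for i in range(count)]
        results[count] = (best_of(per_item_checks, labels), best_of(precomputed, labels))
    return results


if __name__ == "__main__":
    for count, (loop_time, pre_time) in compare().items():
        print(f"{count:>5} nodes: loop {loop_time * 1e6:.2f}us, precomputed {pre_time * 1e6:.2f}us")
```

Taking the minimum over several rounds mirrors the "best of N runs" convention in the runtime line and reduces noise from other processes.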

The line profiler confirms this: the original code spent 29.7% of time in `NodePosition()` constructor calls within the loop, while the optimized version reduces this overhead through better data preparation and more efficient list construction patterns.

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 45 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 2 Passed |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from dataclasses import dataclass

# imports
import pytest  # used for our unit tests
from graphrag.index.operations.layout_graph.zero import get_zero_positions

# function to test
# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License

# Define NodePosition for tests (as in the original context)
@dataclass
class NodePosition:
    label: str
    x: int
    y: int
    cluster: str
    size: int
    z: int | None = None  # Optional, only for 3D

# unit tests

# ----------- BASIC TEST CASES -----------

def test_basic_single_node_2d():
    # Single node, default categories and sizes, 2D
    codeflash_output = get_zero_positions(["A"]); result = codeflash_output # 2.51μs -> 3.19μs (21.2% slower)
    pos = result[0]

def test_basic_multiple_nodes_2d():
    # Multiple nodes, default categories and sizes, 2D
    codeflash_output = get_zero_positions(["A", "B", "C"]); result = codeflash_output # 3.83μs -> 4.41μs (13.1% slower)
    for i, label in enumerate(["A", "B", "C"]):
        pos = result[i]

def test_basic_single_node_3d():
    # Single node, default categories and sizes, 3D
    codeflash_output = get_zero_positions(["A"], three_d=True); result = codeflash_output # 2.78μs -> 3.75μs (25.9% slower)
    pos = result[0]

def test_basic_multiple_nodes_with_categories_and_sizes_2d():
    # Multiple nodes, explicit categories and sizes, 2D
    labels = ["A", "B", "C"]
    categories = [5, 6, 7]
    sizes = [10, 20, 30]
    codeflash_output = get_zero_positions(labels, categories, sizes); result = codeflash_output # 4.03μs -> 5.43μs (25.8% slower)
    for i in range(3):
        pos = result[i]

def test_basic_multiple_nodes_with_categories_and_sizes_3d():
    # Multiple nodes, explicit categories and sizes, 3D
    labels = ["A", "B", "C"]
    categories = [5, 6, 7]
    sizes = [10, 20, 30]
    codeflash_output = get_zero_positions(labels, categories, sizes, three_d=True); result = codeflash_output # 4.34μs -> 5.80μs (25.1% slower)
    for i in range(3):
        pos = result[i]

# ----------- EDGE TEST CASES -----------

def test_empty_labels():
    # No nodes at all
    codeflash_output = get_zero_positions([]); result = codeflash_output # 740ns -> 1.82μs (59.3% slower)

def test_none_categories_and_sizes():
    # Explicitly pass None for categories and sizes
    labels = ["A", "B"]
    codeflash_output = get_zero_positions(labels, None, None); result = codeflash_output # 3.39μs -> 4.16μs (18.4% slower)
    for pos in result:
        pass

def test_single_node_custom_category_and_size():
    # Single node, custom category and size
    codeflash_output = get_zero_positions(["X"], [42], [99]); result = codeflash_output # 2.47μs -> 3.85μs (36.0% slower)
    pos = result[0]

def test_category_and_size_are_zero():
    # Category and size are zero
    codeflash_output = get_zero_positions(["Zero"], [0], [0]); result = codeflash_output # 2.44μs -> 3.68μs (33.7% slower)
    pos = result[0]

def test_negative_category_and_size():
    # Negative values for category and size
    codeflash_output = get_zero_positions(["Neg"], [-3], [-7]); result = codeflash_output # 2.48μs -> 3.64μs (31.8% slower)
    pos = result[0]

def test_mismatched_lengths_raises():
    # Mismatched input lengths should raise IndexError
    with pytest.raises(IndexError):
        get_zero_positions(["A", "B"], [1], [2, 3]) # 2.98μs -> 4.39μs (32.2% slower)

def test_non_string_labels():
    # Labels are not strings (should be cast to string)
    codeflash_output = get_zero_positions([123, None, True]); result = codeflash_output # 4.17μs -> 5.15μs (19.0% slower)

def test_three_d_flag_none():
    # three_d=None should act as False (2D)
    codeflash_output = get_zero_positions(["A"], three_d=None); result = codeflash_output # 2.64μs -> 3.50μs (24.6% slower)
    pos = result[0]

def test_three_d_flag_true_and_none_categories_sizes():
    # 3D, but categories and sizes are None
    codeflash_output = get_zero_positions(["A", "B"], None, None, three_d=True); result = codeflash_output # 3.79μs -> 4.33μs (12.6% slower)
    for pos in result:
        pass

def test_large_integer_category_and_size():
    # Very large integers for category and size
    big = 2**60
    codeflash_output = get_zero_positions(["Big"], [big], [big]); result = codeflash_output # 2.70μs -> 4.09μs (33.9% slower)
    pos = result[0]

# ----------- LARGE SCALE TEST CASES -----------

def test_large_number_of_nodes_2d():
    # 1000 nodes, default categories and sizes, 2D
    labels = [f"n{i}" for i in range(1000)]
    codeflash_output = get_zero_positions(labels); result = codeflash_output # 480μs -> 365μs (31.5% faster)
    for i in range(0, 1000, 100):  # Check every 100th for efficiency
        pos = result[i]

def test_large_number_of_nodes_3d_with_categories_and_sizes():
    # 1000 nodes, explicit categories and sizes, 3D
    labels = [f"n{i}" for i in range(1000)]
    categories = [i % 10 for i in range(1000)]
    sizes = [i for i in range(1000)]
    codeflash_output = get_zero_positions(labels, categories, sizes, three_d=True); result = codeflash_output # 496μs -> 468μs (5.91% faster)
    # Test a few random indices
    for i in [0, 123, 456, 999]:
        pos = result[i]

def test_large_number_of_nodes_with_non_string_labels():
    # 1000 nodes, labels are integers
    labels = list(range(1000))
    codeflash_output = get_zero_positions(labels); result = codeflash_output # 477μs -> 373μs (27.7% faster)
    for i in [0, 100, 500, 999]:
        pos = result[i]

def test_large_number_of_nodes_edge_categories_sizes():
    # 1000 nodes, categories and sizes are all negative
    labels = [f"n{i}" for i in range(1000)]
    categories = [-i for i in range(1000)]
    sizes = [-i for i in range(1000)]
    codeflash_output = get_zero_positions(labels, categories, sizes); result = codeflash_output # 478μs -> 437μs (9.36% faster)
    for i in [0, 250, 999]:
        pos = result[i]

def test_large_number_of_nodes_with_mixed_types():
    # 1000 nodes, labels are mixed types
    labels = [i if i % 2 == 0 else f"str{i}" for i in range(1000)]
    codeflash_output = get_zero_positions(labels); result = codeflash_output # 476μs -> 372μs (27.9% faster)
    for i in [0, 1, 998, 999]:
        expected_label = str(labels[i])
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from dataclasses import dataclass
from typing import Optional

# imports
import pytest  # used for our unit tests
from graphrag.index.operations.layout_graph.zero import get_zero_positions

# function to test
# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License


@dataclass
class NodePosition:
    label: str
    x: int
    y: int
    cluster: str
    size: int
    z: Optional[int] = None  # Only present if 3D

# unit tests

# ---------------------- BASIC TEST CASES ----------------------

def test_basic_single_node_default():
    # Single node, default params (2D)
    codeflash_output = get_zero_positions(["A"]); result = codeflash_output # 3.06μs -> 4.35μs (29.6% slower)
    pos = result[0]

def test_basic_multiple_nodes_default():
    # Multiple nodes, default params (2D)
    codeflash_output = get_zero_positions(["A", "B", "C"]); result = codeflash_output # 4.10μs -> 4.70μs (12.7% slower)
    for i, label in enumerate(["A", "B", "C"]):
        pos = result[i]

def test_basic_with_categories_and_sizes():
    # Multiple nodes, with categories and sizes
    labels = ["A", "B", "C"]
    cats = [3, 7, 2]
    sizes = [10, 20, 30]
    codeflash_output = get_zero_positions(labels, node_categories=cats, node_sizes=sizes); result = codeflash_output # 4.36μs -> 5.88μs (25.8% slower)
    for i in range(3):
        pos = result[i]

def test_basic_3d_default():
    # Single node, 3D output
    codeflash_output = get_zero_positions(["A"], three_d=True); result = codeflash_output # 2.89μs -> 4.02μs (28.0% slower)
    pos = result[0]

def test_basic_3d_with_categories_and_sizes():
    # Multiple nodes, 3D, with categories and sizes
    labels = ["X", "Y"]
    cats = [5, 6]
    sizes = [11, 22]
    codeflash_output = get_zero_positions(labels, node_categories=cats, node_sizes=sizes, three_d=True); result = codeflash_output # 3.88μs -> 5.33μs (27.1% slower)
    for i in range(2):
        pos = result[i]

# ---------------------- EDGE TEST CASES ----------------------

def test_empty_node_labels():
    # Empty node_labels should return an empty list
    codeflash_output = get_zero_positions([]) # 783ns -> 1.85μs (57.8% slower)

def test_empty_node_labels_with_other_params():
    # Empty node_labels, but non-empty categories/sizes (should still return empty)
    codeflash_output = get_zero_positions([], node_categories=[1,2], node_sizes=[3,4]) # 947ns -> 2.89μs (67.3% slower)

def test_node_labels_with_empty_categories_and_sizes():
    # node_labels non-empty, but empty categories/sizes (should raise IndexError)
    with pytest.raises(IndexError):
        get_zero_positions(["A", "B"], node_categories=[], node_sizes=[]) # 1.49μs -> 2.94μs (49.4% slower)

def test_node_labels_with_partial_categories():
    # node_labels longer than categories (should raise IndexError)
    with pytest.raises(IndexError):
        get_zero_positions(["A", "B", "C"], node_categories=[1,2]) # 4.47μs -> 5.90μs (24.2% slower)

def test_node_labels_with_partial_sizes():
    # node_labels longer than sizes (should raise IndexError)
    with pytest.raises(IndexError):
        get_zero_positions(["A", "B", "C"], node_sizes=[1,2]) # 4.10μs -> 5.27μs (22.1% slower)

def test_node_labels_with_none_and_empty_categories():
    # node_categories explicitly None, node_sizes empty (should raise IndexError)
    with pytest.raises(IndexError):
        get_zero_positions(["A"], node_categories=None, node_sizes=[]) # 1.49μs -> 2.92μs (49.1% slower)

def test_non_integer_categories_and_sizes():
    # node_categories and node_sizes as strings that can be cast to int
    labels = ["A"]
    cats = ["2"]
    sizes = ["5"]
    codeflash_output = get_zero_positions(labels, node_categories=cats, node_sizes=sizes); result = codeflash_output # 3.19μs -> 4.54μs (29.7% slower)

def test_non_integer_categories_and_sizes_fail():
    # node_categories and node_sizes as strings that cannot be cast to int
    labels = ["A"]
    cats = ["foo"]
    sizes = ["bar"]
    with pytest.raises(ValueError):
        get_zero_positions(labels, node_categories=cats, node_sizes=sizes) # 3.85μs -> 3.69μs (4.34% faster)

def test_node_labels_with_none_and_none_sizes():
    # node_categories and node_sizes both None, should default to 1
    codeflash_output = get_zero_positions(["A", "B"], node_categories=None, node_sizes=None); result = codeflash_output # 4.09μs -> 4.98μs (17.9% slower)

def test_node_labels_with_negative_sizes():
    # Negative sizes should be accepted and cast to int
    codeflash_output = get_zero_positions(["A"], node_sizes=[-5]); result = codeflash_output # 2.77μs -> 4.11μs (32.5% slower)

def test_node_labels_with_negative_categories():
    # Negative categories should be accepted and cast to str
    codeflash_output = get_zero_positions(["A"], node_categories=[-10]); result = codeflash_output # 2.80μs -> 4.12μs (32.2% slower)

def test_node_labels_with_boolean_categories_and_sizes():
    # Boolean values should be cast to int (True=1, False=0)
    codeflash_output = get_zero_positions(["A", "B"], node_categories=[True, False], node_sizes=[False, True]); result = codeflash_output # 3.73μs -> 5.19μs (28.1% slower)

def test_node_labels_with_non_str_labels():
    # Labels that are not strings (should be cast to str)
    codeflash_output = get_zero_positions([10, None, 3.5]); result = codeflash_output # 5.60μs -> 6.22μs (10.00% slower)

def test_3d_with_missing_z_attribute():
    # 3D mode: z attribute should always be present and 0
    codeflash_output = get_zero_positions(["A", "B"], three_d=True); result = codeflash_output # 3.82μs -> 4.62μs (17.2% slower)
    for pos in result:
        pass

def test_2d_with_no_z_attribute():
    # 2D mode: z attribute should not be present or should be None
    codeflash_output = get_zero_positions(["A"], three_d=False); result = codeflash_output # 2.58μs -> 3.51μs (26.5% slower)
    pos = result[0]

# ---------------------- LARGE SCALE TEST CASES ----------------------

def test_large_number_of_nodes_2d():
    # Large input (999 nodes), 2D
    n = 999
    labels = [f"node_{i}" for i in range(n)]
    cats = [i % 5 for i in range(n)]
    sizes = [i+1 for i in range(n)]
    codeflash_output = get_zero_positions(labels, node_categories=cats, node_sizes=sizes); result = codeflash_output # 481μs -> 450μs (6.88% faster)
    for i in (0, n//2, n-1):  # spot check a few
        pos = result[i]

def test_large_number_of_nodes_3d():
    # Large input (999 nodes), 3D
    n = 999
    labels = [f"node_{i}" for i in range(n)]
    cats = [i % 3 for i in range(n)]
    sizes = [2 for _ in range(n)]
    codeflash_output = get_zero_positions(labels, node_categories=cats, node_sizes=sizes, three_d=True); result = codeflash_output # 489μs -> 450μs (8.69% faster)
    for i in (0, n//2, n-1):  # spot check a few
        pos = result[i]

def test_large_scale_default_params():
    # Large input, default categories and sizes
    n = 999
    labels = [str(i) for i in range(n)]
    codeflash_output = get_zero_positions(labels); result = codeflash_output # 455μs -> 346μs (31.6% faster)
    for i in (0, n//2, n-1):
        pos = result[i]

def test_large_scale_edge_case_empty_lists():
    # Large but empty lists for categories/sizes (should raise IndexError)
    with pytest.raises(IndexError):
        get_zero_positions(["A"] * 999, node_categories=[], node_sizes=[]) # 1.61μs -> 3.11μs (48.1% slower)

def test_large_scale_boolean_categories_and_sizes():
    # Large input, boolean categories and sizes
    n = 999
    labels = [f"n{i}" for i in range(n)]
    cats = [i % 2 == 0 for i in range(n)]
    sizes = [i % 2 == 1 for i in range(n)]
    codeflash_output = get_zero_positions(labels, node_categories=cats, node_sizes=sizes); result = codeflash_output # 486μs -> 445μs (9.31% faster)
    for i in (0, n//2, n-1):
        pos = result[i]
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from graphrag.index.operations.layout_graph.zero import get_zero_positions

def test_get_zero_positions():
    get_zero_positions([''], node_categories=[-10], node_sizes=[0], three_d=True)

def test_get_zero_positions_2():
    get_zero_positions([''], node_categories=None, node_sizes=None, three_d=None)
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|--------------------------|-------------|--------------|---------|
| codeflash_concolic_3eu3lmds/tmp1ijlmd32/test_concolic_coverage.py::test_get_zero_positions | 3.33μs | 4.78μs | -30.2%⚠️ |
| codeflash_concolic_3eu3lmds/tmp1ijlmd32/test_concolic_coverage.py::test_get_zero_positions_2 | 2.67μs | 3.96μs | -32.6%⚠️ |

To edit these changes git checkout codeflash/optimize-get_zero_positions-mglmz8ns and push.

codeflash-ai bot requested a review from mashraf-222 October 11, 2025 02:08
codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 11, 2025