
⚡️ Speed up function create_callback_chain by 897% #59

Open
codeflash-ai[bot] wants to merge 1 commit into main from codeflash/optimize-create_callback_chain-mglk2zhg


@codeflash-ai codeflash-ai bot commented Oct 11, 2025

📄 897% (8.97x) speedup for create_callback_chain in graphrag/index/run/utils.py

⏱️ Runtime: 266 microseconds → 26.7 microseconds (best of 102 runs)

📝 Explanation and details

The optimization achieves a 896% speedup through two key changes:

1. Added __slots__ = ("_callbacks",) to WorkflowCallbacksManager
This eliminates the instance __dict__ overhead, reducing memory usage and improving attribute access speed for the _callbacks attribute.
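
To illustrate what `__slots__` changes, here is a minimal sketch using two hypothetical classes (`WithDict` and `WithSlots` are stand-ins invented for this example, not graphrag classes). It shows only the general Python behavior, not the actual `WorkflowCallbacksManager` code.

```python
class WithDict:
    """Ordinary class: each instance carries a per-instance __dict__."""

    def __init__(self):
        self._callbacks = []


class WithSlots:
    """Slotted class: attributes live in fixed slots, with no instance __dict__."""

    __slots__ = ("_callbacks",)

    def __init__(self):
        self._callbacks = []


plain = WithDict()
slotted = WithSlots()

print(hasattr(plain, "__dict__"))    # True
print(hasattr(slotted, "__dict__"))  # False

# A slotted instance rejects attributes that are not declared in __slots__.
try:
    slotted.extra = 1
except AttributeError:
    print("cannot add undeclared attribute")
```

One caveat worth checking in review: `__slots__` removes the instance `__dict__` only when every base class in the hierarchy is also slotted. If any base class lacks `__slots__`, instances still get a `__dict__`.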

2. Replaced loop-based registration with bulk extension
The original code used a loop to call register() for each callback:

for callback in callbacks or []:
    manager.register(callback)  # 6,141 calls to append()

The optimized version directly extends the internal list when callbacks exist:

if callbacks:
    manager._callbacks.extend(callbacks)  # Single bulk operation
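
The two snippets above can be compared side by side in a self-contained sketch. `CallbackManager`, `create_chain_loop`, and `create_chain_bulk` are hypothetical names used for illustration; the stand-in class assumes only what the PR description states, namely a `_callbacks` list and a `register()` method that appends to it.

```python
from typing import Iterable, Optional


class CallbackManager:
    """Hypothetical stand-in for WorkflowCallbacksManager (not the graphrag class)."""

    __slots__ = ("_callbacks",)

    def __init__(self) -> None:
        self._callbacks = []

    def register(self, callback) -> None:
        self._callbacks.append(callback)


def create_chain_loop(callbacks: Optional[Iterable[object]]) -> CallbackManager:
    """Original shape: one register() call per callback."""
    manager = CallbackManager()
    for callback in callbacks or []:
        manager.register(callback)
    return manager


def create_chain_bulk(callbacks: Optional[Iterable[object]]) -> CallbackManager:
    """Optimized shape: a single extend() on the internal list."""
    manager = CallbackManager()
    if callbacks:
        manager._callbacks.extend(callbacks)
    return manager


sample = ["a", "b", "c"]
# Both shapes produce the same stored callbacks, in the same order.
assert create_chain_loop(sample)._callbacks == create_chain_bulk(sample)._callbacks
# None and empty inputs leave the manager empty in both shapes.
assert create_chain_bulk(None)._callbacks == []
assert create_chain_loop([])._callbacks == []
```

The trade-off in the bulk version is that it writes to the private `_callbacks` list directly, so any logic a `register()` method might contain is bypassed.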

Why this is faster:

  • Eliminates method call overhead: Instead of 6,141+ calls to register() → append(), there's just one extend() call
  • Reduces loop iterations: The original profiler shows the loop (for callback in callbacks or []) consumed 7.8% of runtime across 6,181 iterations
  • Bulk list operations: extend() is optimized at the C level for adding multiple items efficiently
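
The points above can be checked locally with a rough micro-benchmark. This is a sketch using the standard-library `timeit` module and a hypothetical stand-in manager, not the profiler setup used for this PR; absolute timings will vary by machine and Python version.

```python
import timeit


class CallbackManager:
    """Hypothetical stand-in manager used only for this measurement."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        self._callbacks.append(callback)


def build_with_loop(items):
    """One Python-level method call per item."""
    manager = CallbackManager()
    for item in items:
        manager.register(item)
    return manager


def build_with_extend(items):
    """One bulk extend() call for all items."""
    manager = CallbackManager()
    manager._callbacks.extend(items)
    return manager


items = [object() for _ in range(1000)]

loop_seconds = timeit.timeit(lambda: build_with_loop(items), number=200)
extend_seconds = timeit.timeit(lambda: build_with_extend(items), number=200)

print(f"loop:   {loop_seconds:.6f} s for 200 runs")
print(f"extend: {extend_seconds:.6f} s for 200 runs")
```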

Performance impact by test case:

  • Large-scale tests see massive gains: Tests with 1000+ callbacks show 1746-2938% speedup
  • Small-scale tests see modest gains: Basic tests with few callbacks show 6-55% speedup
  • Edge cases benefit consistently: All test scenarios show improvement, with the optimization being most effective when processing many callbacks at once

The optimization is most effective for workflows that register many callbacks at once. The generated tests exercise that case heavily, but they do not by themselves show how often it occurs in real pipelines.
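
The "% faster" figures in this report appear to follow the convention (old − new) / new, which matches the per-test annotations (for example 988ns → 898ns is reported as 10.0% faster). A small sketch of that arithmetic, with `percent_faster` as a hypothetical helper name:

```python
def percent_faster(old_time: float, new_time: float) -> float:
    """Speedup as a percentage of the new runtime: (old - new) / new * 100."""
    return (old_time - new_time) / new_time * 100.0


# Per-test annotation from the report: 988ns -> 898ns
print(round(percent_faster(988, 898), 1))   # 10.0

# Headline runtimes from the report, in microseconds: 266 -> 26.7
print(round(percent_faster(266, 26.7), 1))  # about 896.3
```

Because the displayed runtimes are rounded, recomputing from them gives roughly 896% rather than the 897% in the title; the small gap is consistent with the report having used unrounded measurements.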

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 24 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from typing import Any

# imports
import pytest  # used for our unit tests
from graphrag.index.run.utils import create_callback_chain

# function to test
# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License

# Dummy base class for testing
class WorkflowCallbacks:
    """Base class for workflow callbacks."""
    def __init__(self, name: str = ""):
        self.name = name
        self.events = []

    def on_event(self, event: Any):
        """Dummy event handler."""
        self.events.append(event)

# unit tests

# ----------- Basic Test Cases -----------

def test_empty_callbacks_list_returns_manager_with_no_callbacks():
    # Test with empty list
    codeflash_output = create_callback_chain([]); manager = codeflash_output # 988ns -> 898ns (10.0% faster)

def test_none_callbacks_returns_manager_with_no_callbacks():
    # Test with None
    codeflash_output = create_callback_chain(None); manager = codeflash_output # 779ns -> 733ns (6.28% faster)








def test_manager_isolation_between_calls():
    # Ensure two calls to create_callback_chain are isolated
    cb1 = WorkflowCallbacks(name="cb1")
    cb2 = WorkflowCallbacks(name="cb2")
    codeflash_output = create_callback_chain([cb1]); manager1 = codeflash_output # 1.58μs -> 1.17μs (35.3% faster)
    codeflash_output = create_callback_chain([cb2]); manager2 = codeflash_output # 428ns -> 331ns (29.3% faster)







def test_manager_returns_workflowcallbackmanager_type():
    # Ensure the returned object is always WorkflowCallbacksManager
    cb = WorkflowCallbacks(name="cb")
    codeflash_output = create_callback_chain([cb]); manager = codeflash_output # 1.50μs -> 1.05μs (42.6% faster)

def test_manager_with_no_callbacks_still_has_manager_methods():
    # Ensure manager with no callbacks still has expected methods
    codeflash_output = create_callback_chain([]); manager = codeflash_output # 827ns -> 727ns (13.8% faster)


#------------------------------------------------
import pytest  # used for our unit tests
from graphrag.callbacks.workflow_callbacks import WorkflowCallbacks
from graphrag.callbacks.workflow_callbacks_manager import \
    WorkflowCallbacksManager
from graphrag.index.run.utils import create_callback_chain

# unit tests

class DummyCallback(WorkflowCallbacks):
    """A dummy WorkflowCallbacks implementation for testing."""
    def __init__(self, name):
        self.name = name

def get_callback_names(manager: WorkflowCallbacksManager):
    """Helper to extract names from a WorkflowCallbacksManager's _callbacks list."""
    return [cb.name for cb in manager._callbacks]

# ------------------- BASIC TEST CASES -------------------

def test_empty_list_returns_manager_with_no_callbacks():
    # Test with empty list
    codeflash_output = create_callback_chain([]); manager = codeflash_output # 1.12μs -> 946ns (18.8% faster)

def test_none_returns_manager_with_no_callbacks():
    # Test with None input
    codeflash_output = create_callback_chain(None); manager = codeflash_output # 831ns -> 699ns (18.9% faster)


def test_multiple_callbacks_returns_manager_with_all_callbacks():
    # Test with multiple callbacks
    cb1 = DummyCallback("cb1")
    cb2 = DummyCallback("cb2")
    cb3 = DummyCallback("cb3")
    codeflash_output = create_callback_chain([cb1, cb2, cb3]); manager = codeflash_output # 1.18μs -> 815ns (45.0% faster)

def test_callbacks_list_is_copied_not_referenced():
    # Test that the manager keeps its own list, not the input list
    cb1 = DummyCallback("cb1")
    cb2 = DummyCallback("cb2")
    input_list = [cb1, cb2]
    codeflash_output = create_callback_chain(input_list); manager = codeflash_output # 1.19μs -> 876ns (36.0% faster)
    input_list.append(DummyCallback("cb3"))

# ------------------- EDGE TEST CASES -------------------

def test_callbacks_list_with_duplicates():
    # Test with duplicate callbacks in the list
    cb1 = DummyCallback("cb1")
    cb2 = DummyCallback("cb2")
    codeflash_output = create_callback_chain([cb1, cb2, cb1, cb2]); manager = codeflash_output # 1.27μs -> 817ns (55.4% faster)

def test_callbacks_list_with_different_types():
    # Test with different WorkflowCallbacks subclasses
    class AnotherDummyCallback(WorkflowCallbacks):
        def __init__(self, name):
            self.name = name
    cb1 = DummyCallback("cb1")
    cb2 = AnotherDummyCallback("cb2")
    codeflash_output = create_callback_chain([cb1, cb2]); manager = codeflash_output # 1.09μs -> 828ns (31.6% faster)

def test_callbacks_list_with_non_workflow_callbacks_raises():
    # Test with an object that is not a WorkflowCallbacks instance
    class NotACallback:
        pass
    nac = NotACallback()
    with pytest.raises(AttributeError):
        # This will fail when trying to access nac.name in get_callback_names
        codeflash_output = create_callback_chain([nac]); manager = codeflash_output # 1.06μs -> 859ns (23.1% faster)
        get_callback_names(manager)

def test_callbacks_list_with_none_element():
    # Test with None as an element in the callbacks list
    cb1 = DummyCallback("cb1")
    with pytest.raises(AttributeError):
        codeflash_output = create_callback_chain([cb1, None]); manager = codeflash_output # 1.15μs -> 846ns (36.3% faster)
        get_callback_names(manager)

def test_callbacks_list_is_tuple():
    # Test with tuple input instead of list
    cb1 = DummyCallback("cb1")
    cb2 = DummyCallback("cb2")
    codeflash_output = create_callback_chain((cb1, cb2)); manager = codeflash_output # 1.12μs -> 812ns (38.2% faster)


def test_manager_isolation_between_calls():
    # Ensure that two managers do not share state
    cb1 = DummyCallback("cb1")
    cb2 = DummyCallback("cb2")
    codeflash_output = create_callback_chain([cb1]); manager1 = codeflash_output # 971ns -> 792ns (22.6% faster)
    codeflash_output = create_callback_chain([cb2]); manager2 = codeflash_output # 465ns -> 397ns (17.1% faster)

# ------------------- LARGE SCALE TEST CASES -------------------


def test_large_number_of_duplicate_callbacks():
    # Test with a large number of duplicate callbacks
    cb = DummyCallback("cbX")
    codeflash_output = create_callback_chain([cb] * 1000); manager = codeflash_output # 59.1μs -> 1.95μs (2938% faster)

def test_large_number_of_unique_types():
    # Test with many unique callback types
    callbacks = []
    for i in range(1000):
        # Dynamically create a new type for each callback
        callbacks.append(type(f"Dummy{i}", (WorkflowCallbacks,), {"name": f"cb{i}"})())
    codeflash_output = create_callback_chain(callbacks); manager = codeflash_output # 63.8μs -> 3.46μs (1746% faster)

def test_performance_large_scale(monkeypatch):
    # Test that function completes in reasonable time for large input
    import time
    callbacks = [DummyCallback(f"cb{i}") for i in range(1000)]
    start = time.time()
    codeflash_output = create_callback_chain(callbacks); manager = codeflash_output # 60.7μs -> 2.35μs (2480% faster)
    elapsed = time.time() - start

def test_manager_is_new_instance_each_time():
    # Ensure each call to create_callback_chain returns a new manager instance
    cb1 = DummyCallback("cb1")
    cb2 = DummyCallback("cb2")
    codeflash_output = create_callback_chain([cb1]); manager1 = codeflash_output # 1.13μs -> 912ns (23.9% faster)
    codeflash_output = create_callback_chain([cb2]); manager2 = codeflash_output # 420ns -> 320ns (31.2% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
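
The generated tests above are shown without their assertions. As an illustration of the kind of check the list-isolation test implies, here is a sketch that runs without graphrag installed. `CallbackManager` and `create_chain` are hypothetical stand-ins that mirror the optimized shape described in this PR, and plain strings stand in for callback objects.

```python
class CallbackManager:
    """Hypothetical stand-in for WorkflowCallbacksManager."""

    __slots__ = ("_callbacks",)

    def __init__(self):
        self._callbacks = []


def create_chain(callbacks):
    """Stand-in for the optimized create_callback_chain shape."""
    manager = CallbackManager()
    if callbacks:
        manager._callbacks.extend(callbacks)
    return manager


def test_input_list_is_not_aliased():
    input_list = ["cb1", "cb2"]
    manager = create_chain(input_list)
    # Mutating the caller's list afterwards must not change the manager.
    input_list.append("cb3")
    assert manager._callbacks == ["cb1", "cb2"]
    assert manager._callbacks is not input_list


test_input_list_is_not_aliased()
```

Because `extend()` copies the elements into the manager's own list, the bulk version keeps the same isolation from the caller's list that the loop version had.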

To edit these changes, run `git checkout codeflash/optimize-create_callback_chain-mglk2zhg` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 11, 2025 00:47
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 11, 2025