⚡️ Speed up function create_callback_chain by 897% #59
Open
codeflash-ai[bot] wants to merge 1 commit into main from
The optimization achieves an **897% speedup** through two key changes:
**1. Added `__slots__ = ("_callbacks",)` to WorkflowCallbacksManager**
This eliminates the instance `__dict__` overhead, reducing memory usage and improving attribute access speed for the `_callbacks` attribute.
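As a minimal sketch of what `__slots__` changes, the snippet below compares two hypothetical stand-in classes (`PlainManager` and `SlottedManager` are illustrative names, not the real `WorkflowCallbacksManager`). It assumes the class has no base class that itself creates a `__dict__`; if a base class omits `__slots__`, instances still get a `__dict__` and the saving does not apply.

```python
class PlainManager:
    """Hypothetical stand-in without __slots__ (not the real class)."""

    def __init__(self):
        self._callbacks = []


class SlottedManager:
    """Hypothetical stand-in that declares __slots__ as described above."""

    __slots__ = ("_callbacks",)

    def __init__(self):
        self._callbacks = []


plain = PlainManager()
slotted = SlottedManager()

# A regular instance carries a per-instance __dict__; a slotted one does not.
has_dict_plain = hasattr(plain, "__dict__")
has_dict_slotted = hasattr(slotted, "__dict__")

# The trade-off: attributes not listed in __slots__ can no longer be assigned.
try:
    slotted.extra = 1
    rejected = False
except AttributeError:
    rejected = True

print(has_dict_plain, has_dict_slotted, rejected)
```

The trade-off is worth noting in review: any code that sets ad hoc attributes on the manager would start raising `AttributeError` once `__slots__` is declared.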
**2. Replaced loop-based registration with bulk extension**
The original code used a loop to call `register()` for each callback:
```python
for callback in callbacks or []:
    manager.register(callback)  # 6,141 calls to append()
```
The optimized version directly extends the internal list when callbacks exist:
```python
if callbacks:
    manager._callbacks.extend(callbacks)  # Single bulk operation
```
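To check that the two variants produce the same result, here is a self-contained sketch. `CallbackManager`, `build_with_loop`, and `build_with_extend` are hypothetical stand-ins, and the sketch assumes `register()` does nothing beyond appending to `_callbacks`; if the real `register()` ever gains validation or deduplication, the bulk path would bypass it.

```python
class CallbackManager:
    """Hypothetical stand-in for WorkflowCallbacksManager."""

    __slots__ = ("_callbacks",)

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        # Assumption: registration is a plain append with no extra logic.
        self._callbacks.append(callback)


def build_with_loop(callbacks=None):
    """Original pattern: one register() call per callback."""
    manager = CallbackManager()
    for callback in callbacks or []:
        manager.register(callback)
    return manager


def build_with_extend(callbacks=None):
    """Optimized pattern: a single extend() on the internal list."""
    manager = CallbackManager()
    if callbacks:
        manager._callbacks.extend(callbacks)
    return manager


sample = [object() for _ in range(5)]
loop_result = build_with_loop(sample)._callbacks
extend_result = build_with_extend(sample)._callbacks

# Both paths should hold the same objects in the same order.
same_order = len(loop_result) == len(extend_result) and all(
    a is b for a, b in zip(loop_result, extend_result)
)
print(same_order)
```

Both builders also handle `None` and an empty list by returning a manager with no callbacks.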
**Why this is faster:**
- **Eliminates method call overhead**: Instead of 6,141+ calls to `register()` → `append()`, there's just one `extend()` call
- **Reduces loop iterations**: The original profiler shows the loop (`for callback in callbacks or []`) consumed 7.8% of runtime across 6,181 iterations
- **Bulk list operations**: `extend()` is optimized at the C level for adding multiple items efficiently
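The points above can be tried locally with a rough micro-benchmark. This is only a sketch: the helper names are hypothetical, it times plain list operations rather than the real `create_callback_chain`, and absolute numbers will vary by machine and Python version.

```python
import timeit


def append_in_loop(items):
    """Add items one at a time, as the loop-based registration did."""
    out = []
    for item in items:
        out.append(item)
    return out


def extend_once(items):
    """Add all items with a single bulk extend() call."""
    out = []
    out.extend(items)
    return out


items = list(range(1000))

# Take the best of several repeats to reduce noise from other processes.
loop_time = min(timeit.repeat(lambda: append_in_loop(items), number=200, repeat=5))
extend_time = min(timeit.repeat(lambda: extend_once(items), number=200, repeat=5))

print(f"loop + append: {loop_time:.6f}s")
print(f"single extend: {extend_time:.6f}s")
```

The loop here calls `append()` directly, so it understates the original cost, which also paid for a `register()` method call on every iteration.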
**Performance impact by test case:**
- **Large-scale tests see massive gains**: Tests with 1000+ callbacks show 1746-2938% speedup
- **Small-scale tests see modest gains**: Basic tests with few callbacks show 6-55% speedup
- **Edge cases benefit consistently**: All test scenarios show improvement, with the optimization being most effective when processing many callbacks at once
The optimization is particularly effective for workflows that register many callbacks simultaneously, which appears to be the common use case based on the test results.
📄 897% (8.97x) speedup for `create_callback_chain` in `graphrag/index/run/utils.py`
⏱️ Runtime: 266 microseconds → 26.7 microseconds (best of 102 runs)
📝 Explanation and details
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-create_callback_chain-mglk2zhg` and push.