@codeflash-ai codeflash-ai bot commented Feb 1, 2026

⚡️ This pull request contains optimizations for PR #1227

If you approve this dependent PR, these changes will be merged into the original PR branch limit-install-version.

This PR will be automatically closed if the original PR is merged.


📄 77% (0.77x) speedup for function_has_return_statement in codeflash/discovery/functions_to_optimize.py

⏱️ Runtime: 1.29 milliseconds → 725 microseconds (best of 41 runs)

📝 Explanation and details

The optimization achieves a 77% speedup (from 1.29ms to 725μs) by restructuring the depth-first search to check the most common locations for return statements first, avoiding unnecessary traversal overhead.

Key Optimizations

  1. Fast-path for top-level returns: The optimized version first scans function_node.body directly before initiating the full DFS. Since most functions with returns have them at the top level, this short-circuits the expensive ast.iter_child_nodes() calls in the majority of cases.

  2. Reduced stack initialization overhead: Instead of initializing the stack with [function_node] and then iterating over its children, the optimized code starts the stack with list(body), skipping the wrapper function node entirely. This saves one unnecessary iteration.

  3. Early empty-body check: By checking if not body upfront, the code avoids creating an empty stack and entering the while loop for functions with no statements.
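The diff itself is not reproduced on this page, so the following is only a minimal sketch of the three changes listed above, assuming the original was a plain stack-based DFS. The names `has_return_original` and `has_return_optimized` are hypothetical stand-ins, not the actual codeflash implementation.

```python
import ast


def has_return_original(function_node: ast.AST) -> bool:
    """Assumed baseline: stack-based DFS seeded with the function node itself."""
    stack = [function_node]
    while stack:
        node = stack.pop()
        if isinstance(node, ast.Return):
            return True
        stack.extend(ast.iter_child_nodes(node))
    return False


def has_return_optimized(function_node: ast.AST) -> bool:
    """Assumed optimized shape: scan the top-level body before any DFS."""
    body = getattr(function_node, "body", None)
    if not isinstance(body, list) or not body:
        # (3) Early exit when there is no statement list to search.
        # A bare ast.Return passed in directly is still reported as True.
        return isinstance(function_node, ast.Return)

    # (1) Fast path: a return at the top level needs no child iteration.
    for stmt in body:
        if isinstance(stmt, ast.Return):
            return True

    # (2) Seed the stack with the body, skipping the function node itself.
    stack = list(body)
    while stack:
        node = stack.pop()
        if isinstance(node, ast.Return):
            return True
        stack.extend(ast.iter_child_nodes(node))
    return False
```

Skipping the function node is safe in this sketch because its other children (arguments, decorators, annotations) are expressions, and an `ast.Return` can only appear in a statement list.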

Performance Impact by Test Pattern

The optimization excels when:

  • Return is at top-level (e.g., simple functions with direct returns): 300-500% faster - the fast-path loop finds the return immediately without DFS overhead
  • Return is early in a large function: 3,800-26,000% faster for functions with 100+ statements - avoids traversing all subsequent AST nodes
  • Functions without returns but minimal nesting: 10-20% faster - benefits from reduced stack initialization overhead

The optimization shows minimal or slight regression when:

  • Return is deeply nested (e.g., inside if/try/for blocks at level 2+): 0-5% slower - the fast-path check adds overhead before falling back to DFS
  • Very complex nested structures: ~4% slower - the additional top-level scan doesn't help when returns are buried deep
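One way to see why an early return in a large function benefits so much is to count how many nodes a LIFO stack pops before it reaches the `ast.Return`. The sketch below assumes the original traversal pushed children in `ast.iter_child_nodes()` order and popped from the end of the list; `nodes_popped_until_return` is a hypothetical helper written for this illustration.

```python
import ast


def nodes_popped_until_return(function_node: ast.AST) -> int:
    """Count nodes popped by a LIFO DFS up to and including the first ast.Return.

    Returns -1 when no Return exists anywhere in the subtree.
    """
    stack = [function_node]
    popped = 0
    while stack:
        node = stack.pop()
        popped += 1
        if isinstance(node, ast.Return):
            return popped
        stack.extend(ast.iter_child_nodes(node))
    return -1


# Same shape as the large-function tests: 100 assignments plus one return.
filler = "\n    ".join(f"x{i} = {i}" for i in range(100))
return_first = ast.parse(f"def foo():\n    return 1\n    {filler}").body[0]
return_last = ast.parse(f"def foo():\n    {filler}\n    return 1").body[0]

early = nodes_popped_until_return(return_first)
late = nodes_popped_until_return(return_last)
print(f"return first: {early} pops, return last: {late} pops")
```

Because the stack pops the most recently pushed child first, the last statement in the body is examined first, so under this assumption a return on the first line is reached only after every later statement has been fully expanded. A top-level scan in source order removes that worst case.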

Line Profiler Evidence

The key improvement is visible in the line profiler: ast.iter_child_nodes() was called 1,366 times (82.4% of runtime) in the original versus 679 times (73.2% of runtime) in the optimized version. That is roughly a 50% reduction in expensive child-node iterations, achieved because the fast path detects top-level returns before the full DFS begins.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 69 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
from __future__ import annotations

import ast  # used to parse Python source into AST nodes for testing
import textwrap  # used to dedent multi-line source strings for readability
from _ast import (  # types referenced by the function under test
    AsyncFunctionDef, FunctionDef)

# imports
import pytest  # used for our unit tests
from codeflash.discovery.functions_to_optimize import \
    function_has_return_statement

# unit tests

def _get_first_node(src: str):
    """
    Helper: parse source and return the first top-level AST node.
    We keep this small to avoid repeating the parse logic.
    """
    module = ast.parse(textwrap.dedent(src))
    return module.body[0]

def test_basic_function_with_return_value():
    # Basic case: a simple function containing 'return 1' should be detected.
    src = """
    def f():
        return 1
    """
    node = _get_first_node(src)  # get FunctionDef node
    # The function contains a Return node directly in its body -> True
    codeflash_output = function_has_return_statement(node) # 4.19μs -> 952ns (340% faster)

def test_basic_function_with_bare_return():
    # Basic case: a bare 'return' (no value) is still an ast.Return node.
    src = """
    def f():
        return
    """
    node = _get_first_node(src)
    codeflash_output = function_has_return_statement(node) # 4.13μs -> 941ns (339% faster)

def test_function_without_return():
    # Basic negative case: no return statements anywhere in the function -> False
    src = """
    def f():
        x = 10
        y = x + 5
        z = (a for a in range(3))  # generator expression, not a Return
    """
    node = _get_first_node(src)
    codeflash_output = function_has_return_statement(node) # 25.7μs -> 23.3μs (10.4% faster)

def test_return_in_if_branch_detected():
    # Edge: return nested in an if block should be found
    src = """
    def f(x):
        if x > 0:
            return x
        else:
            x = -x
    """
    node = _get_first_node(src)
    codeflash_output = function_has_return_statement(node) # 10.9μs -> 11.2μs (2.68% slower)

def test_return_in_try_except_else_finally_variants():
    # Edge: returns inside try/except/else/finally blocks should be found
    variants = [
        # return in try
        "def f():\n    try:\n        return 1\n    finally:\n        pass\n",
        # return in except
        "def f():\n    try:\n        1/0\n    except Exception:\n        return 2\n",
        # return in else
        "def f():\n    try:\n        x = 1\n    except Exception:\n        pass\n    else:\n        return 3\n",
        # return in finally (even though finally always runs)
        "def f():\n    try:\n        pass\n    finally:\n        return 4\n",
    ]
    for src in variants:
        node = _get_first_node(src)
        codeflash_output = function_has_return_statement(node) # 19.6μs -> 15.5μs (26.5% faster)

def test_return_in_nested_function_counts_due_to_dfs():
    # Important behavioral edge: the implementation does a DFS over the entire subtree.
    # A Return in a nested inner function will be seen by this implementation,
    # so this test documents and asserts that behavior.
    src = """
    def outer():
        x = 1
        def inner():
            return 42
        y = x + 2
    """
    node = _get_first_node(src)
    # The Return exists in inner(), and function_has_return_statement will find it -> True
    codeflash_output = function_has_return_statement(node) # 12.4μs -> 10.9μs (13.7% faster)

def test_return_in_inner_class_method_counts():
    # Another DFS consequence: returns inside methods of a nested class are found.
    src = """
    def outer():
        class C:
            def method(self):
                return 'hi'
        inst = C()
    """
    node = _get_first_node(src)
    codeflash_output = function_has_return_statement(node) # 12.3μs -> 12.0μs (2.60% faster)

def test_async_function_with_return():
    # AsyncFunctionDef is handled equally; async def containing return should be detected.
    src = """
    async def af():
        return 'async'
    """
    node = _get_first_node(src)
    # The function contains a Return -> True
    codeflash_output = function_has_return_statement(node) # 3.99μs -> 922ns (332% faster)

def test_lambda_does_not_create_return_nodes():
    # Lambdas do not use ast.Return nodes; ensure a function containing only a lambda is False.
    src = """
    def f():
        func = lambda x: x + 1
        return_value = (x for x in range(2))
    """
    node = _get_first_node(src)
    # There is no ast.Return in the function body (the 'lambda' does not create one),
    # so the function should report False.
    codeflash_output = function_has_return_statement(node) # 26.7μs -> 24.9μs (7.19% faster)

def test_generator_function_with_yield_not_counted_as_return():
    # yield statements are ast.Yield / ast.YieldFrom, not ast.Return.
    # A generator function that only yields should therefore return False.
    src = """
    def gen():
        for i in range(3):
            yield i
    """
    node = _get_first_node(src)
    codeflash_output = function_has_return_statement(node) # 15.6μs -> 12.8μs (22.2% faster)

def test_bare_return_vs_return_value_both_detected():
    # Ensure both 'return' (no value) and 'return expr' are ast.Return nodes and detected.
    src1 = "def a():\n    return\n"
    src2 = "def b():\n    return 999\n"
    n1 = _get_first_node(src1)
    n2 = _get_first_node(src2)
    codeflash_output = function_has_return_statement(n1) # 3.84μs -> 931ns (312% faster)
    codeflash_output = function_has_return_statement(n2) # 2.19μs -> 451ns (386% faster)

def test_passing_return_node_directly_also_returns_true():
    # The implementation accepts an AST node and will immediately return True if the node itself
    # is an ast.Return (since it is placed on the initial stack). This asserts that behavior.
    ret_node = ast.parse("x = 1\nreturn 2", mode="exec").body[1]  # second statement is a Return
    # Passing the Return node itself should produce True
    codeflash_output = function_has_return_statement(ret_node) # 731ns -> 461ns (58.6% faster)

def test_multiple_returns_in_different_branches_detected():
    # Function with multiple return statements in different branches -> True
    src = """
    def f(x):
        if x == 0:
            return 0
        for i in range(x):
            if i == 5:
                return i
        try:
            pass
        except:
            return -1
    """
    node = _get_first_node(src)
    codeflash_output = function_has_return_statement(node) # 8.10μs -> 6.51μs (24.3% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import ast
from _ast import AsyncFunctionDef, FunctionDef

# imports
import pytest
from codeflash.discovery.functions_to_optimize import \
    function_has_return_statement

def test_simple_function_with_return():
    """Test a basic function with a simple return statement."""
    code = "def foo():\n    return 42"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.09μs -> 981ns (317% faster)

def test_simple_function_without_return():
    """Test a basic function without any return statement."""
    code = "def foo():\n    x = 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 10.9μs -> 7.51μs (45.1% faster)

def test_function_with_return_none():
    """Test a function with explicit return None."""
    code = "def foo():\n    return None"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.89μs -> 882ns (341% faster)

def test_function_with_return_expression():
    """Test a function with return of a complex expression."""
    code = "def foo():\n    return x + y * 2"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.93μs -> 882ns (345% faster)

def test_async_function_with_return():
    """Test an async function with a return statement."""
    code = "async def foo():\n    return 42"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.28μs -> 881ns (386% faster)

def test_async_function_without_return():
    """Test an async function without return."""
    code = "async def foo():\n    await something()"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 10.7μs -> 7.68μs (39.6% faster)

def test_function_with_return_in_if():
    """Test a function with return inside an if block."""
    code = "def foo(x):\n    if x > 0:\n        return x"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.53μs -> 4.30μs (28.7% faster)

def test_function_with_return_in_else():
    """Test a function with return inside an else block."""
    code = "def foo(x):\n    if x > 0:\n        pass\n    else:\n        return x"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.56μs -> 4.19μs (32.8% faster)

def test_function_with_return_in_for_loop():
    """Test a function with return inside a for loop."""
    code = "def foo(items):\n    for item in items:\n        return item"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.95μs -> 4.61μs (29.1% faster)

def test_function_with_return_in_while_loop():
    """Test a function with return inside a while loop."""
    code = "def foo():\n    while True:\n        return 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.17μs -> 4.08μs (26.8% faster)

def test_function_with_return_in_try_except():
    """Test a function with return in try block."""
    code = "def foo():\n    try:\n        return 1\n    except:\n        pass"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 7.60μs -> 7.25μs (4.82% faster)

def test_function_with_return_in_except():
    """Test a function with return in except block."""
    code = "def foo():\n    try:\n        x = 1\n    except:\n        return 2"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 7.17μs -> 6.04μs (18.7% faster)

def test_function_with_return_in_finally():
    """Test a function with return in finally block."""
    code = "def foo():\n    try:\n        x = 1\n    finally:\n        return 2"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.81μs -> 4.31μs (34.9% faster)

def test_function_with_return_in_with_statement():
    """Test a function with return inside a with statement."""
    code = "def foo():\n    with open('file') as f:\n        return f.read()"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.82μs -> 4.54μs (28.2% faster)

def test_function_with_multiple_returns():
    """Test a function with multiple return statements."""
    code = "def foo(x):\n    if x:\n        return 1\n    else:\n        return 2"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.55μs -> 4.33μs (28.2% faster)

def test_function_with_nested_if_statements():
    """Test a function with deeply nested if statements and return."""
    code = "def foo(x):\n    if x:\n        if x > 0:\n            return 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 6.33μs -> 5.52μs (14.7% faster)

def test_function_with_nested_functions_no_outer_return():
    """Test a function with nested function definition but no return in outer function."""
    code = "def foo():\n    def bar():\n        return 1\n    x = 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    # The function should still find the return in the nested function
    # because it traverses all child nodes
    codeflash_output = function_has_return_statement(func_node) # 9.67μs -> 9.07μs (6.63% faster)

def test_function_with_nested_functions_with_outer_return():
    """Test a function with nested function and return in outer function."""
    code = "def foo():\n    def bar():\n        return 1\n    return 2"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.98μs -> 1.00μs (297% faster)

def test_empty_function():
    """Test an empty function (pass statement)."""
    code = "def foo():\n    pass"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 7.06μs -> 3.28μs (116% faster)

def test_function_with_only_docstring():
    """Test a function with only a docstring."""
    code = 'def foo():\n    """This is a docstring"""'
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 7.84μs -> 4.46μs (76.0% faster)

def test_function_with_docstring_and_return():
    """Test a function with docstring and return statement."""
    code = 'def foo():\n    """Docstring"""\n    return 1'
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.92μs -> 1.02μs (283% faster)

def test_function_with_decorator():
    """Test that decorators don't affect return detection."""
    code = "@decorator\ndef foo():\n    return 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.59μs -> 901ns (520% faster)

def test_function_with_type_annotations():
    """Test a function with type annotations."""
    code = "def foo(x: int) -> int:\n    return x + 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.48μs -> 942ns (482% faster)

def test_function_with_default_arguments():
    """Test a function with default arguments."""
    code = "def foo(x=1, y=2):\n    return x + y"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.87μs -> 922ns (319% faster)

def test_function_with_varargs():
    """Test a function with *args and **kwargs."""
    code = "def foo(*args, **kwargs):\n    return args[0]"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.97μs -> 962ns (312% faster)

def test_function_with_list_comprehension():
    """Test a function with list comprehension but no return."""
    code = "def foo():\n    [x for x in range(10)]"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 15.8μs -> 13.1μs (20.0% faster)

def test_function_with_dict_comprehension():
    """Test a function with dict comprehension but no return."""
    code = "def foo():\n    {x: x*2 for x in range(10)}"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 19.2μs -> 16.8μs (14.1% faster)

def test_function_with_generator_expression():
    """Test a function with generator expression but no return."""
    code = "def foo():\n    (x for x in range(10))"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 15.4μs -> 12.6μs (22.1% faster)

def test_function_with_yield():
    """Test a generator function with yield (not return)."""
    code = "def foo():\n    yield 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    # yield is not a return statement, so this should be False
    codeflash_output = function_has_return_statement(func_node) # 9.10μs -> 5.21μs (74.6% faster)

def test_function_with_yield_and_return():
    """Test a generator function with both yield and return."""
    code = "def foo():\n    yield 1\n    return"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.96μs -> 1.04μs (280% faster)

def test_function_with_return_in_lambda():
    """Test a function containing a lambda (lambdas can have implicit returns)."""
    code = "def foo():\n    f = lambda x: x + 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    # The lambda body is not a return statement, it's an expression
    codeflash_output = function_has_return_statement(func_node) # 16.7μs -> 14.3μs (16.4% faster)

def test_function_with_class_definition():
    """Test a function with inner class definition."""
    # A 'return' placed directly in a class body is invalid Python, so the
    # return lives in a method of the nested class instead.
    code = "def foo():\n    class Bar:\n        def baz(self):\n            return 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    # The return is in a method of a nested class, so the traversal should find it
    codeflash_output = function_has_return_statement(func_node)

def test_function_with_return_in_elif():
    """Test a function with return in elif block."""
    code = "def foo(x):\n    if x < 0:\n        pass\n    elif x == 0:\n        return 0\n    else:\n        return 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 6.63μs -> 5.74μs (15.5% faster)

def test_function_with_boolean_operators():
    """Test a function with boolean operators in return."""
    code = "def foo(x, y):\n    return x and y or z"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.01μs -> 972ns (312% faster)

def test_function_with_ternary_operator():
    """Test a function with ternary operator in return."""
    code = "def foo(x):\n    return 1 if x else 2"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.90μs -> 932ns (318% faster)

def test_function_with_function_call_in_return():
    """Test a function with function call in return."""
    code = "def foo():\n    return some_function(arg1, arg2)"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.95μs -> 902ns (338% faster)

def test_function_with_return_in_match_statement():
    """Test a function with return in match statement (Python 3.10+)."""
    code = "def foo(x):\n    match x:\n        case 0:\n            return 'zero'\n        case _:\n            return 'other'"
    try:
        tree = ast.parse(code)
        func_node = tree.body[0]
        codeflash_output = function_has_return_statement(func_node)
    except SyntaxError:
        # Python < 3.10 doesn't support match statements
        pass

def test_function_with_return_statement_empty():
    """Test a function with bare return (no value)."""
    code = "def foo():\n    if True:\n        return"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.39μs -> 4.09μs (31.9% faster)

def test_function_with_multiple_nested_scopes():
    """Test a function with multiple nested scopes."""
    code = """def foo():
    if True:
        for i in range(10):
            while True:
                try:
                    with open('file') as f:
                        return 1
                except:
                    pass"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 13.2μs -> 13.2μs (0.152% faster)

def test_function_with_no_return_deeply_nested():
    """Test a deeply nested function without return."""
    code = """def foo():
    if True:
        for i in range(10):
            while True:
                try:
                    x = 1
                except:
                    pass"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 22.9μs -> 21.4μs (7.36% faster)

def test_function_with_large_number_of_statements():
    """Test a function with a large number of sequential statements."""
    statements = "\n    ".join([f"x{i} = {i}" for i in range(100)])
    code = f"def foo():\n    {statements}\n    return x99"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 10.6μs -> 5.93μs (78.9% faster)

def test_function_with_large_number_of_statements_no_return():
    """Test a function with many statements but no return."""
    statements = "\n    ".join([f"x{i} = {i}" for i in range(100)])
    code = f"def foo():\n    {statements}"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 278μs -> 269μs (3.48% faster)

def test_function_with_many_nested_functions():
    """Test a function with many nested function definitions."""
    nested_defs = "\n".join([f"    def nested_{i}():\n        pass" for i in range(50)])
    code = f"def foo():\n{nested_defs}\n    return 1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 7.57μs -> 3.72μs (104% faster)

def test_function_with_large_if_elif_chain():
    """Test a function with a large if-elif-else chain."""
    elif_chain = "\n    ".join([f"elif x == {i}:\n        return {i}" for i in range(1, 50)])
    code = f"def foo(x):\n    if x == 0:\n        return 0\n    {elif_chain}\n    else:\n        return -1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 54.7μs -> 54.1μs (1.09% faster)

def test_function_with_large_try_except_chain():
    """Test a function with multiple except blocks."""
    except_blocks = "\n    ".join([f"except ValueError:\n        return {i}" for i in range(1, 30)])
    code = f"""def foo():
    try:
        return 0
    {except_blocks}"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 9.38μs -> 8.27μs (13.5% faster)

def test_function_with_complex_nested_structure():
    """Test a function with complex nested structure combining loops, conditions, and try-except."""
    code = """def foo():
    for i in range(10):
        if i % 2 == 0:
            try:
                for j in range(10):
                    if j > 5:
                        return i * j
            except:
                for k in range(10):
                    pass"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 18.9μs -> 19.7μs (4.31% slower)

def test_function_with_large_list_of_nested_ifs():
    """Test performance with many sequential if blocks."""
    if_blocks = "\n    ".join([f"if x == {i}:\n        y = {i}" for i in range(50)])
    code = f"def foo(x):\n    {if_blocks}\n    return y"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 7.62μs -> 3.63μs (110% faster)

def test_function_with_return_at_end_of_large_function():
    """Test that return statement is found even at the end of a large function."""
    statements = "\n    ".join([f"x{i} = {i}" for i in range(100)])
    code = f"def foo():\n    {statements}\n    return sum([x0, x1, x2])"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 10.7μs -> 5.98μs (78.6% faster)

def test_function_with_return_at_beginning_of_large_function():
    """Test that return statement is found early in a large function."""
    statements = "\n    ".join([f"x{i} = {i}" for i in range(100)])
    code = f"def foo():\n    return 1\n    {statements}"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 274μs -> 1.04μs (26248% faster)

def test_function_with_return_in_middle_of_large_function():
    """Test that return statement is found in the middle of a large function."""
    before = "\n    ".join([f"x{i} = {i}" for i in range(50)])
    after = "\n    ".join([f"y{i} = {i}" for i in range(50, 100)])
    code = f"def foo():\n    {before}\n    return 1\n    {after}"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 141μs -> 3.62μs (3798% faster)

def test_async_function_with_large_nested_structure():
    """Test async function with large nested structure."""
    code = """async def foo():
    for i in range(20):
        if i % 2 == 0:
            try:
                await something()
                return i
            except:
                pass"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 11.1μs -> 10.6μs (4.82% faster)

def test_function_with_many_returns():
    """Test function with many return statements in different branches."""
    returns = "\n    ".join([f"if x == {i}:\n        return {i}" for i in range(50)])
    code = f"def foo(x):\n    {returns}\n    return -1"
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 7.40μs -> 3.56μs (108% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-pr1227-2026-02-01T14.30.56, commit your edits, and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Feb 1, 2026
Base automatically changed from limit-install-version to main February 1, 2026 15:44