⚡️ Speed up function `_find_class_node` by 14% in PR #1199 (`omni-java`) #1252

codeflash-ai · 2026-02-02T00:29:49Z

⚡️ This pull request contains optimizations for PR #1199

If you approve this dependent PR, these changes will be merged into the original PR branch omni-java.

This PR will be automatically closed if the original PR is merged.

📄 14% (0.14x) speedup for `_find_class_node` in `codeflash/languages/java/context.py`

⏱️ Runtime : 12.3 microseconds → 10.8 microseconds (best of 91 runs)

📝 Explanation and details

The optimized code achieves a 14% runtime improvement by eliminating redundant work in a recursive function that traverses abstract syntax trees.

Key Optimization:

The primary performance gain comes from moving the type_declarations dictionary to module-level as _TYPE_DECLARATIONS. In the original code, this dictionary was recreated on every recursive call (622 times based on profiler data), consuming ~36% of the function's runtime (lines allocating the dictionary took 8.8% + 6.2% + 6.4% + 5.8% = 27.2% combined). By creating it once at module load time, this overhead is completely eliminated.

Additional Micro-optimization:

The code also caches node.type in a local variable node_type before the dictionary lookup. While this provides minimal benefit (~1-2% based on profiler differences), it slightly reduces attribute access overhead in the hot path where node.type would otherwise be accessed twice (once for the in check, once for the dictionary lookup on match).

Why This Works:

The function performs recursive tree traversal, visiting each node exactly once. Since the type_declarations mapping is constant, recreating it 622 times (once per node visited) is pure waste. Python dictionary creation, even for small dictionaries, involves memory allocation and hash table setup - overhead that compounds significantly in recursive scenarios.

Test Case Performance:

The optimization shows consistent improvements across all test cases (7-20% faster), with the most significant gains in simpler cases like test_basic_single_class_found (19.8% faster) and test_missing_name_field_does_not_crash_and_returns_none (16.4% faster). These cases benefit most because a higher percentage of their runtime was spent on dictionary creation relative to other operations. The UTF-8 test case shows smaller gains (11%) because more time is spent in string decoding operations.

Impact:

This optimization is particularly valuable when _find_type_node (or its wrapper _find_class_node) is called frequently on large ASTs, as the savings multiply with tree size and call frequency. The function appears to be used for locating Java type declarations in parsed source code - a common operation in code analysis tools that could be invoked many times during batch processing.

✅ Correctness verification report:

Test	Status
⚙️ Existing Unit Tests	🔘 None Found
🌀 Generated Regression Tests	✅ 9 Passed
⏪ Replay Tests	🔘 None Found
🔎 Concolic Coverage Tests	🔘 None Found
📊 Tests Coverage	100.0%

🌀 Click to see Generated Regression Tests

from __future__ import annotations

from typing import Optional

# imports
import pytest  # used for our unit tests
from codeflash.languages.java.context import _find_class_node
from tree_sitter import Node

# Note: We preserve the exact original function implementation above.
# The tests below will exercise that implementation using a controlled Node-like structure.

# --- Test helpers (lightweight Node-like objects) ---
# We create simple objects that mimic the interface used by the function under test:
# the code accesses node.type, node.children (iterable), and node.child_by_field_name("name").
# These helper classes are not intended to replace real domain classes elsewhere in the codebase;
# they only provide the small API surface required by the function under test.

class NameNode:
    """A minimal object representing a name token with start/end byte offsets."""
    def __init__(self, start_byte: int, end_byte: int):
        self.start_byte = start_byte
        self.end_byte = end_byte
        # children attribute present so recursion can iterate without AttributeError
        self.children = []

    def child_by_field_name(self, name: str):
        # Name nodes do not have further named children in our test scaffolding.
        return None

class SimpleNode:
    """A lightweight stand-in for tree_sitter.Node that supports the needed API.

    Attributes:
        type: the node.type string (e.g., "class_declaration", "interface_declaration", ...)
        children: list of child nodes (SimpleNode or NameNode)
        _fields: dict mapping field names -> node (used to implement child_by_field_name)
    """
    def __init__(self, type: str, children: Optional[list] = None, fields: Optional[dict] = None):
        self.type = type
        self.children = children if children is not None else []
        self._fields = fields if fields is not None else {}

    def child_by_field_name(self, name: str):
        # Return the node bound to 'name' if any; otherwise None
        return self._fields.get(name)

def test_basic_single_class_found():
    # Basic scenario: a single class declaration at the root with a name that should be found.
    # Construct source bytes and a name node that points to the bytes for 'Foo'.
    source = b'class Foo {}'
    # 'Foo' starts at byte offset 6 and ends at 9 in the bytes above.
    name_node = NameNode(start_byte=6, end_byte=9)
    # Create a class_declaration node that exposes the name under the "name" field.
    class_node = SimpleNode("class_declaration", children=[name_node], fields={"name": name_node})
    # Root is some wrapper containing the class declaration as a child.
    root = SimpleNode("program", children=[class_node])

    # Call the function under test. It should return the exact class_node object.
    codeflash_output = _find_class_node(root, "Foo", source); found = codeflash_output # 2.79μs -> 2.32μs (19.8% faster)

def test_interface_and_enum_nodes_are_returned_by_name_search():
    # The implementation searches for class, interface, and enum nodes.
    # Ensure that searching for an interface name returns the interface node.
    source = b'interface ITest {} enum ETest {}'
    i_name = NameNode(start_byte=10, end_byte=15)  # 'ITest'
    e_name = NameNode(start_byte=25, end_byte=30)  # 'ETest'
    interface_node = SimpleNode("interface_declaration", children=[i_name], fields={"name": i_name})
    enum_node = SimpleNode("enum_declaration", children=[e_name], fields={"name": e_name})
    root = SimpleNode("program", children=[interface_node, enum_node])

    codeflash_output = _find_class_node(root, "ITest", source); found_i = codeflash_output

    codeflash_output = _find_class_node(root, "ETest", source); found_e = codeflash_output

def test_missing_name_field_does_not_crash_and_returns_none():
    # Edge case: a class_declaration exists but has no 'name' child. Searching for any name should skip it.
    source = b'class {}'  # no name
    class_node = SimpleNode("class_declaration", children=[], fields={})
    root = SimpleNode("program", children=[class_node])

    codeflash_output = _find_class_node(root, "Anything", source); found = codeflash_output # 1.92μs -> 1.65μs (16.4% faster)

def test_multiple_same_named_nodes_returns_first_match_preorder():
    # When multiple matching declarations with the same name exist, the function
    # should return the first encountered in a depth-first (preorder) traversal.
    source = b'class A {} class A {}'
    # First 'A' at bytes 6:7:8 - offsets chosen arbitrarily to simulate positions
    first_name = NameNode(start_byte=6, end_byte=7)
    second_name = NameNode(start_byte=16, end_byte=17)
    first_class = SimpleNode("class_declaration", children=[first_name], fields={"name": first_name})
    second_class = SimpleNode("class_declaration", children=[second_name], fields={"name": second_name})
    root = SimpleNode("program", children=[first_class, second_class])

    codeflash_output = _find_class_node(root, "A", source); found = codeflash_output # 2.54μs -> 2.17μs (16.6% faster)

def test_utf8_name_handling():
    # Ensure decoding with UTF-8 works (non-ASCII characters in type names).
    # Use a name that includes a non-ASCII character, e.g. 'Café'
    name_text = "Caf\u00e9"  # Café
    source = ("class " + name_text + " {}").encode("utf8")
    # find byte offsets of the name within the encoded bytes
    start = source.index(name_text.encode("utf8"))
    end = start + len(name_text.encode("utf8"))
    name_node = NameNode(start_byte=start, end_byte=end)
    class_node = SimpleNode("class_declaration", children=[name_node], fields={"name": name_node})
    root = SimpleNode("program", children=[class_node])

    codeflash_output = _find_class_node(root, name_text, source); found = codeflash_output # 3.04μs -> 2.73μs (11.0% faster)

def test_large_scale_tree_find_and_not_found():
    # Large-scale-ish test: build a tree with many sibling/child nodes (well under 1000)
    # and verify search still finds the target and properly returns None when missing.
    source_parts = []
    children = []

    # Build 300 dummy nodes named 'N0', 'N1', ..., each represented just as non-matching nodes.
    # We will place the one matching class near the end to ensure full traversal is required.
    for i in range(300):
        # Add a dummy node of some other type that will not be matched.
        dummy = SimpleNode("expression", children=[])
        children.append(dummy)
        source_parts.append(f"/*{i}*/")

    # Add the target class near the end.
    target_name = "TargetClass"
    prefix = "class "
    suffix = " {}"
    # Build source bytes so the name appears at a calculable position
    # concatenate prefix + name + suffix and append to parts
    source_parts.append(prefix + target_name + suffix)
    # Compute combined source bytes for slicing by aggregating all parts
    combined_source = "".join(source_parts).encode("utf8")

    # Determine byte offset where TargetClass name begins
    combined_text = "".join(source_parts)
    name_start = combined_text.index(target_name)
    name_end = name_start + len(target_name.encode("utf8"))

    # Create nodes: wrap many dummies and then the class node
    target_name_node = NameNode(start_byte=name_start, end_byte=name_end)
    target_class_node = SimpleNode("class_declaration", children=[target_name_node], fields={"name": target_name_node})
    # Put many dummy nodes as children of root, then the target_class_node
    root_children = children + [target_class_node]
    root = SimpleNode("program", children=root_children)

    # Ensure the function can find the target in a larger structure.
    codeflash_output = _find_class_node(root, target_name, combined_source); found = codeflash_output

    # Also test that searching for a non-existent name returns None on the same large tree
    codeflash_output = _find_class_node(root, "NoSuchClass", combined_source); not_found = codeflash_output

def test_root_node_is_target_and_children_are_not_searched_unnecessarily():
    # If root itself is a matching class_declaration, the function should return it immediately
    # without searching other children for the same name.
    # Construct source where root is the class.
    source = b'class Root {} class Other {}'
    root_name = NameNode(start_byte=6, end_byte=10)  # 'Root'
    root_class = SimpleNode("class_declaration", children=[root_name], fields={"name": root_name})
    # Even if children contain another class with the same name, the root match should be preferred
    child_name = NameNode(start_byte=18, end_byte=23)  # 'Other'
    child_class = SimpleNode("class_declaration", children=[child_name], fields={"name": child_name})
    # Make the root itself the node passed in (not a wrapper 'program') to simulate direct match.
    root_class.children.append(child_class)  # child present but should not override root match
    codeflash_output = _find_class_node(root_class, "Root", source); found = codeflash_output # 2.04μs -> 1.90μs (7.41% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-pr1199-2026-02-02T00.29.43 and push.

The optimized code achieves a **14% runtime improvement** by eliminating redundant work in a recursive function that traverses abstract syntax trees. **Key Optimization:** The primary performance gain comes from moving the `type_declarations` dictionary to module-level as `_TYPE_DECLARATIONS`. In the original code, this dictionary was recreated on every recursive call (622 times based on profiler data), consuming ~36% of the function's runtime (lines allocating the dictionary took 8.8% + 6.2% + 6.4% + 5.8% = 27.2% combined). By creating it once at module load time, this overhead is completely eliminated. **Additional Micro-optimization:** The code also caches `node.type` in a local variable `node_type` before the dictionary lookup. While this provides minimal benefit (~1-2% based on profiler differences), it slightly reduces attribute access overhead in the hot path where `node.type` would otherwise be accessed twice (once for the `in` check, once for the dictionary lookup on match). **Why This Works:** The function performs recursive tree traversal, visiting each node exactly once. Since the type_declarations mapping is constant, recreating it 622 times (once per node visited) is pure waste. Python dictionary creation, even for small dictionaries, involves memory allocation and hash table setup - overhead that compounds significantly in recursive scenarios. **Test Case Performance:** The optimization shows consistent improvements across all test cases (7-20% faster), with the most significant gains in simpler cases like `test_basic_single_class_found` (19.8% faster) and `test_missing_name_field_does_not_crash_and_returns_none` (16.4% faster). These cases benefit most because a higher percentage of their runtime was spent on dictionary creation relative to other operations. The UTF-8 test case shows smaller gains (11%) because more time is spent in string decoding operations. **Impact:** This optimization is particularly valuable when `_find_type_node` (or its wrapper `_find_class_node`) is called frequently on large ASTs, as the savings multiply with tree size and call frequency. The function appears to be used for locating Java type declarations in parsed source code - a common operation in code analysis tools that could be invoked many times during batch processing.

codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Feb 2, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

⚡️ Speed up function `_find_class_node` by 14% in PR #1199 (`omni-java`) #1252

⚡️ Speed up function `_find_class_node` by 14% in PR #1199 (`omni-java`) #1252

Uh oh!

codeflash-ai bot commented Feb 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

⚡️ Speed up function _find_class_node by 14% in PR #1199 (omni-java) #1252

Are you sure you want to change the base?

⚡️ Speed up function _find_class_node by 14% in PR #1199 (omni-java) #1252

Uh oh!

Conversation

codeflash-ai bot commented Feb 2, 2026

⚡️ This pull request contains optimizations for PR #1199

📄 14% (0.14x) speedup for _find_class_node in codeflash/languages/java/context.py

📝 Explanation and details

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

⚡️ Speed up function `_find_class_node` by 14% in PR #1199 (`omni-java`) #1252

⚡️ Speed up function `_find_class_node` by 14% in PR #1199 (`omni-java`) #1252

📄 14% (0.14x) speedup for `_find_class_node` in `codeflash/languages/java/context.py`