Optimize breaking_changes_checker performance for large packages by Copilot · Pull Request #45937 · Azure/azure-sdk-for-python

Copilot · 2026-03-26T13:00:59Z

The changelog generator is prohibitively slow on large packages like azure-mgmt-network due to redundant file I/O/AST parsing, quadratic algorithms in post-processing, and unnecessary deep copies.

AST caching and early exit

Module-level AST cache (_ast_cache/_get_parsed_module): get_properties() and get_overloads() both parse the same source files per class per base class. Now parsed once, cached by path, shared across both call sites.
Exception-based early exit in ClassTreeAnalyzer: raises _ClassNodeFound to break out of the full ast.NodeVisitor traversal immediately upon finding the target class, rather than continuing to walk the entire tree.

Algorithm fixes

run_async_cleanup(): O(n²m) → O(n). Replaced nested loop + list.remove() with a set of non-aio change keys and a single-pass list comprehension. Handles unhashable types (lists in parameter ordering tuples) via recursive _make_hashable().

# Before: O(n * m * m) — nested scan + linear removal
for change in non_aio_changes:
    for c in changes_list:
        if "aio" in c[2] and change[1] == c[1] and change[3:] == c[3:]:
            changes_list.remove(c)  # O(n)

# After: O(n) — set lookup + filter
non_aio_keys = {_make_key(bc) for bc in changes_list if "aio" not in bc[2]}
changes_list[:] = [bc for bc in changes_list if "aio" not in bc[2] or _make_key(bc) not in non_aio_keys]

get_reportable_changes(): eliminated deepcopy + list.remove(). Single-pass filter builds a new list directly. Suppression objects pre-created once instead of per-change. re.findall() → re.search().

Performance tests

14 new tests in test_performance.py covering correctness and benchmarks:

50K-change async cleanup completes in <2s
5K changes with 100 ignore rules filtered in <1s
Full ChangelogTracker run on large synthetic API surface in <5s
AST cache hit verification and 200-lookup benchmark
Early exit speedup measurement (first-class vs last-class in 500-class file)

… exit, and O(n) algorithms Key optimizations: 1. Add AST module cache (_ast_cache, _get_parsed_module) to avoid redundant file I/O and AST parsing in get_properties() and get_overloads() 2. Use exception-based early exit in ClassTreeAnalyzer to stop AST traversal once the target class is found 3. Replace O(n*m) run_async_cleanup() with O(n) set-based approach 4. Replace deepcopy + list.remove() in get_reportable_changes() with single-pass filtering - eliminates expensive deep copy and O(n^2) removal 5. Use re.search() instead of re.findall() for ignore rule matching 6. Add comprehensive performance tests demonstrating the optimizations Co-authored-by: msyyc <70930885+msyyc@users.noreply.github.com> Agent-Logs-Url: https://github.com/Azure/azure-sdk-for-python/sessions/328bed23-3c8d-44f6-bea0-a24571d01b8d

…hable handling Co-authored-by: msyyc <70930885+msyyc@users.noreply.github.com> Agent-Logs-Url: https://github.com/Azure/azure-sdk-for-python/sessions/328bed23-3c8d-44f6-bea0-a24571d01b8d

…rformance

Copilot

Pull request overview

Improves breaking_changes_checker runtime on large Azure SDK packages by reducing redundant AST parsing and optimizing post-processing algorithms that previously had quadratic behavior.

Changes:

Added AST parsing cache and an early-exit mechanism for locating class nodes during AST traversal.
Optimized run_async_cleanup() and get_reportable_changes() to avoid list.remove()/deepcopy-driven quadratic patterns.
Added a new test_performance.py test module covering correctness and performance/benchmark scenarios.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File	Description
`scripts/breaking_changes_checker/detect_breaking_changes.py`	Adds `_ast_cache` + cached parsing helpers and exception-based early exit in `ClassTreeAnalyzer`; clears cache at start of report generation.
`scripts/breaking_changes_checker/breaking_changes_tracker.py`	Reworks async cleanup and ignore-rule filtering to single-pass/set-based implementations.
`scripts/breaking_changes_checker/tests/test_performance.py`	Introduces new performance-focused tests/benchmarks and cache/early-exit verification.

scripts/breaking_changes_checker/tests/test_performance.py

scripts/breaking_changes_checker/breaking_changes_tracker.py

scripts/breaking_changes_checker/tests/test_performance.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…rformance

catalinaperalta

I think overall these look like good improvements to the tool, thanks for working on this! My main concern is that there are some things that are changing the behavior and while they seem correct, I wonder if things will begin breaking due to these changes. To build confidence, have you done a run through of the changelog generation for more mgmt plane libraries to check how it behaves in different scenarios?

catalinaperalta · 2026-03-28T00:48:12Z

scripts/breaking_changes_checker/breaking_changes_tracker.py

+        # Pre-create Suppression objects once instead of recreating per change
+        suppressions = [Suppression(*rule) for rule in ignored]
+
+        # Filter out ignored changes in a single pass instead of deepcopy + list.remove()


I recall there were issues without the deepcopy at some point. How many real packages have you tested this code with?

catalinaperalta · 2026-03-28T00:48:47Z

scripts/breaking_changes_checker/breaking_changes_tracker.py

+            should_keep = True
+            for suppression in suppressions:
                if suppression.parameter_or_property_name is not None:
-                    # If the ignore rule is for a property or parameter, we should check up to that level on the original change


Nit: this command was helpful to understand what we're doing

Copilot AI and others added 2 commits March 26, 2026 08:01

Address code review feedback: improve docstrings, type hints, and has…

3901ba4

…hable handling Co-authored-by: msyyc <70930885+msyyc@users.noreply.github.com> Agent-Logs-Url: https://github.com/Azure/azure-sdk-for-python/sessions/328bed23-3c8d-44f6-bea0-a24571d01b8d

Copilot AI assigned Copilot and msyyc Mar 26, 2026

Copilot created this pull request from a session on behalf of msyyc March 26, 2026 13:01 View session

msyyc marked this pull request as ready for review March 27, 2026 05:22

Merge branch 'main' into copilot/optimize-breaking-changes-checker-pe…

0bba13b

…rformance

msyyc requested a review from catalinaperalta as a code owner March 27, 2026 05:22

Copilot AI review requested due to automatic review settings March 27, 2026 05:22

Copilot started reviewing on behalf of msyyc March 27, 2026 05:24 View session

Copilot AI reviewed Mar 27, 2026

View reviewed changes

msyyc and others added 4 commits March 27, 2026 15:01

Update scripts/breaking_changes_checker/tests/test_performance.py

925e083

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update scripts/breaking_changes_checker/tests/test_performance.py

de7e115

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Merge branch 'main' into copilot/optimize-breaking-changes-checker-pe…

e828338

…rformance

update changelog level

046ba0c

catalinaperalta reviewed Mar 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize breaking_changes_checker performance for large packages#45937

Optimize breaking_changes_checker performance for large packages#45937
Copilot wants to merge 7 commits intomainfrom
copilot/optimize-breaking-changes-checker-performance

Copilot AI commented Mar 26, 2026 •

edited by msyyc

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

catalinaperalta left a comment •

edited

Loading

Uh oh!

catalinaperalta Mar 28, 2026

Uh oh!

catalinaperalta Mar 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

Copilot AI commented Mar 26, 2026 • edited by msyyc Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AST caching and early exit

Algorithm fixes

Performance tests

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

catalinaperalta left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

catalinaperalta Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

catalinaperalta Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Copilot AI commented Mar 26, 2026 •

edited by msyyc

Loading

catalinaperalta left a comment •

edited

Loading