Skip to content

ci: speed up Claude PR review to cut timeouts#1753

Merged
kevalmorabia97 merged 3 commits into
mainfrom
kmorabia/faster-claude-review
Jun 16, 2026
Merged

ci: speed up Claude PR review to cut timeouts#1753
kevalmorabia97 merged 3 commits into
mainfrom
kmorabia/faster-claude-review

Conversation

@kevalmorabia97

@kevalmorabia97 kevalmorabia97 commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

What does this PR do?

Type of change: Bug fix (CI/workflow)

The Claude Code Review job frequently hits the 30-min timeout. I analyzed several recent runs to find why and tuned the workflow to reduce it.

Root cause (from log analysis): the dominant cost is the inference proxy throttling the job429/503/529 responses trigger SDK exponential backoff, which ate ~22 of 30 minutes in the timeout run. It is not PR size: the same 6-file PR ranged 12.5m → 23.7m → 30.5m timeout. Turn count is also not the constraint — a 71-turn run finished in 11.5m while a 25-turn run timed out at 28m (slow turns scattered from the start, consistent with throttling, not context growth).

The lever within the workflow's control is total request volume (each tool round-trip is a separate uncached, rate-limited request). This PR reduces it:

  • Batch independent tool calls into single turns
  • Conditional pre-reads — only read a sub-package's mode.py/config.py/__init__.py when the diff touches registration/config/public API (was unconditional)
  • Scoped diffs instead of one giant gh pr diff; hunks-only reads; don't re-read lines the diff already shows
  • Prioritize modelopt/ > examples/ > tests/; cap large (>50-file) PRs at ~15 files
  • Restrict cross-file symbol tracing to new/renamed public symbols
  • Post findings as you go so partial runs still deliver value
  • Block subagent fan-out (--disallowedTools "Task"); drop --max-turns (the data showed it was the wrong lever)

Testing

Workflow-only change; validated by the run-log analysis described above. Real effect will be visible on the next /claude review invocations.

Out of scope

The dominant remaining fix is infra-side and not addressable here: raise the proxy rate-limit/quota and re-enable prompt caching (DISABLE_PROMPT_CACHING). Tracking separately with the proxy team.

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: N/A (CI workflow change)
  • Did you update Changelog?: N/A
  • Did you get Claude approval on this PR?: ❌ (pending)

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Chores
    • Improved the automated code review workflow’s handling of review requests and review scope, including safer treatment of user-provided comment text.
    • Updated the review strategy to use tighter, prioritized investigation and reduce redundant reads, with safeguards for large pull requests.

Note: Internal infrastructure change only—no direct impact on end-user features or functionality.

The Claude review job frequently hit the 30-min timeout. Log analysis of
several runs (incl. a 30.5m timeout) showed the dominant cost is the inference
proxy throttling the job (429/503/529 → SDK backoff ate ~22 of 30 min), not the
PR size — the same 6-file PR ranged 12.5m to a 30.5m timeout. Turn count was not
the constraint (a 71-turn run finished in 11.5m; a 25-turn run timed out).

Reduce request volume (what trips the rate limit) within the workflow:
- Batch independent tool calls into single turns
- Make sub-package mode.py/config.py/__init__.py pre-reads conditional on the
  diff actually touching registration/config/public API
- Scoped diffs instead of one giant 'gh pr diff'; hunks-only reads; don't
  re-read lines the diff already shows
- Prioritize modelopt/ > examples/ > tests/; cap large (>50-file) PRs at ~15 files
- Restrict cross-file symbol tracing to new/renamed public symbols
- Post findings as you go so partial runs still deliver value
- Block subagent fan-out (--disallowedTools Task); drop --max-turns (wrong lever)

The dominant remaining fix is infra-side (raise proxy quota, re-enable prompt
caching) and is out of scope for this workflow change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
@kevalmorabia97 kevalmorabia97 requested a review from a team as a code owner June 16, 2026 18:59
@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 844e93f6-ca87-42f5-b9a7-e6b24399d452

📥 Commits

Reviewing files that changed from the base of the PR and between 1e4db20 and 29f8820.

📒 Files selected for processing (1)
  • .github/workflows/claude_review.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/claude_review.yml

📝 Walkthrough

Walkthrough

The Claude Code Review GitHub Actions workflow prompt is updated to add input safety handling for the triggering comment, disallow the Task tool, rewrite mandatory workflow instructions to batch reads and conditionally inspect sub-package files, and expand the investigation budget section with explicit path prioritization, redundant-read rules, batching guidance, and scoped diff retrieval.

Changes

Claude Review Workflow Prompt & Budget Refinement

Layer / File(s) Summary
Input safety and environment setup
.github/workflows/claude_review.yml
Adds COMMENT_BODY environment variable from the triggering issue comment with explicit safety wording directing Claude to treat it as untrusted, review-scoped text.
Tool policy and mandatory workflow rewrite
.github/workflows/claude_review.yml
Updates Claude action configuration to disallow the Task tool and rewrites the mandatory workflow section to batch independent reads and conditionally read modelopt/torch/<sub-package> files only when the diff touches mode registration, config schema, or public exports.
Investigation budget rules and scoped diff retrieval
.github/workflows/claude_review.yml
Expands investigation budget guidance with explicit path prioritization (modelopt/, examples/, tests/), redundant-read avoidance rules, independent tool-call batching, large-PR file-opening caps with risk-based coverage, and scoped git diff retrieval replacing full-file/full-diff approaches.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title concisely and accurately captures the main objective of the PR: optimizing the Claude PR review workflow to prevent 30-minute timeouts through performance improvements.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Security Anti-Patterns ✅ Passed This is a workflow-only PR modifying .github/workflows/claude_review.yml. No Python code changes in modelopt/ or examples/ packages, no new PIP dependencies, and no security-sensitive patterns. Che...

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch kmorabia/faster-claude-review

Comment @coderabbitai help to get the list of available commands and usage tips.

Pass the trigger comment body into the prompt so reviewers can scope a run,
e.g. '/claude review only modelopt/torch files' or 'focus on the export path'.
Previously the comment text was discarded (the prompt was a static template),
so any extra instructions were ignored.

The comment is injected as untrusted input via the COMMENT_BODY env var and the
prompt instructs Claude to treat it only as review-scope guidance, never as
instructions to alter the procedure or approval logic.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
@kevalmorabia97

Copy link
Copy Markdown
Collaborator Author

/claude review

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
.github/workflows/claude_review.yml (1)

88-100: ⚡ Quick win

Prompt injection risk despite mitigation.

While the instructions explicitly tell Claude to treat the comment as untrusted and scope-only, natural-language constraints are notoriously bypassable through prompt injection techniques. A carefully crafted comment could potentially manipulate Claude to skip review steps, alter approval logic, or focus on misleading areas.

Current mitigations (trusted-user restriction, explicit instructions, tool restrictions) reduce but don't eliminate this risk. Consider additional safeguards:

  • Parse the comment body for an explicit allow-list of scope keywords (e.g., "modelopt", "examples", "tests", file patterns) and reject or sanitize other content before substitution.
  • Log the full comment body and Claude's interpreted scope in the workflow output for audit.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/claude_review.yml around lines 88 - 100, The workflow
currently substitutes the COMMENT_BODY environment variable directly into the
reviewer_comment section without validation, creating a prompt injection
vulnerability. Add a validation step before the Claude review task that parses
the comment body and extracts only whitelisted scope keywords (such as file
paths, directory names, or explicit patterns like "modelopt", "tests",
"examples"). Reject or sanitize any content that doesn't match the allow-list.
Additionally, add logging to the workflow output that captures both the full
original comment body and the sanitized scope interpretation that gets passed to
Claude, enabling audit trails and detection of attempted injection attacks. This
validation should occur before the comment is substituted into the
reviewer_comment variable used in the prompt.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In @.github/workflows/claude_review.yml:
- Around line 88-100: The workflow currently substitutes the COMMENT_BODY
environment variable directly into the reviewer_comment section without
validation, creating a prompt injection vulnerability. Add a validation step
before the Claude review task that parses the comment body and extracts only
whitelisted scope keywords (such as file paths, directory names, or explicit
patterns like "modelopt", "tests", "examples"). Reject or sanitize any content
that doesn't match the allow-list. Additionally, add logging to the workflow
output that captures both the full original comment body and the sanitized scope
interpretation that gets passed to Claude, enabling audit trails and detection
of attempted injection attacks. This validation should occur before the comment
is substituted into the reviewer_comment variable used in the prompt.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 84c8e1f9-c2f6-49a9-a650-510f98370b79

📥 Commits

Reviewing files that changed from the base of the PR and between f4896ca and 1e4db20.

📒 Files selected for processing (1)
  • .github/workflows/claude_review.yml

@codecov

codecov Bot commented Jun 16, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.45%. Comparing base (1cccf66) to head (29f8820).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1753   +/-   ##
=======================================
  Coverage   58.45%   58.45%           
=======================================
  Files         510      510           
  Lines       56274    56274           
=======================================
  Hits        32896    32896           
  Misses      23378    23378           
Flag Coverage Δ
unit 54.35% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Review Summary

Findings: CRITICAL: 0 · IMPORTANT: 1 · SUGGESTION: 0

This is a workflow-only change that tunes the /claude review prompt to cut latency/timeouts. The intent is sound and most of the change is prompt-text refinement (batching reads, scoped diffs, conditional pre-reads, file-coverage caps, disallowing the Task tool). One change, however, would backfire against the PR's own goal.

Most impactful finding

[IMPORTANT Compatibility] The newly-recommended three-dot git diff BASE...HEAD hard-fails in this workflow's runtime. The job checks out shallow (fetch-depth: 1) and does a single git fetch origin <base_ref>, so there is no common merge base — three-dot diff errors with fatal: origin/main...HEAD: no merge base. I reproduced this in a clone with the same shallow state. Since step 2 makes this the primary diff command, every review would hit the error, spend turns recovering, and fall back — increasing the turn count/latency this PR aims to reduce. The prior gh pr diff (GitHub API, depth-independent) had no such failure mode.

Fix: keep gh pr diff with --name-only/--patch (shallow-safe, path-filterable), or switch to two-dot git diff origin/<base> HEAD -- ... (works in the shallow clone), or deepen the fetch. See the inline comment for details and trade-offs.

Risk assessment

Low overall risk — CI-only, backward compatible, no source/checkpoint/export impact. The one issue is a correctness bug in a recommended command that undermines the PR's latency objective rather than a safety problem; worth fixing before merge so the optimization actually lands.

🤖 Generated with Claude Code

Comment thread .github/workflows/claude_review.yml Outdated
The review job checks out shallow (fetch-depth: 1) and fetches only the base
tip, so there is no local merge base. The previously recommended three-dot
'git diff <BASE>...HEAD' would fail with 'no merge base' on every run and waste
turns recovering. Switch the scoped diff to two-dot 'git diff <BASE> HEAD --
<paths>', which compares the two tree tips directly and is shallow-safe while
still path-scopable. Caught by the Claude review on this PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
@kevalmorabia97

Copy link
Copy Markdown
Collaborator Author

/claude review focus on the .github/workflows/claude_review.yml changes — verify the two-dot diff fix (29f8820) resolves the shallow-checkout 'no merge base' failure, and re-check the prompt edits for any other commands that would fail in this workflow's shallow runtime.

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Review — focused re-check (two-dot diff fix + shallow-runtime safety)

Findings: CRITICAL: 0 · IMPORTANT: 0 · SUGGESTION: 1

Scope applied per the reviewer's request: verify that commit 29f8820 (two-dot git diff) resolves the shallow-checkout 'no merge base' failure, and re-check the prompt edits for any other commands that would fail in this workflow's shallow runtime.

✅ Prior IMPORTANT finding is resolved

I reproduced the workflow's exact runtime state locally (depth-1 checkout of the head SHA plus git fetch origin main, giving git rev-list --count HEAD = 1 with origin/main present as a remote-tracking ref) and confirmed:

  • Two-dot 'git diff origin/main HEAD -- .github/' → succeeds, emits the correct diff.
  • Three-dot 'git diff origin/main...HEAD' → fails with exactly 'fatal: origin/main...HEAD: no merge base'.

The fix in step 2 (line 192, 'git diff HEAD -- modelopt/ examples/') is the working form. The latency-regression risk flagged in the previous review is eliminated.

✅ No other shallow-unsafe commands are instructed

Every command the prompt actually directs the reviewer to run is depth-independent:

  • 'gh pr view … comments,reviews' / '… files …' and 'gh pr review …' are GitHub-API calls (depth-independent).
  • The only instructed git command is the verified two-dot diff.
  • 'git show' / 'git log' appear in --allowedTools but are not steered by any workflow step, so nothing pushes the reviewer toward a shallow-failing invocation.

SUGGESTION (non-blocking)

Two-dot 'git diff origin/main HEAD' compares tree states, so if main advances on a scoped path after the PR branches, unrelated upstream changes can appear in the scoped diff as apparent reversions. For this CI-only PR there is no divergence and the output is clean, and two-dot is strictly better than the hard-failing three-dot form, so this does not block. If noise is ever observed, 'gh pr diff' (already in --allowedTools) is API-based, depth-independent, and yields the exact PR diff — a fine fallback. Not required for merge.

Risk assessment

Low. CI-only, backward compatible, no source/checkpoint/export impact. The previously-flagged blocking issue is verified fixed in the shallow runtime; no remaining blocking issues.

Claude review passed — no blocking issues found. LGTM

🤖 Generated with Claude Code

@kevalmorabia97 kevalmorabia97 enabled auto-merge (squash) June 16, 2026 19:46
@kevalmorabia97 kevalmorabia97 merged commit 07ce8e5 into main Jun 16, 2026
37 checks passed
@kevalmorabia97 kevalmorabia97 deleted the kmorabia/faster-claude-review branch June 16, 2026 20:40
@github-actions

Copy link
Copy Markdown
Contributor
PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-16 20:40 UTC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants