Multi-agent PR review: render stall fix, squad persistence, team naming & rename by PureWeen · Pull Request #203 · PureWeen/PolyPilot

PureWeen · 2026-02-23T22:10:22Z

Summary

Fixes render stalls during multi-agent reviews, adds PR Review Squad as a built-in preset, and improves team creation UX.

Render Stall Fix

Debounced JS draft capture — SafeRefreshAsync was calling JS.InvokeAsync("eval") every 50ms, flooding the WebView bridge during 10+ active sessions. Now debounced to every 3 seconds via _lastDraftCaptureTickMs
Event reordering — CompleteResponse now fires OnSessionComplete before OnStateChanged, ensuring HandleComplete populates completedSessions before the throttle check

Squad Group Persistence

CreateGroupFromPresetAsync now calls FlushSaveOrganization() immediately after the debounced save — prevents group loss when app is killed before the 2s timer fires

PR Review Squad (Built-in Preset)

5 workers with multi-model consensus (2+ models must agree)
Worker instructions include: CI status checking, existing review comment awareness, source code verification, git history tracing
SharedContext enforces review standards, RoutingContext handles PR round-robin

Team Creation UX

Name input — optional name field in preset picker (defaults to preset name)
Orchestrator auto-pin — orchestrator session pinned on creation, sorts to top of group
Group rename — inline rename via ✏️ menu item on any group

Input Bar Cleanup

Hidden for Orchestrator and OrchestratorReflect modes in both Dashboard and Sidebar

Tests

CreateGroupFromPresetAsync_PinsOrchestratorSession
BuiltInPresets_IncludePRReviewSquad
CompleteResponse_OnSessionComplete_FiresBeforeOnStateChanged (source-level regression guard)
All 1206 tests pass

Files Changed

Dashboard.razor — debounce, event ordering, input bar hide
CopilotService.Events.cs — completion event reordering
CopilotService.Organization.cs — flush, pin, nameOverride
SessionSidebar.razor + CSS — input bar hide, name input, group rename
ModelCapabilities.cs — PR Review Squad preset, WorkerReviewPrompt
SessionOrganizationTests.cs — new tests
RenderThrottleTests.cs — event ordering guard

When a session completes, SafeRefreshAsync was doing a JS interop round-trip to capture input drafts before every render. With many sessions active (multi-agent reviews), this delayed the final message from appearing for minutes — the JS await blocked the render pipeline while other sessions kept firing events and resetting the timer. Now skips the JS draft capture when completedSessions is non-empty, since the render is for showing final messages, not capturing input. The completedSessions set is cleared after 10 seconds, so normal draft capture resumes for subsequent renders. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@worker

Typing in the orchestrator session's chat box was going through SendPromptAsync (direct chat) instead of SendToMultiAgentGroupAsync. The orchestrator LLM would respond with @worker: assignments but no code parsed or dispatched them — workers never received tasks. Now SendFromCard detects if the session is an orchestrator in an Orchestrator/OrchestratorReflect group and routes through the full pipeline: plan → parse @worker: → dispatch → collect → synthesize. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The group input textarea and Send All button are now hidden when the mode is Orchestrator or OrchestratorReflect since users should type directly in the orchestrator session's chat (which now routes through the pipeline). The input is still shown for Broadcast and Sequential modes where there's no orchestrator session. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The render pipeline was stalling ('stuck on thinking') during multi-agent sessions because SafeRefreshAsync's JS interop eval was running on every 50ms render timer tick. With 10+ active sessions, the constant eval calls flooded the WebView JS bridge, delaying Blazor render diffs for minutes. Two fixes: 1. Debounce JS draft capture (3s minimum interval) — the eval that scrapes all input values from the DOM only needs to run periodically, not on every render cycle. Most render cycles just need StateHasChanged(). 2. Fire OnSessionComplete BEFORE OnStateChanged in CompleteResponse — this ensures HandleComplete populates completedSessions before RefreshState checks the throttle. Previously the throttle often rejected completion renders because completedSessions was still empty. 3. Revert INV-8 violation — removed the InvokeAsync+Task.Yield workaround from HandleComplete that deferred renders (violating the processing-state safety invariant). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

CreateGroupFromPresetAsync called SaveOrganization() which uses a 2s debounce timer. If the process is killed before the timer fires (e.g., relaunch, crash, or SIGTERM), the group is lost and all worker sessions become orphans reassigned to the default group on next startup. Now calls FlushSaveOrganization() immediately after the debounced save to ensure multi-agent group structure survives process termination. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Long-running sessions with heavy tool use hit the model's context window limit, causing 400 Bad Request errors after 5 retries. The Copilot CLI handles this via its compaction mechanism. The SDK exposes this as SessionConfig.InfiniteSessions with an InfiniteSessionConfig type. Setting Enabled=true triggers automatic background compaction when the context approaches the limit, and buffer-exhaustion compaction when it's nearly full. The SDK already emits SessionCompactionStartEvent and SessionCompactionCompleteEvent which PolyPilot handles as timeline-only events — no additional event handling needed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Compaction works without this flag (this session compacted twice at ~135k tokens before the flag was ever set). The InfiniteSessions.Enabled flag may actually mean 'infinite context / no compaction' rather than 'enable compaction', which would be counterproductive. Reverting until the SDK semantics are clarified. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The multi-agent-input-bar div was always rendered for multi-agent groups. For Orchestrator mode it was empty (taking up space), and for OrchestratorReflect it only showed the iterations control. Now: Orchestrator mode renders no bar at all, OrchestratorReflect renders just the iterations control, and Broadcast/Sequential modes still get the textarea + Send All button. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

5 workers with multi-model consensus review. Each worker dispatches 5 sub-agents (Opus, Opus, Sonnet, Gemini, Codex) and applies a 2+ model consensus filter. Orchestrator assigns PRs round-robin and produces a summary table with verdicts. Available in the preset picker for any worktree without needing a .squad/ directory. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ing guard - CreateGroupFromPresetAsync_PinsOrchestratorSession: verifies orchestrator is pinned and workers are not after multi-agent group creation - BuiltInPresets_IncludePRReviewSquad: validates PR Review Squad preset has 5 workers, correct mode, shared/routing context, and matching system prompts - CompleteResponse_OnSessionComplete_FiresBeforeOnStateChanged: source-level regression guard ensuring event ordering invariant - Fix preset count assertion (4 -> 5) in Scenario_CreateGroupFromPreset Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…p rename Worker instructions now include: - Check CI status (gh pr checks) and distinguish PR-specific vs infra failures - Check existing review comments to avoid duplicate feedback - Verify PR claims against actual source code and git history - Check for cross-platform build errors hidden in #if blocks Team creation UX: - Name input field in preset picker (optional, defaults to preset name) - CreateGroupFromPresetAsync accepts nameOverride parameter Group rename: - Inline rename via ✏️ Rename in group context menu (both repo and non-repo groups) - Enter to commit, Escape to cancel Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Two bugs caused PR Review Squad groups to disappear after relaunch: 1. CreateGroupFromPresetAsync (and CreateMultiAgentGroupAsync) flushed organization.json immediately but left active-sessions.json on the 2s debounce timer. If the process was killed before the timer fired, the squad sessions were missing from active-sessions.json on restart. 2. ReconcileOrganization (called during LoadOrganization before session restore) pruned any session not in knownNames — which includes active-sessions.json entries. Multi-agent sessions not yet in that file were stripped from org, orphaning the group. Fixes: - Flush active-sessions.json immediately after multi-agent group creation - Protect multi-agent group sessions from pruning in ReconcileOrganization (organization.json is the authoritative source for these sessions) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

KeepAliveSession.ShouldRender() returned false for hidden sessions when WarmWhenHidden was false. After CompleteResponse cleared IsProcessing and HandleComplete removed streamingBySession, WarmWhenHidden evaluated to false — so the slot suppressed all renders. The final message (in session.History) never painted until the user manually switched to that session. Fix: include completedSessions.Contains() in WarmWhenHidden so completed sessions get one warm render to paint the final message. The completedSessions set auto-clears after 10 seconds. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Three layered protections for multi-agent group persistence: 1. Remove ReconcileOrganization() call from LoadOrganization() — it ran with _sessions empty (before RestorePreviousSessionsAsync), and the orphan-fixing + pruning logic would damage org state. Reconciliation is already called explicitly after session restore (lines 403, 533). 2. Skip pruning when _sessions is empty — belt-and-suspenders guard in ReconcileOrganization itself. Even if called pre-restore by a future code path, it refuses to prune session metadata when there are zero active sessions in memory. 3. Multi-agent session protection (from prior commit) — sessions belonging to multi-agent groups are never pruned regardless of active-sessions.json state, since organization.json is their authoritative source. Updated tests: - Scenario_AppRestart_MultiAgentGroupSurvives: matches new startup sequence (no reconcile during load, reconcile after restore with sessions populated) - ReconcileOrganization_WithZeroActiveSessions_DoesNotPrune: verifies the empty-sessions guard prevents metadata destruction - Added AddDummySessions helper to simulate restored sessions in _sessions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

GetOrCreateRepoGroup matched on RepoId without excluding multi-agent groups. When a multi-agent group (e.g. PR Review Squad) had a RepoId, regular sessions auto-linked to the same repo were assigned into the multi-agent group, corrupting the sidebar layout. Fix: skip IsMultiAgent groups in both GetOrCreateRepoGroup and the repo-group existence check in ReconcileOrganization, so a proper non-multi-agent repo group is created instead. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Multi-model review (Opus, Sonnet 4.5, GPT-5.1-Codex) identified several remaining data loss vectors: 1. Sync CreateMultiAgentGroup only used debounced SaveOrganization — now calls FlushSaveOrganization() immediately. 2. DeleteGroup used debounced save for both org and active-sessions — now flushes both immediately to prevent deleted squads from resurrecting on restart. 3. WriteOrgFile and WriteActiveSessionsFile used File.WriteAllText directly — now use atomic write-to-temp-then-rename pattern to prevent corruption on crash mid-write. 4. Zero-session safety guard in ReconcileOrganization was positioned after repo-group additions could modify state — moved to top of method as a full early-return before any mutations. 5. Scenario_MixedSessions_ReconcileDoesNotScatter was vacuous (Opus finding) — test never called AddDummySessions, so the zero-session guard returned early and reconciliation logic was never exercised. Added 5 new regression tests: - GetOrCreateRepoGroup_SkipsMultipleMultiAgentGroups - CreateMultiAgentGroup_FlushesOrganizationImmediately - DeleteMultiAgentGroup_RemovesAllSessionsAndGroup - ReconcileOrganization_CalledTwice_NoCorruption - All ReconcileOrganization tests now properly populate _sessions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The ma-expanded-toolbar (position:absolute, top:0) and keep-alive-slot (position:absolute, top:0) both started at the same position, causing the group breadcrumb bar to overlap the session chat header. Fix: Use CSS :has() selector in global app.css to offset keep-alive slots by the toolbar height (41px) when the toolbar is present. The rule lives in app.css (not scoped CSS) because the toolbar is rendered by Dashboard.razor while keep-alive-slot is rendered by the child KeepAliveSession component with a different scope attribute. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

In Orchestrate and OrchestratorReflect modes, the orchestrator handles all dispatching. Individual worker session cards should not show their own input fields. Added HideInput parameter to SessionCard and set it when the parent group is in Orchestrator or OrchestratorReflect mode. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Tests were writing to the real organization.json, active-sessions.json, and other files in ~/.polypilot/ because CopilotService uses static lazy paths. Added SetBaseDirForTesting() to redirect all file I/O to a temp directory, called via ModuleInitializer before any tests run. This was the root cause of squad groups being destroyed after running tests — the test's org state (with TestGroup) overwrote the user's real organization. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Added 4 guard tests in TestIsolationGuardTests.cs that verify tests never touch the real ~/.polypilot/ directory: - BaseDir_IsNotRealPolypilotDir - BaseDir_PointsToTempDirectory - TestSetup_ModuleInitializer_HasRun - CreateGroup_DoesNotTouchRealOrgFile Updated copilot-instructions.md with detailed test safety section explaining the isolation mechanism and why it exists. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ExpandedSessionView (the full chat view shown on mobile and desktop expanded mode) always rendered the input-area regardless of orchestration mode. Added HideInput parameter to hide the entire input area for worker sessions in Orchestrator/OrchestratorReflect modes, matching the existing card view behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Remove incorrect HideInput parameter from SessionCard and ExpandedSessionView — per-session inputs must always be visible. Only group-level broadcast bars are hidden for Orchestrate/Reflect. Add UI Behavior section to multi-agent-orchestration.md documenting input visibility rules, per-mode element visibility, and message routing for orchestrator sessions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Create .claude/skills/android-wifi-deploy/SKILL.md with full playbook for WiFi ADB deployment on macOS: proxy setup, port discovery, Fast Deployment gotchas, critical invariants, and recovery procedures. Add warning pointer in copilot-instructions.md Android section to ensure agents see the guardrails before attempting WiFi deploy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- DeletePersistedSession: use atomic write (tmp+rename) matching WriteActiveSessionsFile pattern - ReconcileOrganization: scope zero-session guard to pre-init window when all sessions are closed. - Dispatch logging: remove plan response content from [DISPATCH] log to avoid leaking AI-generated content to stdout - Add 2 regression tests for zero-session guard behavior Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- SetBaseDirForTesting: clear _copilotBaseDir and _sessionStatePath so tests don't read from real ~/.copilot/session-state/ - RenameGroup: add FlushSaveOrganization() for consistency with other structural mutations (CreateMultiAgentGroup, DeleteGroup) - CreateGroupFromPresetAsync: add uniqueness loop for orchestrator and worker session names, matching CreateMultiAgentGroupAsync Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Read input value via getElementValue before setting renamingGroupId=null, preventing a re-render from removing the input element before the async JS call completes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen and others added 25 commits February 24, 2026 08:14

Pin orchestrator session on multi-agent group creation

3b56e00

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen force-pushed the reviewing_prs branch from 1ee61d5 to 7b5cc65 Compare February 24, 2026 14:19

PureWeen and others added 2 commits February 24, 2026 08:26

PureWeen merged commit 30a8f1f into main Feb 24, 2026

PureWeen deleted the reviewing_prs branch February 24, 2026 15:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

Multi-agent PR review: render stall fix, squad persistence, team naming & rename#203

Multi-agent PR review: render stall fix, squad persistence, team naming & rename#203
PureWeen merged 27 commits intomainfrom
reviewing_prs

PureWeen commented Feb 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments

Conversation

PureWeen commented Feb 23, 2026

Summary

Render Stall Fix

Squad Group Persistence

PR Review Squad (Built-in Preset)

Team Creation UX

Input Bar Cleanup

Tests

Files Changed

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant