Multi-agent PR review: render stall fix, squad persistence, team naming & rename#203
Merged
Multi-agent PR review: render stall fix, squad persistence, team naming & rename#203
Conversation
When a session completes, SafeRefreshAsync was doing a JS interop round-trip to capture input drafts before every render. With many sessions active (multi-agent reviews), this delayed the final message from appearing for minutes — the JS await blocked the render pipeline while other sessions kept firing events and resetting the timer. Now skips the JS draft capture when completedSessions is non-empty, since the render is for showing final messages, not capturing input. The completedSessions set is cleared after 10 seconds, so normal draft capture resumes for subsequent renders. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Typing in the orchestrator session's chat box was going through SendPromptAsync (direct chat) instead of SendToMultiAgentGroupAsync. The orchestrator LLM would respond with @worker: assignments but no code parsed or dispatched them — workers never received tasks. Now SendFromCard detects if the session is an orchestrator in an Orchestrator/OrchestratorReflect group and routes through the full pipeline: plan → parse @worker: → dispatch → collect → synthesize. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The group input textarea and Send All button are now hidden when the mode is Orchestrator or OrchestratorReflect since users should type directly in the orchestrator session's chat (which now routes through the pipeline). The input is still shown for Broadcast and Sequential modes where there's no orchestrator session. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The render pipeline was stalling ('stuck on thinking') during multi-agent
sessions because SafeRefreshAsync's JS interop eval was running on every
50ms render timer tick. With 10+ active sessions, the constant eval calls
flooded the WebView JS bridge, delaying Blazor render diffs for minutes.
Two fixes:
1. Debounce JS draft capture (3s minimum interval) — the eval that scrapes
all input values from the DOM only needs to run periodically, not on
every render cycle. Most render cycles just need StateHasChanged().
2. Fire OnSessionComplete BEFORE OnStateChanged in CompleteResponse — this
ensures HandleComplete populates completedSessions before RefreshState
checks the throttle. Previously the throttle often rejected completion
renders because completedSessions was still empty.
3. Revert INV-8 violation — removed the InvokeAsync+Task.Yield workaround
from HandleComplete that deferred renders (violating the processing-state
safety invariant).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CreateGroupFromPresetAsync called SaveOrganization() which uses a 2s debounce timer. If the process is killed before the timer fires (e.g., relaunch, crash, or SIGTERM), the group is lost and all worker sessions become orphans reassigned to the default group on next startup. Now calls FlushSaveOrganization() immediately after the debounced save to ensure multi-agent group structure survives process termination. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Long-running sessions with heavy tool use hit the model's context window limit, causing 400 Bad Request errors after 5 retries. The Copilot CLI handles this via its compaction mechanism. The SDK exposes this as SessionConfig.InfiniteSessions with an InfiniteSessionConfig type. Setting Enabled=true triggers automatic background compaction when the context approaches the limit, and buffer-exhaustion compaction when it's nearly full. The SDK already emits SessionCompactionStartEvent and SessionCompactionCompleteEvent which PolyPilot handles as timeline-only events — no additional event handling needed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Compaction works without this flag (this session compacted twice at ~135k tokens before the flag was ever set). The InfiniteSessions.Enabled flag may actually mean 'infinite context / no compaction' rather than 'enable compaction', which would be counterproductive. Reverting until the SDK semantics are clarified. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The multi-agent-input-bar div was always rendered for multi-agent groups. For Orchestrator mode it was empty (taking up space), and for OrchestratorReflect it only showed the iterations control. Now: Orchestrator mode renders no bar at all, OrchestratorReflect renders just the iterations control, and Broadcast/Sequential modes still get the textarea + Send All button. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
5 workers with multi-model consensus review. Each worker dispatches 5 sub-agents (Opus, Opus, Sonnet, Gemini, Codex) and applies a 2+ model consensus filter. Orchestrator assigns PRs round-robin and produces a summary table with verdicts. Available in the preset picker for any worktree without needing a .squad/ directory. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ing guard - CreateGroupFromPresetAsync_PinsOrchestratorSession: verifies orchestrator is pinned and workers are not after multi-agent group creation - BuiltInPresets_IncludePRReviewSquad: validates PR Review Squad preset has 5 workers, correct mode, shared/routing context, and matching system prompts - CompleteResponse_OnSessionComplete_FiresBeforeOnStateChanged: source-level regression guard ensuring event ordering invariant - Fix preset count assertion (4 -> 5) in Scenario_CreateGroupFromPreset Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…p rename Worker instructions now include: - Check CI status (gh pr checks) and distinguish PR-specific vs infra failures - Check existing review comments to avoid duplicate feedback - Verify PR claims against actual source code and git history - Check for cross-platform build errors hidden in #if blocks Team creation UX: - Name input field in preset picker (optional, defaults to preset name) - CreateGroupFromPresetAsync accepts nameOverride parameter Group rename: - Inline rename via ✏️ Rename in group context menu (both repo and non-repo groups) - Enter to commit, Escape to cancel Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two bugs caused PR Review Squad groups to disappear after relaunch: 1. CreateGroupFromPresetAsync (and CreateMultiAgentGroupAsync) flushed organization.json immediately but left active-sessions.json on the 2s debounce timer. If the process was killed before the timer fired, the squad sessions were missing from active-sessions.json on restart. 2. ReconcileOrganization (called during LoadOrganization before session restore) pruned any session not in knownNames — which includes active-sessions.json entries. Multi-agent sessions not yet in that file were stripped from org, orphaning the group. Fixes: - Flush active-sessions.json immediately after multi-agent group creation - Protect multi-agent group sessions from pruning in ReconcileOrganization (organization.json is the authoritative source for these sessions) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
KeepAliveSession.ShouldRender() returned false for hidden sessions when WarmWhenHidden was false. After CompleteResponse cleared IsProcessing and HandleComplete removed streamingBySession, WarmWhenHidden evaluated to false — so the slot suppressed all renders. The final message (in session.History) never painted until the user manually switched to that session. Fix: include completedSessions.Contains() in WarmWhenHidden so completed sessions get one warm render to paint the final message. The completedSessions set auto-clears after 10 seconds. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three layered protections for multi-agent group persistence: 1. Remove ReconcileOrganization() call from LoadOrganization() — it ran with _sessions empty (before RestorePreviousSessionsAsync), and the orphan-fixing + pruning logic would damage org state. Reconciliation is already called explicitly after session restore (lines 403, 533). 2. Skip pruning when _sessions is empty — belt-and-suspenders guard in ReconcileOrganization itself. Even if called pre-restore by a future code path, it refuses to prune session metadata when there are zero active sessions in memory. 3. Multi-agent session protection (from prior commit) — sessions belonging to multi-agent groups are never pruned regardless of active-sessions.json state, since organization.json is their authoritative source. Updated tests: - Scenario_AppRestart_MultiAgentGroupSurvives: matches new startup sequence (no reconcile during load, reconcile after restore with sessions populated) - ReconcileOrganization_WithZeroActiveSessions_DoesNotPrune: verifies the empty-sessions guard prevents metadata destruction - Added AddDummySessions helper to simulate restored sessions in _sessions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GetOrCreateRepoGroup matched on RepoId without excluding multi-agent groups. When a multi-agent group (e.g. PR Review Squad) had a RepoId, regular sessions auto-linked to the same repo were assigned into the multi-agent group, corrupting the sidebar layout. Fix: skip IsMultiAgent groups in both GetOrCreateRepoGroup and the repo-group existence check in ReconcileOrganization, so a proper non-multi-agent repo group is created instead. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Multi-model review (Opus, Sonnet 4.5, GPT-5.1-Codex) identified several remaining data loss vectors: 1. Sync CreateMultiAgentGroup only used debounced SaveOrganization — now calls FlushSaveOrganization() immediately. 2. DeleteGroup used debounced save for both org and active-sessions — now flushes both immediately to prevent deleted squads from resurrecting on restart. 3. WriteOrgFile and WriteActiveSessionsFile used File.WriteAllText directly — now use atomic write-to-temp-then-rename pattern to prevent corruption on crash mid-write. 4. Zero-session safety guard in ReconcileOrganization was positioned after repo-group additions could modify state — moved to top of method as a full early-return before any mutations. 5. Scenario_MixedSessions_ReconcileDoesNotScatter was vacuous (Opus finding) — test never called AddDummySessions, so the zero-session guard returned early and reconciliation logic was never exercised. Added 5 new regression tests: - GetOrCreateRepoGroup_SkipsMultipleMultiAgentGroups - CreateMultiAgentGroup_FlushesOrganizationImmediately - DeleteMultiAgentGroup_RemovesAllSessionsAndGroup - ReconcileOrganization_CalledTwice_NoCorruption - All ReconcileOrganization tests now properly populate _sessions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The ma-expanded-toolbar (position:absolute, top:0) and keep-alive-slot (position:absolute, top:0) both started at the same position, causing the group breadcrumb bar to overlap the session chat header. Fix: Use CSS :has() selector in global app.css to offset keep-alive slots by the toolbar height (41px) when the toolbar is present. The rule lives in app.css (not scoped CSS) because the toolbar is rendered by Dashboard.razor while keep-alive-slot is rendered by the child KeepAliveSession component with a different scope attribute. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In Orchestrate and OrchestratorReflect modes, the orchestrator handles all dispatching. Individual worker session cards should not show their own input fields. Added HideInput parameter to SessionCard and set it when the parent group is in Orchestrator or OrchestratorReflect mode. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tests were writing to the real organization.json, active-sessions.json, and other files in ~/.polypilot/ because CopilotService uses static lazy paths. Added SetBaseDirForTesting() to redirect all file I/O to a temp directory, called via ModuleInitializer before any tests run. This was the root cause of squad groups being destroyed after running tests — the test's org state (with TestGroup) overwrote the user's real organization. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Added 4 guard tests in TestIsolationGuardTests.cs that verify tests never touch the real ~/.polypilot/ directory: - BaseDir_IsNotRealPolypilotDir - BaseDir_PointsToTempDirectory - TestSetup_ModuleInitializer_HasRun - CreateGroup_DoesNotTouchRealOrgFile Updated copilot-instructions.md with detailed test safety section explaining the isolation mechanism and why it exists. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ExpandedSessionView (the full chat view shown on mobile and desktop expanded mode) always rendered the input-area regardless of orchestration mode. Added HideInput parameter to hide the entire input area for worker sessions in Orchestrator/OrchestratorReflect modes, matching the existing card view behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove incorrect HideInput parameter from SessionCard and ExpandedSessionView — per-session inputs must always be visible. Only group-level broadcast bars are hidden for Orchestrate/Reflect. Add UI Behavior section to multi-agent-orchestration.md documenting input visibility rules, per-mode element visibility, and message routing for orchestrator sessions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Create .claude/skills/android-wifi-deploy/SKILL.md with full playbook for WiFi ADB deployment on macOS: proxy setup, port discovery, Fast Deployment gotchas, critical invariants, and recovery procedures. Add warning pointer in copilot-instructions.md Android section to ensure agents see the guardrails before attempting WiFi deploy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- DeletePersistedSession: use atomic write (tmp+rename) matching WriteActiveSessionsFile pattern - ReconcileOrganization: scope zero-session guard to pre-init window when all sessions are closed. - Dispatch logging: remove plan response content from [DISPATCH] log to avoid leaking AI-generated content to stdout - Add 2 regression tests for zero-session guard behavior Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1ee61d5 to
7b5cc65
Compare
- SetBaseDirForTesting: clear _copilotBaseDir and _sessionStatePath so tests don't read from real ~/.copilot/session-state/ - RenameGroup: add FlushSaveOrganization() for consistency with other structural mutations (CreateMultiAgentGroup, DeleteGroup) - CreateGroupFromPresetAsync: add uniqueness loop for orchestrator and worker session names, matching CreateMultiAgentGroupAsync Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Read input value via getElementValue before setting renamingGroupId=null, preventing a re-render from removing the input element before the async JS call completes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes render stalls during multi-agent reviews, adds PR Review Squad as a built-in preset, and improves team creation UX.
Render Stall Fix
SafeRefreshAsyncwas callingJS.InvokeAsync("eval")every 50ms, flooding the WebView bridge during 10+ active sessions. Now debounced to every 3 seconds via_lastDraftCaptureTickMsCompleteResponsenow firesOnSessionCompletebeforeOnStateChanged, ensuringHandleCompletepopulatescompletedSessionsbefore the throttle checkSquad Group Persistence
CreateGroupFromPresetAsyncnow callsFlushSaveOrganization()immediately after the debounced save — prevents group loss when app is killed before the 2s timer firesPR Review Squad (Built-in Preset)
Team Creation UX
Input Bar Cleanup
Tests
CreateGroupFromPresetAsync_PinsOrchestratorSessionBuiltInPresets_IncludePRReviewSquadCompleteResponse_OnSessionComplete_FiresBeforeOnStateChanged(source-level regression guard)Files Changed
Dashboard.razor— debounce, event ordering, input bar hideCopilotService.Events.cs— completion event reorderingCopilotService.Organization.cs— flush, pin, nameOverrideSessionSidebar.razor+ CSS — input bar hide, name input, group renameModelCapabilities.cs— PR Review Squad preset, WorkerReviewPromptSessionOrganizationTests.cs— new testsRenderThrottleTests.cs— event ordering guard