Skip to content

Comments

Multi-agent PR review: render stall fix, squad persistence, team naming & rename#203

Merged
PureWeen merged 27 commits intomainfrom
reviewing_prs
Feb 24, 2026
Merged

Multi-agent PR review: render stall fix, squad persistence, team naming & rename#203
PureWeen merged 27 commits intomainfrom
reviewing_prs

Conversation

@PureWeen
Copy link
Owner

Summary

Fixes render stalls during multi-agent reviews, adds PR Review Squad as a built-in preset, and improves team creation UX.

Render Stall Fix

  • Debounced JS draft captureSafeRefreshAsync was calling JS.InvokeAsync("eval") every 50ms, flooding the WebView bridge during 10+ active sessions. Now debounced to every 3 seconds via _lastDraftCaptureTickMs
  • Event reorderingCompleteResponse now fires OnSessionComplete before OnStateChanged, ensuring HandleComplete populates completedSessions before the throttle check

Squad Group Persistence

  • CreateGroupFromPresetAsync now calls FlushSaveOrganization() immediately after the debounced save — prevents group loss when app is killed before the 2s timer fires

PR Review Squad (Built-in Preset)

  • 5 workers with multi-model consensus (2+ models must agree)
  • Worker instructions include: CI status checking, existing review comment awareness, source code verification, git history tracing
  • SharedContext enforces review standards, RoutingContext handles PR round-robin

Team Creation UX

  • Name input — optional name field in preset picker (defaults to preset name)
  • Orchestrator auto-pin — orchestrator session pinned on creation, sorts to top of group
  • Group rename — inline rename via ✏️ menu item on any group

Input Bar Cleanup

  • Hidden for Orchestrator and OrchestratorReflect modes in both Dashboard and Sidebar

Tests

  • CreateGroupFromPresetAsync_PinsOrchestratorSession
  • BuiltInPresets_IncludePRReviewSquad
  • CompleteResponse_OnSessionComplete_FiresBeforeOnStateChanged (source-level regression guard)
  • All 1206 tests pass

Files Changed

  • Dashboard.razor — debounce, event ordering, input bar hide
  • CopilotService.Events.cs — completion event reordering
  • CopilotService.Organization.cs — flush, pin, nameOverride
  • SessionSidebar.razor + CSS — input bar hide, name input, group rename
  • ModelCapabilities.cs — PR Review Squad preset, WorkerReviewPrompt
  • SessionOrganizationTests.cs — new tests
  • RenderThrottleTests.cs — event ordering guard

PureWeen and others added 25 commits February 24, 2026 08:14
When a session completes, SafeRefreshAsync was doing a JS interop
round-trip to capture input drafts before every render. With many
sessions active (multi-agent reviews), this delayed the final message
from appearing for minutes — the JS await blocked the render pipeline
while other sessions kept firing events and resetting the timer.

Now skips the JS draft capture when completedSessions is non-empty,
since the render is for showing final messages, not capturing input.
The completedSessions set is cleared after 10 seconds, so normal
draft capture resumes for subsequent renders.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Typing in the orchestrator session's chat box was going through
SendPromptAsync (direct chat) instead of SendToMultiAgentGroupAsync.
The orchestrator LLM would respond with @worker: assignments but
no code parsed or dispatched them — workers never received tasks.

Now SendFromCard detects if the session is an orchestrator in an
Orchestrator/OrchestratorReflect group and routes through the full
pipeline: plan → parse @worker: → dispatch → collect → synthesize.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The group input textarea and Send All button are now hidden when
the mode is Orchestrator or OrchestratorReflect since users should
type directly in the orchestrator session's chat (which now routes
through the pipeline). The input is still shown for Broadcast and
Sequential modes where there's no orchestrator session.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The render pipeline was stalling ('stuck on thinking') during multi-agent
sessions because SafeRefreshAsync's JS interop eval was running on every
50ms render timer tick. With 10+ active sessions, the constant eval calls
flooded the WebView JS bridge, delaying Blazor render diffs for minutes.

Two fixes:

1. Debounce JS draft capture (3s minimum interval) — the eval that scrapes
   all input values from the DOM only needs to run periodically, not on
   every render cycle. Most render cycles just need StateHasChanged().

2. Fire OnSessionComplete BEFORE OnStateChanged in CompleteResponse — this
   ensures HandleComplete populates completedSessions before RefreshState
   checks the throttle. Previously the throttle often rejected completion
   renders because completedSessions was still empty.

3. Revert INV-8 violation — removed the InvokeAsync+Task.Yield workaround
   from HandleComplete that deferred renders (violating the processing-state
   safety invariant).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CreateGroupFromPresetAsync called SaveOrganization() which uses a 2s
debounce timer. If the process is killed before the timer fires (e.g.,
relaunch, crash, or SIGTERM), the group is lost and all worker sessions
become orphans reassigned to the default group on next startup.

Now calls FlushSaveOrganization() immediately after the debounced save
to ensure multi-agent group structure survives process termination.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Long-running sessions with heavy tool use hit the model's context window
limit, causing 400 Bad Request errors after 5 retries. The Copilot CLI
handles this via its compaction mechanism.

The SDK exposes this as SessionConfig.InfiniteSessions with an
InfiniteSessionConfig type. Setting Enabled=true triggers automatic
background compaction when the context approaches the limit, and
buffer-exhaustion compaction when it's nearly full.

The SDK already emits SessionCompactionStartEvent and
SessionCompactionCompleteEvent which PolyPilot handles as timeline-only
events — no additional event handling needed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Compaction works without this flag (this session compacted twice at ~135k
tokens before the flag was ever set). The InfiniteSessions.Enabled flag
may actually mean 'infinite context / no compaction' rather than 'enable
compaction', which would be counterproductive. Reverting until the SDK
semantics are clarified.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The multi-agent-input-bar div was always rendered for multi-agent groups.
For Orchestrator mode it was empty (taking up space), and for
OrchestratorReflect it only showed the iterations control.

Now: Orchestrator mode renders no bar at all, OrchestratorReflect
renders just the iterations control, and Broadcast/Sequential modes
still get the textarea + Send All button.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
5 workers with multi-model consensus review. Each worker dispatches
5 sub-agents (Opus, Opus, Sonnet, Gemini, Codex) and applies a 2+
model consensus filter. Orchestrator assigns PRs round-robin and
produces a summary table with verdicts.

Available in the preset picker for any worktree without needing a
.squad/ directory.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ing guard

- CreateGroupFromPresetAsync_PinsOrchestratorSession: verifies orchestrator is
  pinned and workers are not after multi-agent group creation
- BuiltInPresets_IncludePRReviewSquad: validates PR Review Squad preset has 5
  workers, correct mode, shared/routing context, and matching system prompts
- CompleteResponse_OnSessionComplete_FiresBeforeOnStateChanged: source-level
  regression guard ensuring event ordering invariant
- Fix preset count assertion (4 -> 5) in Scenario_CreateGroupFromPreset

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…p rename

Worker instructions now include:
- Check CI status (gh pr checks) and distinguish PR-specific vs infra failures
- Check existing review comments to avoid duplicate feedback
- Verify PR claims against actual source code and git history
- Check for cross-platform build errors hidden in #if blocks

Team creation UX:
- Name input field in preset picker (optional, defaults to preset name)
- CreateGroupFromPresetAsync accepts nameOverride parameter

Group rename:
- Inline rename via ✏️ Rename in group context menu (both repo and non-repo groups)
- Enter to commit, Escape to cancel

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two bugs caused PR Review Squad groups to disappear after relaunch:

1. CreateGroupFromPresetAsync (and CreateMultiAgentGroupAsync) flushed
   organization.json immediately but left active-sessions.json on the
   2s debounce timer. If the process was killed before the timer fired,
   the squad sessions were missing from active-sessions.json on restart.

2. ReconcileOrganization (called during LoadOrganization before session
   restore) pruned any session not in knownNames — which includes
   active-sessions.json entries. Multi-agent sessions not yet in that
   file were stripped from org, orphaning the group.

Fixes:
- Flush active-sessions.json immediately after multi-agent group creation
- Protect multi-agent group sessions from pruning in ReconcileOrganization
  (organization.json is the authoritative source for these sessions)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
KeepAliveSession.ShouldRender() returned false for hidden sessions
when WarmWhenHidden was false. After CompleteResponse cleared
IsProcessing and HandleComplete removed streamingBySession,
WarmWhenHidden evaluated to false — so the slot suppressed all
renders. The final message (in session.History) never painted until
the user manually switched to that session.

Fix: include completedSessions.Contains() in WarmWhenHidden so
completed sessions get one warm render to paint the final message.
The completedSessions set auto-clears after 10 seconds.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three layered protections for multi-agent group persistence:

1. Remove ReconcileOrganization() call from LoadOrganization() — it ran
   with _sessions empty (before RestorePreviousSessionsAsync), and the
   orphan-fixing + pruning logic would damage org state. Reconciliation
   is already called explicitly after session restore (lines 403, 533).

2. Skip pruning when _sessions is empty — belt-and-suspenders guard in
   ReconcileOrganization itself. Even if called pre-restore by a future
   code path, it refuses to prune session metadata when there are zero
   active sessions in memory.

3. Multi-agent session protection (from prior commit) — sessions belonging
   to multi-agent groups are never pruned regardless of active-sessions.json
   state, since organization.json is their authoritative source.

Updated tests:
- Scenario_AppRestart_MultiAgentGroupSurvives: matches new startup sequence
  (no reconcile during load, reconcile after restore with sessions populated)
- ReconcileOrganization_WithZeroActiveSessions_DoesNotPrune: verifies the
  empty-sessions guard prevents metadata destruction
- Added AddDummySessions helper to simulate restored sessions in _sessions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GetOrCreateRepoGroup matched on RepoId without excluding multi-agent
groups. When a multi-agent group (e.g. PR Review Squad) had a RepoId,
regular sessions auto-linked to the same repo were assigned into the
multi-agent group, corrupting the sidebar layout.

Fix: skip IsMultiAgent groups in both GetOrCreateRepoGroup and the
repo-group existence check in ReconcileOrganization, so a proper
non-multi-agent repo group is created instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Multi-model review (Opus, Sonnet 4.5, GPT-5.1-Codex) identified several
remaining data loss vectors:

1. Sync CreateMultiAgentGroup only used debounced SaveOrganization —
   now calls FlushSaveOrganization() immediately.

2. DeleteGroup used debounced save for both org and active-sessions —
   now flushes both immediately to prevent deleted squads from
   resurrecting on restart.

3. WriteOrgFile and WriteActiveSessionsFile used File.WriteAllText
   directly — now use atomic write-to-temp-then-rename pattern to
   prevent corruption on crash mid-write.

4. Zero-session safety guard in ReconcileOrganization was positioned
   after repo-group additions could modify state — moved to top of
   method as a full early-return before any mutations.

5. Scenario_MixedSessions_ReconcileDoesNotScatter was vacuous (Opus
   finding) — test never called AddDummySessions, so the zero-session
   guard returned early and reconciliation logic was never exercised.

Added 5 new regression tests:
- GetOrCreateRepoGroup_SkipsMultipleMultiAgentGroups
- CreateMultiAgentGroup_FlushesOrganizationImmediately
- DeleteMultiAgentGroup_RemovesAllSessionsAndGroup
- ReconcileOrganization_CalledTwice_NoCorruption
- All ReconcileOrganization tests now properly populate _sessions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The ma-expanded-toolbar (position:absolute, top:0) and keep-alive-slot
(position:absolute, top:0) both started at the same position, causing
the group breadcrumb bar to overlap the session chat header.

Fix: Use CSS :has() selector in global app.css to offset keep-alive
slots by the toolbar height (41px) when the toolbar is present. The
rule lives in app.css (not scoped CSS) because the toolbar is rendered
by Dashboard.razor while keep-alive-slot is rendered by the child
KeepAliveSession component with a different scope attribute.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In Orchestrate and OrchestratorReflect modes, the orchestrator handles
all dispatching. Individual worker session cards should not show their
own input fields. Added HideInput parameter to SessionCard and set it
when the parent group is in Orchestrator or OrchestratorReflect mode.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tests were writing to the real organization.json, active-sessions.json,
and other files in ~/.polypilot/ because CopilotService uses static
lazy paths. Added SetBaseDirForTesting() to redirect all file I/O to a
temp directory, called via ModuleInitializer before any tests run.

This was the root cause of squad groups being destroyed after running
tests — the test's org state (with TestGroup) overwrote the user's
real organization.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Added 4 guard tests in TestIsolationGuardTests.cs that verify tests
never touch the real ~/.polypilot/ directory:
- BaseDir_IsNotRealPolypilotDir
- BaseDir_PointsToTempDirectory
- TestSetup_ModuleInitializer_HasRun
- CreateGroup_DoesNotTouchRealOrgFile

Updated copilot-instructions.md with detailed test safety section
explaining the isolation mechanism and why it exists.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ExpandedSessionView (the full chat view shown on mobile and desktop
expanded mode) always rendered the input-area regardless of
orchestration mode. Added HideInput parameter to hide the entire
input area for worker sessions in Orchestrator/OrchestratorReflect
modes, matching the existing card view behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove incorrect HideInput parameter from SessionCard and
ExpandedSessionView — per-session inputs must always be visible.
Only group-level broadcast bars are hidden for Orchestrate/Reflect.

Add UI Behavior section to multi-agent-orchestration.md documenting
input visibility rules, per-mode element visibility, and message
routing for orchestrator sessions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Create .claude/skills/android-wifi-deploy/SKILL.md with full playbook
for WiFi ADB deployment on macOS: proxy setup, port discovery, Fast
Deployment gotchas, critical invariants, and recovery procedures.

Add warning pointer in copilot-instructions.md Android section to
ensure agents see the guardrails before attempting WiFi deploy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- DeletePersistedSession: use atomic write (tmp+rename) matching
  WriteActiveSessionsFile pattern
- ReconcileOrganization: scope zero-session guard to pre-init window
  when all sessions are closed.
- Dispatch logging: remove plan response content from [DISPATCH] log
  to avoid leaking AI-generated content to stdout
- Add 2 regression tests for zero-session guard behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen and others added 2 commits February 24, 2026 08:26
- SetBaseDirForTesting: clear _copilotBaseDir and _sessionStatePath
  so tests don't read from real ~/.copilot/session-state/
- RenameGroup: add FlushSaveOrganization() for consistency with
  other structural mutations (CreateMultiAgentGroup, DeleteGroup)
- CreateGroupFromPresetAsync: add uniqueness loop for orchestrator
  and worker session names, matching CreateMultiAgentGroupAsync

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Read input value via getElementValue before setting renamingGroupId=null,
preventing a re-render from removing the input element before the async
JS call completes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen merged commit 30a8f1f into main Feb 24, 2026
@PureWeen PureWeen deleted the reviewing_prs branch February 24, 2026 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant