feat: add terminal dashboard app with CLI launcher #686

@christso

Objective

Create a keyboard-first local terminal dashboard for AgentV that lets users browse run history, recent summaries, and per-run details while keeping the user-facing install and launch experience simple.

Architecture Boundary

Separate package, unified launcher

Prefer implementing the terminal dashboard as a dedicated app/package (for example apps/tui) rather than folding the full runtime into the existing CLI command package.

Reasoning:

  • cleaner ownership boundaries for UI-specific code
  • easier navigation for AI coding agents and humans
  • lower risk of mixing command-runner concerns with long-lived TUI runtime concerns
  • easier parallel development as the UI grows beyond a narrow prompt flow

Even with a separate package, the user-facing experience should stay unified:

  • npm install agentv
  • agentv <command>
  • no extra package install required for dashboard/TUI users

The CLI should remain the launcher surface. The separate package is an internal architecture choice, not a user-facing packaging burden.
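
One way to picture the launcher boundary described above is a small registry in the CLI that lazily loads an internal UI package only when its command is invoked. This is a minimal sketch, not AgentV code: `SurfaceModule`, `SurfaceLoader`, `resolveSurface`, and `launch` are hypothetical names, and the package specifier mentioned in the comment is an assumption.

```typescript
// Hypothetical sketch: none of these names come from the AgentV codebase.
type SurfaceModule = { run: (argv: string[]) => Promise<number> };
type SurfaceLoader = () => Promise<SurfaceModule>;

// Look up the lazy loader registered for a user-facing subcommand.
function resolveSurface(
  command: string,
  loaders: Map<string, SurfaceLoader>,
): SurfaceLoader | undefined {
  return loaders.get(command);
}

// Load the internal package on demand and hand control to it.
// In a real setup a loader might be `() => import("@agentv/tui")` (assumed name),
// with that package shipped as a dependency of the `agentv` package.
async function launch(
  command: string,
  argv: string[],
  loaders: Map<string, SurfaceLoader>,
): Promise<number> {
  const load = resolveSurface(command, loaders);
  if (load === undefined) {
    console.error(`Unknown command: ${command}`);
    return 1; // non-zero exit code for an unknown command
  }
  const surface = await load(); // loaded only when the user asks for it
  return surface.run(argv);
}
```

Because the loader is resolved at call time, users who never open the dashboard would not pay its startup cost, and the CLI package would only need to know the command name and the loader.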

Current Ground Truth

  • Current AgentV main has no dedicated TUI app/package yet.
  • The existing interactive flow is an Inquirer-based wizard for selecting evals/targets, not a full terminal dashboard.
  • Latest main now has agentv results summary, agentv results failures, and agentv results show, which are the best near-term artifact/query surface to reuse.
  • AgentV already ships agentv serve from the CLI package, which establishes the pattern that CLI can launch richer local UI surfaces.
  • The current browser review UI in agentv serve already implements a useful content model that the TUI should borrow rather than reinvent.

V1 Content Model

The first TUI version should explicitly mirror the current agentv serve results-review surface at a terminal-appropriate level.

View 1: Overview

Show aggregate run stats equivalent to the current browser review UI:

  • total tests
  • passed
  • failed
  • execution errors
  • pass rate
  • total duration
  • token usage
  • estimated cost

When multiple targets are present, also show a per-target summary table with:

  • target name
  • pass rate
  • passed / failed / errors
  • average score
  • duration
  • tokens
  • cost

Include a compact score distribution view if feasible in the terminal.
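
The aggregate and per-target stats listed above could be computed by one UI-agnostic function that both views share. The sketch below assumes a flat `TestResult` record; the field names, the `Status` values, and the choice to count execution errors in the pass-rate denominator are all assumptions for illustration, not the actual AgentV artifact schema.

```typescript
// Hypothetical result shape; the real AgentV artifact fields may differ.
type Status = "passed" | "failed" | "error";

interface TestResult {
  testId: string;
  target: string;
  status: Status;
  score: number; // assumed to be in the range 0..1
  durationMs: number;
  tokens: number;
  costUsd: number;
}

interface Summary {
  total: number;
  passed: number;
  failed: number;
  errors: number;
  passRate: number;
  averageScore: number;
  durationMs: number;
  tokens: number;
  costUsd: number;
}

// Aggregate stats for any set of results (a whole run or one target).
function summarize(results: TestResult[]): Summary {
  const count = (status: Status) => results.filter((r) => r.status === status).length;
  const sum = (pick: (r: TestResult) => number) =>
    results.reduce((acc, r) => acc + pick(r), 0);
  const total = results.length;
  const passed = count("passed");
  return {
    total,
    passed,
    failed: count("failed"),
    errors: count("error"),
    // Assumption: errors count against the pass rate.
    passRate: total === 0 ? 0 : passed / total,
    averageScore: total === 0 ? 0 : sum((r) => r.score) / total,
    durationMs: sum((r) => r.durationMs),
    tokens: sum((r) => r.tokens),
    costUsd: sum((r) => r.costUsd),
  };
}

// Per-target summary table: group by target, then reuse summarize().
function summarizeByTarget(results: TestResult[]): Map<string, Summary> {
  const groups = new Map<string, TestResult[]>();
  for (const r of results) {
    const bucket = groups.get(r.target) ?? [];
    bucket.push(r);
    groups.set(r.target, bucket);
  }
  const out = new Map<string, Summary>();
  groups.forEach((rows, target) => out.set(target, summarize(rows)));
  return out;
}
```

Reusing `summarize` for each target keeps the overview numbers and the per-target table consistent by construction.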

View 2: Test List

Show a filterable/sortable test list equivalent to the current browser review UI table.

Per row, include at minimum:

  • status
  • test id
  • target when relevant
  • overall score
  • evaluator columns or a compact evaluator summary
  • duration
  • cost

Support filtering by at least:

  • status
  • target when relevant
  • text search by test id
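
The three filters above compose naturally as a single predicate over the row list. The following is a sketch under assumed names (`TestRow`, `RowFilter`, `filterRows`); the case-insensitive substring match on test id is one possible interpretation of "text search", not a stated requirement.

```typescript
// Hypothetical row and filter shapes for illustration only.
type RowStatus = "passed" | "failed" | "error";

interface TestRow {
  testId: string;
  target: string;
  status: RowStatus;
}

interface RowFilter {
  status?: RowStatus;
  target?: string;
  search?: string; // matched against testId, case-insensitively (assumption)
}

// Keep rows that satisfy every filter that is set; unset filters match all rows.
function filterRows<T extends TestRow>(rows: T[], filter: RowFilter): T[] {
  const needle = filter.search?.trim().toLowerCase();
  return rows.filter(
    (row) =>
      (filter.status === undefined || row.status === filter.status) &&
      (filter.target === undefined || row.target === filter.target) &&
      (!needle || row.testId.toLowerCase().includes(needle)),
  );
}
```

Keeping the filter as plain data means a keybinding handler only has to update a `RowFilter` object and re-render, and the same function could back the browser table.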

View 3: Test Detail

Selecting a test should open a detail view or detail pane with the same high-value review content already present in agentv serve:

  • input preview
  • output preview
  • evaluator score breakdown
  • passed/failed expectations or assertions
  • execution error details when relevant
  • lightweight metadata such as timing/target identifiers

For v1, this view should be limited to read-only review/debug content.

Design Latitude

A dedicated apps/tui package is preferred for the first substantial implementation.

Prefer reusing existing AgentV result/history abstractions and output artifacts over inventing a new plugin system. If shared dashboard data/query logic is needed, keep that layer UI-agnostic so it can support browser and terminal surfaces.

The TUI should reuse the current agentv serve content logic where practical:

  • aggregate stats computation
  • per-target aggregation
  • per-test row shaping
  • evaluator score extraction
  • expectation/assertion rendering data
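
As an illustration of keeping row shaping UI-agnostic, the sketch below turns a raw result into a plain display row that either a browser table or a terminal list could render. The `RawResult` and `DisplayRow` shapes and the `shapeRow` helper are hypothetical; the real `agentv serve` logic may be organized differently.

```typescript
// Hypothetical shapes; not taken from the agentv serve implementation.
interface EvaluatorScore {
  name: string;
  score: number; // assumed to be in the range 0..1
}

interface RawResult {
  testId: string;
  target: string;
  status: "passed" | "failed" | "error";
  score: number;
  evaluators: EvaluatorScore[];
}

interface DisplayRow {
  testId: string;
  target: string;
  status: string;
  score: string;
  evaluatorSummary: string;
}

// Produce renderer-independent strings; the UI decides layout and color.
function shapeRow(result: RawResult): DisplayRow {
  return {
    testId: result.testId,
    target: result.target,
    status: result.status,
    score: result.score.toFixed(2),
    // Compact evaluator summary, e.g. "accuracy 0.90, tone 0.50".
    evaluatorSummary: result.evaluators
      .map((e) => `${e.name} ${e.score.toFixed(2)}`)
      .join(", "),
  };
}
```

Because `shapeRow` returns only plain data, it would carry no dependency on any terminal or browser rendering library.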

Renderer choice is intentionally open. If the TUI needs a dedicated renderer/runtime boundary, options such as OpenTUI or a schema-driven Ink renderer like @json-render/ink are both in bounds.

The same architectural direction likely applies to browser UI over time: agentv serve may remain the CLI entrypoint, but the browser dashboard/runtime should eventually be able to live in its own package (for example apps/wui) rather than being permanently owned by CLI internals.

Acceptance Signals

  • agentv exposes a keyboard-first terminal dashboard entrypoint.
  • The TUI implementation lives in a dedicated package/app or is clearly structured so it can be isolated without major churn.
  • The dashboard can read existing AgentV run artifacts or the same history storage used by dashboard/reporting features.
  • The TUI provides these core review surfaces:
    • overview stats
    • test list
    • per-test detail
  • The per-test detail includes evaluator breakdown plus failed expectations/assertions or execution-error information.
  • The UI is usable entirely in the terminal.
  • npm install agentv users do not need a second manual install step to use the dashboard.
  • Shared data-loading logic, if introduced, is reusable by other dashboard surfaces.

Non-Goals

  • Replacing or merging with the web dashboard work in feat: self-hosted dashboard — historical trends, dataset management, YAML editor #563.
  • Designing a general third-party plugin architecture for the dashboard.
  • Moving core evaluation logic into UI code.
  • Forcing users to install separate UI packages manually.
  • Full parity with the broader web dashboard roadmap on the first pass.
  • Historical trends, dataset browser, and live SSE parity in the first TUI version.
  • Making model-in-the-loop UI generation a requirement for the first version.
