feat(mcp): rewrite dashboard authoring prompts; expose filters on save tool by alex-fedotyev · Pull Request #2264 · hyperdxio/hyperdx

alex-fedotyev · 2026-05-12T22:28:19Z

Summary

Drew shipped the schemas in #2273 (dashboard-level filters, per-tile onClick, drill-down navigation). This PR ships the prompts and examples that teach an MCP-using agent how to use those primitives correctly. Without it, an LLM building a HyperDX dashboard via MCP has the API but doesn't know when or how to call it.

What lands

create_dashboard design checklist. A ten-rule scannable preamble at the top of the prompt covers RED columns with aliases, per-series numberFormat for durations, groupByColumnsOnLeft for inventory tables, dashboard-level filters instead of per-tile where literals, one-metric-per-tile for metric sources, and containers and tabs for grouping. Replaces the wall-of-JSON canonical example in the old prompt so the model picks the right pattern in one read.

Four verified dashboard_examples patterns. service_inventory, service_detail, log_analytics, backend_dependencies. Each carries a "When to use" header and a "Why this shape" note. Each was built and rendered on a live dev stack with realistic synthetic data (90K traces across 8 services, 17K logs with severity distribution) before landing. That verification surfaced two design issues now captured as explicit rules: chart-level numberFormat on a table mixing counts and durations formats every value as a duration (per-series numberFormat is the fix), and metric tiles take exactly one select item because the renderer destructures select[0] and ignores the rest (multi-metric authoring needs one tile per metric).

query_guide additions. New sections for DASHBOARD FILTERS (canonical { type, name, expression, sourceId, where?, whereLanguage? } shape), NUMBER FORMAT (per-series vs chart-level distinction), and PER-TILE TYPE CONSTRAINTS (the metric one-select rule).

hyperdx_save_dashboard filters parameter. Widened the tool's input handling to accept both ExternalDashboardFilter and ExternalDashboardFilterWithId shapes via the same body schemas the v2 REST handler already uses (createDashboardBodySchema, updateDashboardBodySchema). MCP and REST surfaces stay in lockstep; the existing convertExternalFiltersToInternal helper handles the conversion without a translation layer.

service_inventory row-click wiring. Concrete worked example of how to compose Drew's onClick primitive: the Services table in the service_inventory example carries an onClick that drills into the partner Service Detail dashboard. Uses mode: "template" with the constant template "Service Detail" so the destination resolves by name at click time and there's no order-of-save dependency between the two dashboards.

Post-Claude-run prompt refinements. Three follow-up commits capture bugs the v1 prompts surfaced when Claude actually authored dashboards against them. The groupBy alias gap (LLM kept emitting SELECT toStartOfMinute(t) AS time without aliasing the groupBy column). A broader lucene-gotcha note. Inlined concrete examples for the patterns that were ambiguous on first pass.

Voice pass. Em-dashes dropped from buildSourceSummary and from the dashboard authoring prompt content. Snapshot test guards regressions.

Co-authorship and rebase

The branch composes Drew's MCP onClick schemas and prompts commit (78e72652) on top of his already-merged #2273. Drew authored that commit; the service_inventory row-click wiring on top is the concrete demonstration of the foundation he built. Drew's commit is preserved in the chain to keep attribution intact. Drew's chore: Remove IS_DASHBOARD_LINKING_ENABLED toggle dropped during rebase as already-upstream via #2273.

Schema design note

mcpDashboardFilterSchema and mcpFiltersParam use the inline-schema design from main (the version Drew's #2273 shipped). An earlier version of this branch defined mcpFiltersParam as z.array(externalDashboardFilterSchemaWithId.partial({ id: true })) to reuse the canonical REST schema, but that lost Drew's id-management description (concrete LLM guidance on filter-id handling for create vs update flows). After rebase, the refactor commit aligned back to main's design to preserve those semantics. The wider input type on saveDashboard.ts still ships, so REST/MCP type parity is unaffected.

Test plan

yarn workspace @hyperdx/api tsc --noEmit clean
yarn workspace @hyperdx/api lint clean (prettier + eslint + tsc + openapi lint)
prose-lint.py clean on every changed file
Built every example dashboard on a live dev stack and confirmed every tile renders with realistic synthetic data
Snapshot tests guard: design checklist present, four patterns listed, each example carries "When to use", zero em-dashes in any prompt string
Filter round-trip via MCP: save with two filters, get back, update with one renamed filter, fetch, assert shape
Filter rejection: save with bogus sourceId returns 4xx
Backward compat: save without filters returns filters: []
Green CI across lint, unit, integration, knip, e2e shards 1-4, ClickHouse Bundle Build, Vercel (handled on push)

changeset-bot · 2026-05-12T22:28:23Z

🦋 Changeset detected

Latest commit: 00d360d

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages

Name	Type
@hyperdx/api	Patch
@hyperdx/app	Patch
@hyperdx/otel-collector	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

vercel · 2026-05-12T22:28:24Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
hyperdx-oss	Ready	Preview, Comment	May 20, 2026 4:08pm

github-actions · 2026-05-12T22:28:50Z

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

Large diff: 1405 production lines changed (threshold: 1000)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats

Production files changed: 5
Production lines changed: 1405 (+ 916 in test files, excluded from tier calculation)
Branch: alex/HDX-mcp-prompts-overhaul
Author: alex-fedotyev

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

github-actions · 2026-05-12T22:31:11Z

PR Review

This PR is mostly prompt content + a new filters parameter on hyperdx_save_dashboard. The prompt rewrite is well-snapshot-tested, the filter round-trip path mirrors the REST v2 schemas, and the onClick validation tests survived intact. One concrete behavior concern below.

⚠️ assignFilterIds (saveDashboard.ts:126-140) silently mints a fresh ObjectId for any filter missing an id on UPDATE → contradicts the prompt's own "MUST carry an id" contract (content.ts:36-44, schemas.ts:641-650). If an LLM forgets to carry the id of an existing filter, it gets a new id and silently orphans its savedFilterValues. The REST PUT path rejects this; the MCP path absorbs it. Either fail loud when an unknown id is missing on a filter that doesn't already exist on the dashboard, or at least diff against existingFilterIds (already in scope at saveDashboard.ts:501-503) and only auto-mint when the body has more filters than persisted. Worth covering with a test of "update payload omits id on a filter that exists in persisted set" → expect either error or id preserved.
ℹ️ Minor: mcpDashboardFilterSchema.id is plain z.string().optional() with no objectIdSchema validation (schemas.ts:637-650), while sourceId on the same schema uses objectIdSchema. A malformed id passed on UPDATE flows straight through assignFilterIds (since length > 0) into Mongo. Low risk, but tightening this would catch typos before they persist.
ℹ️ drilldown_links example carries <TARGET_DASHBOARD_ID> as a placeholder string (content.ts:656). The example header tells the user to swap it, but an agent that copies verbatim will hit Could not find the following onClick dashboard IDs: <TARGET_DASHBOARD_ID> rather than a schema error — fine if that's the intended UX, since the message is actionable.

No security or critical correctness issues. Ship-ready modulo the filter-id auto-mint footgun.

github-actions · 2026-05-12T22:37:37Z

E2E Test Results

✅ All tests passed • 178 passed • 3 skipped • 1229s

Status	Count
✅ Passed	178
❌ Failed	0
⚠️ Flaky	4
⏭️ Skipped	3

Tests ran across 4 shards in parallel.

View full report →

github-actions · 2026-05-12T22:48:30Z

Deep Review

Scope: PR #2264 against base 205919f5 — MCP dashboard prompts overhaul + hyperdx_save_dashboard filter input widening (7 files, +1472/−864).

Reviewers ran: correctness, testing, maintainability, project-standards, agent-native, learnings, api-contract, kieran-typescript, adversarial.

✅ No critical issues found.

🟡 P2 — recommended

packages/api/src/mcp/tools/dashboards/saveDashboard.ts:126 — On update, assignFilterIds mints a fresh ObjectId for any filter missing an id; convertExternalFiltersToInternal then mints another (since the just-minted id is not in existingFilterIds), so every "forgot ids" update rotates every filter id. The schema warning at schemas.ts:637-650 claims this "orphans savedFilterValues," but savedFilterValues is keyed by {type, condition} not by filter id, so the documented harm cannot occur — the real harm (silent id churn that invalidates id-keyed UI state and the table-tile onClick filter→expression binding contract this PR ships) is not surfaced to the agent.
- Fix: Either reject update payloads that omit id on a filter the dashboard still has, or match by (expression, name, sourceId) against existingDashboard.filters to reuse the prior id; then rewrite the schema description to match the chosen behavior.
- _{correctness, api-contract, adversarial}
packages/api/src/mcp/tools/dashboards/schemas.ts:637 — The MCP filter schema's .describe() and the new prompt content (content.ts:38) teach the model to "preserve savedFilterValues bound to that id," but the hyperdx_save_dashboard inputSchema never exposes savedFilterValues (or savedQuery / savedQueryLanguage), and saveDashboard.ts:288 hard-codes resolveSavedQueryLanguage({savedQuery: undefined, savedQueryLanguage: undefined}); the agent is told to defend a field it cannot author. The pre-existing parity gap is unchanged, but this PR's prompt rewrite is the first place the gap becomes a documented promise.
- Fix: Either expose savedQuery, savedQueryLanguage, and savedFilterValues on the MCP inputSchema (plumbing them into $set on update) or scrub the prompt + schema descriptions of references to fields the tool cannot drive.
- _{agent-native, kieran-typescript}
packages/api/src/mcp/__tests__/dashboards.test.ts:1359 — The new normalization helpers stripFilterIds and assignFilterIds are exercised only through happy-path integration tests; the dangerous case (update payload that omits ids on previously-saved filters) is not covered, and the PR-description claim "save with bogus sourceId returns 4xx" has no test that drives a filter (vs. tile) sourceId. A regression that flipped the orphan-vs-preserve behavior would still pass.
- Fix: Add a save→update round-trip that resubmits both existing filters without ids and asserts the contract you chose (reject, or id-stable by expression), plus a test that saves with filters: [{ ..., sourceId: "<unknown>" }] and asserts isError: true.
- _{testing, correctness}

🔵 P3 nitpicks (8)

packages/api/src/mcp/tools/dashboards/saveDashboard.ts:137 — assignFilterIds performs an ObjectId allocation that convertExternalFiltersToInternal always discards on update, since the just-minted id never matches existingFilterIds; the helper exists only to satisfy updateDashboardBodySchema's id-required check.
- Fix: Collapse the two id-mint sites into one — either drop assignFilterIds and let convertExternalFiltersToInternal own id generation, or have convertExternalFiltersToInternal skip its mint when the id was already assigned at the boundary.
- _{correctness, api-contract}
packages/api/src/mcp/tools/dashboards/saveDashboard.ts:299 — On create, savedFilterValues: parsed.data.savedFilterValues is spread unconditionally, but the MCP inputSchema does not expose savedFilterValues, so this is always undefined; the update path correctly guards on !== undefined. Harmless today, but the inconsistency masks the missing-input bug if the schema later exposes the field.
- Fix: Spread savedFilterValues conditionally on create to match the update path.
- _correctness
packages/api/src/mcp/tools/dashboards/saveDashboard.ts:114 — inputFilters is typed (ExternalDashboardFilter | ExternalDashboardFilterWithId)[], but ExternalDashboardFilter is just ExternalDashboardFilterWithId with id omitted; the union forces four as casts inside the two helpers.
- Fix: Type the param as a single shape with id?: string, then destructure { id, ...rest } without casts.
- _{kieran-typescript}
packages/api/src/mcp/tools/dashboards/schemas.ts:635 — mcpDashboardFilterSchema re-declares every field of externalDashboardFilterSchemaWithId and is not .strict(), while the REST counterpart is; field additions in common-utils will silently drift between the two surfaces.
- Fix: Build the MCP shape from externalDashboardFilterSchemaWithId.partial({ id: true }) and layer per-field .describe() on top, or add a compile-time TypeEqual assert.
- _{maintainability, api-contract}
packages/api/src/mcp/tools/dashboards/saveDashboard.ts:457 — filterChangedHeatmapTiles intentionally skips the heatmap source-kind gate when a tile's sourceId and displayType are unchanged, so a heatmap whose source was externally narrowed from trace to something else persists across any MCP-driven update; pre-existing REST behavior, now reachable from the MCP surface.
- Fix: Either accept the trade-off and document it (test that pins the behavior), or run the source-kind gate against the persisted heatmap set on every update.
- _adversarial
packages/api/src/mcp/tools/dashboards/saveDashboard.ts:288 — Helper getSourceConnectionMismatches runs on the REST POST/PUT path but is not exported from utils/dashboards.ts, so the MCP save path cannot invoke it; a raw-SQL tile whose sourceId and connectionId belong to different connections will be persisted via MCP and rejected on the next REST round-trip.
- Fix: Export the helper and call it from both createDashboard and updateDashboard in saveDashboard.ts.
- _agent-native
packages/api/src/mcp/prompts/dashboards/content.ts:712 — The raw-SQL macro reference list ($__timeFilter, $__timeInterval, $__filters, …) appears in three places (schemas.ts:585, content.ts:712, content.ts:871); a future macro rename has to touch all three to avoid teaching the model an inconsistent macro set.
- Fix: Extract a single SQL_MACRO_REFERENCE constant and import it at all three sites.
- _{maintainability}
packages/api/src/mcp/__tests__/dashboards.test.ts:1414 — The new having field on table tiles has a single save-and-assert-from-response test; there is no get-dashboard round-trip and no .max(10000) rejection assertion, so a body-schema regression that drops having from persistence would pass.
- Fix: Add a save→get assertion on having and a rejection test for >10000-char input.
- _testing

Reviewers (9): correctness, testing, maintainability, project-standards, agent-native, learnings, api-contract, kieran-typescript, adversarial.

Testing gaps:

No direct unit tests for stripFilterIds / assignFilterIds (empty array, mixed-id, empty-string-id, undefined branches).
No regression test for the MCP-vs-REST parity gap on getSourceConnectionMismatches.
No assertion that save-without-filters returns filters: [] (the backward-compat claim in the PR body).

teeohhem · 2026-05-19T13:54:27Z

@alex-fedotyev I noticed some recent changes. Is this ready to review?

…ilters on save tool The `create_dashboard` prompt now leads with a ten-rule design checklist (RED columns with aliases, per-series numberFormat for durations, groupByColumnsOnLeft for inventory tables, dashboard-level filters instead of per-tile where literals, one-metric-per-tile for metric sources, containers and tabs for grouping). The wall-of-JSON canonical example is gone; the four dashboard_examples patterns carry the concrete shapes. The dashboard_examples set is replaced with four verified patterns (service_inventory, service_detail, log_analytics, backend_dependencies) plus the existing infrastructure_sql. Each non-SQL example leads with a "When to use" header and a "Why this shape" note so the model picks by intent, not by surface keyword match. Examples were built and rendered on a live dev stack before landing, which surfaced two design issues now captured as rules in the prompt: - chart-level numberFormat on a table that mixes counts and durations formats every value as a duration; per-series numberFormat is the fix. - metric tiles take exactly one select item (renderer destructures select[0] and ignores the rest). The query_guide prompt gains a DASHBOARD FILTERS section that documents the filters: [{ type, name, expression, sourceId, where?, whereLanguage? }] shape, a NUMBER FORMAT section that explains the per-series vs. chart-level distinction, and a PER-TILE TYPE CONSTRAINTS note about the metric one-select rule. hyperdx_save_dashboard now accepts `filters` on its input schema, reusing externalDashboardFilterSchemaWithId so the MCP and REST surfaces stay in lockstep and the existing convertExternalFiltersToInternal helper handles the conversion without translation. Filters round-trip through create, get, and update. Voice pass: every prompt string is now em-dash-free, with a snapshot test guarding regressions. Drill-down behavior (onClick from a service inventory row into the service detail dashboard) is a separate follow-up PR; landing the prompt rewrite first.

The buildSourceSummary helper is the source-list block that buildCreateDashboardPrompt prepends to the prompt body. It still carried em-dashes from before the voice pass, so the assembled `create_dashboard` prompt that the MCP transport returns to a client contained eight em-dashes even though every builder in content.ts is clean. Caught by an end-to-end check that fetched the prompt through the live MCP transport and counted em-dashes in the response payload.

…rompts claude-review flagged: - `service_detail` example uses `having: "StatusMessage != ''"` on a table tile, but mcpTableTileSchema.config does not include `having`. Zod's `.strip()` silently drops the field, so an LLM following the example through MCP would save a table that includes empty-message rows. Added `having: z.string().max(10000).optional()` mirroring externalDashboardTableChartConfigSchema and a round-trip test. - Heatmap tile's numberFormat describe still referenced `output: "time"` while the rest of the file uses `output: "duration"`. Aligned to "duration" for consistency with the prompt guidance. deep-review additionally flagged: - query_guide told the model that metric tiles take "1 select item with metricName + metricType", but mcpTileSelectItemSchema does not expose those fields, so the keys are silently stripped. Replaced the metric authoring guidance with an explicit "not currently exposed via the MCP schema; use raw SQL for infrastructure metrics" note. Same edit on the design-checklist rule 9 (replaced with a "replace, not merge" update-semantic note that addresses a separate review concern about partial-payload data loss on update). - mcpFiltersParam advertised id as optional via .partial({ id: true }), but createDashboardBodySchema rejects id (strict, no id) and updateDashboardBodySchema requires id. So an LLM copy-pasting a filter from get-dashboard into create, or adding a brand-new filter on update without an id, would hit a confusing strict-validation rejection. saveDashboard.ts now normalizes per flow: stripFilterIds() on create, assignFilterIds() on update. - Em-dash regression check did not cover buildQueryGuidePrompt (largest prompt, most likely to regress) or buildSourceSummary (helpers.ts carried em-dashes through the initial PR until fix `3539649e`). Added assertions for both. - Bad-sourceId test asserted only `text.toContain('source')`, which matches almost any error string. Tightened to assert "Could not find source IDs" plus the literal bad id. - Numbered-rule check used `prompt.toContain('${i}.')` which would match substrings like "1.2s" or "0.000000001". Anchored the check to line start (`/^${i}\\. /m`) and scoped it to the DESIGN CHECKLIST section. - DASHBOARD FILTERS and NUMBER FORMAT body content was not asserted beyond the heading. Now asserts substantive content (QUERY_EXPRESSION / expression / sourceId for filters; factor: 0.000000001 / duration / per-series for number format). - Filter round-trip did not cover the freshly-generated-id branch of convertExternalFiltersToInternal at the MCP layer. Added a test that ships a new no-id filter on update and asserts saveDashboard assigns a fresh id.

…oard The service_inventory pattern's Services table now carries an onClick that drills into the partner "Service Detail" dashboard. Using mode: "template" with the constant template "Service Detail" resolves the target by name at click time, so the canonical flow (save both dashboards in one shot) works without needing to thread the detail dashboard's ID back into the inventory tile. The onClick filter's expression "ServiceName" matches the service_detail filter declaration, so the destination dashboard's "Service" dropdown auto-populates with the clicked row's value rather than asking the user to re-select it. Snapshot test asserts the template target, filter expression, and template field shape so a future refactor cannot silently break the inventory -> detail drill-down link.

Worked Claude through five starter dashboards on a fresh stack; this pass folds the patterns it got wrong back into the prompts. DESIGN CHECKLIST changes: - Rule 2 (alias) now reads ALIAS EVERY SELECT ITEM, not ALIAS EVERY AGGREGATION. Claude landed three number tiles with no alias on the quantile() select item; rule 2 as written felt table-specific. - Rule 10 (containers) is now REQUIRED at five-plus tiles, not a soft hint. Claude built five dashboards averaging ten tiles each with zero containers when the rule was a SHOULD. - New rule 11 VALIDATE EVERY TILE AFTER SAVE, separated from the workflow step so it has its own anchor in the checklist. - New rule 12 NO TITLE-RECAP MARKDOWN TILE. Claude added a 24x2 "About this dashboard" markdown tile to every starter dashboard; the renderer styles markdown headings at title-bar scale so the tile ate a row of vertical space before the first KPI showed. TILE TYPE GUIDE: markdown line rewritten to actively discourage the title-recap habit (use containers/tabs for sectioning, skip the markdown tile entirely on starters, no headings inside markdown bodies if you must add one). QUERY GUIDE additions: - LUCENE FILTER SYNTAX gains a GOTCHA block: Lucene comparison operators and wildcards on dotted attribute paths (http.status_code:>=500, http.route:*) parse and save without error but fail silently at query time. Workaround: switch to SQL with bracket access. This was the cause of every "Error loading chart" Claude hit during its run. - PER-TILE TYPE CONSTRAINTS expanded for metric sources: builder tile authoring on metric sources currently renders as "Both table name and UUID are empty" at query time. Use a raw SQL tile with explicit table reference (otel_metrics_gauge / otel_metrics_sum / otel_metrics_histogram) until the builder catches up. Discovery note added because metric sources don't publish mapAttributeKeys the way log and trace sources do. - TABLE TILE LINKING gains a GROUPBY ALIASES AND ROW-CLICK TEMPLATES subsection. Builder tiles don't expose a groupBy alias today, so groupBy: SpanAttributes['http.route'] produces a column name no Handlebars template can reference as {{Route}}. Workaround: configType: "sql" with explicit AS Route. - COMMON MISTAKES rule 8 (validate-after-save) tightened to "every tile, not just one". Rule 13 (missing alias) and rule 14 (Lucene comparison on map paths) added with concrete wrong/correct examples. Rules 11/12 onClick entries were duplicate-numbered in the cherry-pick; renumbered to 15-18. - New REFERENCES section at the end of the query guide pins three ClickStack / ClickHouse doc URLs: search syntax for Lucene, the SQL reference for the where field with whereLanguage:"sql", and the sql-visualizations page for the $__timeFilter / $__timeInterval / $__filters macros used in raw SQL tiles. DEFAULT TIME WINDOW: short note in the create prompt that dashboards open at 15 minutes and the user-side time picker is currently the only way to widen the window. Tells the agent to surface this to the user rather than padding individual tile time ranges to compensate. Snapshot tests extended to lock the new rules in place: design checklist now iterates 1 to 12, asserts on the strengthened phrasings ("ALIAS EVERY SELECT ITEM", "REQUIRED at five or more tiles", "VALIDATE EVERY TILE AFTER SAVE", "NO TITLE-RECAP MARKDOWN TILE"), plus per-section coverage tests for the GROUPBY ALIASES workaround, the Lucene gotcha (in both LUCENE FILTER SYNTAX and COMMON MISTAKES), the metric-source workaround text, and the three reference URLs.

…ond pass Watched Claude run on a fresh stack with the previous prompt revision; this commit closes the remaining gaps surfaced by that run. DESIGN CHECKLIST changes: - Rule 2 (alias) now ships a copy-pasteable number-tile example next to the rule body. The prior phrasing got better tables but Claude still dropped aliases on every builder number tile across three dashboards (P50/P95/P99 latency, Server Requests, Failed Spans, Log Errors). Concrete shape next to the text is what makes it stick. - Rule 10 (containers) now ships a copy-pasteable containers/tabs block. The prior REQUIRED phrasing did not bite at all; Claude built five dashboards with 9-10 tiles each and zero containers. The model reads "REQUIRED" as a soft hint when the next concrete shape is two files away in the dashboard_examples prompt. Inline the shape and the tile -> containerId wiring directly in the rule. WORKFLOW changes: - Now six steps. Step 2 says "list existing dashboards with hyperdx_get_dashboard, fetch one or two, learn local idioms before designing", which Claude's reflection asked for. Step 4 says "sketch tiles, THEN group them into 2-4 containers before assembling the save payload" so the container decision happens at the same level as picking source kind, not buried in a checklist. QUERY GUIDE changes: - LUCENE FILTER SYNTAX gotcha broadened from "comparison and wildcard on dotted map-attribute paths" to "ALL Lucene operations on map-attribute paths". Claude saw db.system:mongodb (simple equality) also fail to translate to SpanAttributes['db.system'] on this round. The previous phrasing said "simple equality on a dotted path is fine"; that was wrong. The corrected guidance is: use SQL with bracket access for any operation on map-attribute keys. - Added a FUZZY-MATCH NOTE alongside the gotcha: Lucene field:value on top-level string columns translates to ilike(field, '%value%'), not equality. That is fine for free-text columns but surprising for enum-like ones (SpanKind, SeverityText, StatusCode). Claude hit this on SpanKind:Server matching broader strings than intended. The canonical fix is whereLanguage: "sql" with =. - COMMON MISTAKES rule 14 rewritten to cover any-operation, with db.system:mongodb as the equality example. - Rule 15 added for the fuzzy-substring trap on top-level enum-like columns. - Rule 16 added for the empty-string trap on map-attribute groupBy: ClickHouse map columns return '' for unset keys, so groupBy on a sparse attribute produces an empty bucket alongside real values. Workaround: where: "SpanAttributes['<key>'] != ''" alongside the groupBy. Comes from Claude's MongoDB dashboard work where the empty collection-name bucket was the loudest visual noise. - Renumbered the onClick mistakes from 15-18 to 17-20 to make room. Tests extended to assert: workflow has six numbered steps and names the read-existing and group-into-containers steps explicitly; rule 2 carries the Server Requests example; rule 10 carries the kpis/trends/ errors shape; LUCENE FILTER SYNTAX mentions both wildcard and equality gotchas plus the FUZZY-MATCH NOTE; COMMON MISTAKES has dedicated rules for the map-attr-any-operation gotcha, the enum-like fuzzy-match trap, and the empty-string map-attr trap. All 6 example patterns still save and query cleanly end-to-end (48 tiles checked through hyperdx_query_tile).

Adds a new rule 3 GROUP BY HAS NO ALIAS HOOK that calls out the schema limit on the v2 chart config: groupBy is a single expression string, so the renderer uses it verbatim as the table column header and as the raw column name in CSV export, tooltips, orderBy, and onClick template references. A table grouped by SpanAttributes['http.route'] renders arrayElement(SpanAttributes, 'http.route') as the header. The rule pushes the model toward a top-level column when one carries the same semantic (SpanName / ServiceName / SeverityText) and falls back to a raw SQL tile with AS alias when only a Map attribute is available. Second-pass dashboards consistently produced ugly Map-expression headers on App-Performance tables even though every alias rule on the select side was already in the checklist; the gap was the missing acknowledgement that groupBy has no alias hook. Also fixes a pre-existing assertion in dashboards.test.ts left over from the rewrite: the test expected the phrase "not currently exposed via the MCP" which is no longer in the PER-TILE TYPE CONSTRAINTS section. Anchors on the surviving "Authoring builder tiles on a metric source is not reliable" and "MCP select-item shape does not carry" phrasing instead.

Drew's #2273 shipped `mcpDashboardFilterSchema` + `mcpFiltersParam` with explicit id-management semantics in the description (telling the LLM exactly how to handle filter ids on create vs update). The pre-rebase version of this branch defined `mcpFiltersParam` as `z.array(externalDashboardFilterSchemaWithId.partial({ id: true }))` instead, which loses Drew's id description. Aligning back to main's design keeps the richer guidance the LLM benefits from and avoids overriding a schema another PR shipped upstream four days earlier. The wider `(ExternalDashboardFilter | ExternalDashboardFilterWithId)[]` input type on saveDashboard.ts still ships, so REST/MCP parity is unaffected. `externalDashboardFilterSchemaWithId` is no longer used in schemas.ts; the import is dropped.

Five em-dashes (U+2014) in the dashboard authoring prompt content surface to the LLM and to human reviewers. Per project voice rules, replace with colon / period equivalents that read the same to readers and to LLMs. Affected: - filter-id self-describing rule (two occurrences) - pattern-mining query_guide bullet list (where / sampleSize / bodyExpression)

Deep-review (P2 #1, #2264 comment on 2026-05-12): the rewritten mcpOnClickDashboardSchema.filters description dropped the warning that an onClick filter whose expression is not declared on the destination dashboard is dropped at click time. Restore one sentence on the onClick schema description and the matching wording in the dashboards prompt content. Also drop three em-dashes from mcpDashboardFilterSchema descriptions introduced by 9444c2c; project voice rules ban em-dashes in reviewer-facing prose.

alex-fedotyev · 2026-05-19T20:54:17Z

Yes, ready for review. Rebased on main (was 10 behind, conflicting) and pushed 3c7a316.

What I did in this pass:

Rebased on main. One real conflict at content.ts COMMON MISTAKES item 4 from the hyperdx_query to per-displayType split in refactor(mcp): split hyperdx_query into 5 display-type-specific tools #2294. Resolved by keeping the broader scope (number / pie / heatmap) and folding in the new hyperdx_table auto-upgrade note as a clarification line. CI re-running on the rebased branch.
Addressed the deep-review correctness finding on mcpOnClickDashboardSchema.filters (the description had dropped the silent-drop warning). Restored a sentence on the schema description and the matching content.ts:994 filters wording. Also dropped three em-dashes from mcpDashboardFilterSchema descriptions introduced by 9444c2c to keep the file voice-clean.

What's NOT in this pass, deliberately, since it's an unblock pass and not a full sweep:

assignFilterIds foreign-id handling: real API-contract decision (reject vs warn), deserves its own change.
Three test-only follow-ups: having round-trip persistence assertion, empty-string id branch coverage, filters empty-array vs omitted semantics.
Structural parity test between mcpDashboardFilterSchema and externalDashboardFilterSchemaWithId.
All P3 nits.

If you'd rather I roll the assignFilterIds fix into this PR before review, happy to; otherwise I'll track it as a follow-up so this one stays focused on the prompt-rewrite + dashboard filters surface.

(Edited: rewrote the deferred-items list to drop #N notation that GitHub was auto-linking to unrelated issues.)

brandon-pereira · 2026-05-19T21:30:15Z

      .optional()
      .describe(
-        'Display formatting for bucket values. Example: { output: "time", factor: 0.000000001 } ' +
+        'Display formatting for bucket values. Example: { output: "duration", factor: 0.000000001 } ' +


Wouldn't time be better here?

Kept output: "duration" intentionally. The heatmap chart at packages/app/src/components/DBHeatmapChart.tsx:1091 branches on numberFormat?.output === 'duration' and calls formatDurationMsCompact(value * factor * 1000), which renders bucket values as "1.2s", "45ms", "3µs" (trace-waterfall style). output: "time" falls through to numbro's time formatter which is clock-style HH:MM:SS, wrong for sub-second latencies. The design checklist in the same prompt (rule 8) also prescribes { output: "duration", factor: 0.000000001 } for heatmaps, so the schema example now lines up with that.

brandon-pereira · 2026-05-19T21:34:57Z

+      });
+    });
+
+    it('should reject a filter whose sourceId does not exist', async () => {


We probably already have a test for this, if not, it should be a generic sourceId validation test not specific to filters

Good call, dropped in b8115f9. The tile-level missing-source test at line 295 exercises the same getMissingSources(sources, tilesWithId, filters) codepath in saveDashboard.ts:217, so the filter-specific variant was duplicate coverage.

brandon-pereira · 2026-05-19T21:35:26Z

+      expect(text).toContain(missingSourceId);
+    });
+
+    it('should round-trip a dashboard with no filters (backward compat)', async () => {


No need for this - other tests should cover this. No backward compat needed yet

Dropped in b8115f9. Existing create/update tests in the same file save dashboards without a filters field and implicitly cover the empty-array default.

brandon-pereira

overall, lgtm. a couple nits

Two of the four filter tests Brandon flagged as redundant on #2264: - The filter-specific missing-source test duplicated the existing tile-level test at line 295. saveDashboard's getMissingSources(sources, tilesWithId, filters) walks both in one pass, so a tile-level miss exercises the same codepath. - The no-filters round-trip was framed as a backward-compat guard, but the create/update suite already saves dashboards without a filters field and implicitly covers the empty-array default.

github-actions Bot added the review/tier-3 Standard — full human review required label May 12, 2026

vercel Bot deployed to Preview May 12, 2026 22:32 View deployment

vercel Bot deployed to Preview May 12, 2026 22:41 View deployment

vercel Bot deployed to Preview May 12, 2026 23:36 View deployment

alex-fedotyev marked this pull request as draft May 14, 2026 00:19

alex-fedotyev force-pushed the alex/HDX-mcp-prompts-overhaul branch from cd37010 to be223fb Compare May 14, 2026 00:39

vercel Bot deployed to Preview May 14, 2026 00:43 View deployment

vercel Bot deployed to Preview May 14, 2026 02:04 View deployment

vercel Bot deployed to Preview May 14, 2026 03:02 View deployment

vercel Bot deployed to Preview May 14, 2026 16:27 View deployment

alex-fedotyev force-pushed the alex/HDX-mcp-prompts-overhaul branch from e409a29 to 5ed6ca9 Compare May 19, 2026 01:28

vercel Bot had a problem deploying to Preview May 19, 2026 01:30 Failure

alex-fedotyev force-pushed the alex/HDX-mcp-prompts-overhaul branch from 5ed6ca9 to 94bdfb2 Compare May 19, 2026 01:59

vercel Bot deployed to Preview May 19, 2026 02:04 View deployment

alex-fedotyev marked this pull request as ready for review May 19, 2026 02:14

github-actions Bot added review/tier-4 Critical — deep review + domain expert sign-off and removed review/tier-3 Standard — full human review required labels May 19, 2026

alex-fedotyev requested a review from brandon-pereira May 19, 2026 02:34

alex-fedotyev and others added 5 commits May 19, 2026 20:47

feat: Add dashboard table onClick to MCP schemas and prompts

1e2ed22

alex-fedotyev added 6 commits May 19, 2026 20:48

alex-fedotyev force-pushed the alex/HDX-mcp-prompts-overhaul branch from 94bdfb2 to 3c7a316 Compare May 19, 2026 20:53

vercel Bot deployed to Preview May 19, 2026 20:57 View deployment

brandon-pereira reviewed May 19, 2026

View reviewed changes

vercel Bot deployed to Preview May 20, 2026 00:16 View deployment

brandon-pereira approved these changes May 20, 2026

View reviewed changes

alex-fedotyev added the automerge label May 20, 2026

Merge branch 'main' into alex/HDX-mcp-prompts-overhaul

00d360d

vercel Bot deployed to Preview May 20, 2026 16:08 View deployment

kodiakhq Bot merged commit c3a8aa5 into main May 20, 2026
18 checks passed

kodiakhq Bot deleted the alex/HDX-mcp-prompts-overhaul branch May 20, 2026 16:15

Conversation

alex-fedotyev commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What lands

Co-authorship and rebase

Schema design note

Test plan

Uh oh!

changeset-bot Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

vercel Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔴 Tier 4 — Critical

Uh oh!

github-actions Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review

Uh oh!

github-actions Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

E2E Test Results

Uh oh!

github-actions Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deep Review

🟡 P2 — recommended

Uh oh!

teeohhem commented May 19, 2026

Uh oh!

alex-fedotyev commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

brandon-pereira May 19, 2026

Choose a reason for hiding this comment

Uh oh!

alex-fedotyev May 20, 2026

Choose a reason for hiding this comment

Uh oh!

brandon-pereira May 19, 2026

Choose a reason for hiding this comment

Uh oh!

alex-fedotyev May 20, 2026

Choose a reason for hiding this comment

Uh oh!

brandon-pereira May 19, 2026

Choose a reason for hiding this comment

Uh oh!

alex-fedotyev May 20, 2026

Choose a reason for hiding this comment

Uh oh!

brandon-pereira left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

alex-fedotyev commented May 12, 2026 •

edited

Loading

changeset-bot Bot commented May 12, 2026 •

edited

Loading

vercel Bot commented May 12, 2026 •

edited

Loading

github-actions Bot commented May 12, 2026 •

edited

Loading

github-actions Bot commented May 12, 2026 •

edited

Loading

github-actions Bot commented May 12, 2026 •

edited

Loading

github-actions Bot commented May 12, 2026 •

edited

Loading

alex-fedotyev commented May 19, 2026 •

edited

Loading