Skip to content

fix(logs): split summary/detail contracts to make trace tab gate type-safe#4430

Closed
waleedlatif1 wants to merge 48 commits intostagingfrom
waleedlatif1/logs-traces-loading-bug
Closed

fix(logs): split summary/detail contracts to make trace tab gate type-safe#4430
waleedlatif1 wants to merge 48 commits intostagingfrom
waleedlatif1/logs-traces-loading-bug

Conversation

@waleedlatif1
Copy link
Copy Markdown
Collaborator

Summary

The Trace tab was silently missing from the Log Details sidepanel because list and detail rows shared one WorkflowLog type with executionData: z.unknown(). The UI couldn't distinguish a summary row (no spans) from a detail row (with spans), so the tab gate read undefined and hid itself.

  • Splits into WorkflowLogSummary (list) and WorkflowLogDetail (typed executionData with optional traceSpans) — makes the bug structurally impossible.
  • Detail and new by-execution routes both write through to the same logKeys.detail(id) cache, eliminating two-key fragmentation that caused the merge memo workaround.
  • List route moves to cursor pagination on (sortValue, id) with proper NULLS LAST handling and SQL-side sort across workflow + job execution tables (replaces in-memory merge that dropped rows under filters).
  • Detail route now requires and asserts workspaceId. Deep-link path uses useLogByExecutionId instead of auto-paginating the entire workspace.

Test plan

  • Open Logs page, click any workflow execution → Trace tab appears with correct data
  • Navigate via arrow keys 10× rapidly → no flicker, no missing tabs
  • Right-click a row → preview snapshot → sidebar still shows correct selected log
  • Deep-link ?executionId=<id> → resolves in one round-trip, sidebar opens with detail (even for off-page logs)
  • Sort by cost → server returns sorted results across all pages; Load more works
  • Filter by status=error → page 2 contains correct rows
  • Cancel a running execution → one mutation, invalidates lists() + detail(id), no stats refetch
  • Cross-workspace test: deep-link to ?executionId=<id from other workspace> → 404

🤖 Generated with Claude Code

waleedlatif1 and others added 30 commits April 3, 2026 23:30
…ership workflow edits via sockets, ui improvements
…ration, signup method feature flags, SSO improvements
* feat(posthog): Add tracking on mothership abort (#4023)

Co-authored-by: Theodore Li <theo@sim.ai>

* fix(login): fix captcha headers for manual login  (#4025)

* fix(signup): fix turnstile key loading

* fix(login): fix captcha header passing

* Catch user already exists, remove login form captcha
…nts, secrets performance, polling refactors, drag resources in mothership
…endar triggers, docs updates, integrations/models pages improvements
…mat, logs performance improvements

fix(csp): add missing analytics domains, remove unsafe-eval, fix workspace CSP gap (#4179)
fix(landing): return 404 for invalid dynamic route slugs (#4182)
improvement(seo): optimize sitemaps, robots.txt, and core web vitals across sim and docs (#4170)
fix(gemini): support structured output with tools on Gemini 3 models (#4184)
feat(brightdata): add Bright Data integration with 8 tools (#4183)
fix(mothership): fix superagent credentials (#4185)
fix(logs): close sidebar when selected log disappears from filtered list; cleanup (#4186)
v0.6.46: mothership streaming fixes, brightdata integration
waleedlatif1 and others added 18 commits April 21, 2026 20:25
…rity hardening, contact page, 404 page, access control, SES, SNS
v0.6.54: migration error logs
…ze, subagent thinking, files sorting, agentphone integration
fix(db): revert statement_timeout startup options breaking pooled connections (#4284)
v0.6.57: mothership reliability, ashby refactor, tables row count, copilot id fix, bun upgrade
…rizations, mothership positional table row insertion, CI improvements, org-external users, file viewer improvements
v0.6.62: fix new copilot chat creation and selection on refresh
…ixes, db query optimizations, contract boundaries code hygiene, CORS, toast improvements, tables infinite query, executor robustness, reranker support
…-safe

The Trace tab was silently missing from the Log Details sidepanel because
list and detail rows shared one WorkflowLog type with executionData:
z.unknown(). The UI couldn't distinguish a summary row (no spans) from a
detail row (with spans), so the tab gate read undefined and hid itself.

Splits into WorkflowLogSummary (list) and WorkflowLogDetail (typed
executionData with optional traceSpans). Detail and by-execution routes
both write through to the same logKeys.detail(id) cache, eliminating the
two-key fragmentation that caused the merge memo workaround. List route
moves to cursor pagination on (sortValue, id) with proper NULLS LAST
handling and SQL-side sort across workflow + job execution tables.
Detail route now requires and asserts workspaceId. Deep-link path uses
useLogByExecutionId instead of auto-paginating the entire workspace.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented May 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
docs Building Building Preview, Comment May 4, 2026 6:01pm

Request Review

@gitguardian
Copy link
Copy Markdown

gitguardian Bot commented May 4, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
29606901 Triggered Generic High Entropy Secret a54dcbe apps/sim/providers/utils.test.ts View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@cursor
Copy link
Copy Markdown

cursor Bot commented May 4, 2026

PR Summary

Medium Risk
Medium risk because it changes the GET /api/logs response shape and pagination/sorting semantics (offset→cursor) plus requires workspaceId for detail fetches, impacting multiple UI entrypoints and caching behavior.

Overview
Fixes log detail vs list type ambiguity by splitting API contracts into WorkflowLogSummary (list) and WorkflowLogDetail (detail with typed executionData/traceSpans), and updating clients to fetch detail via getLogDetailContract with required workspaceId.

Adds GET /api/logs/by-execution/[executionId] plus a useLogByExecutionId hook to resolve deep-links in one request and write-through to the canonical logKeys.detail(id) cache; embedded log views now pass workspaceId when fetching details.

Reworks GET /api/logs to cursor pagination with server-side sorting (date/duration/cost/status) across workflow and job logs, returning { data, nextCursor } and removing the previous offset/totalPages response and client-side sorting/merging assumptions; Trace tab gating is adjusted to show the tab for likely executions and render an explicit “no trace data” empty state.

Reviewed by Cursor Bugbot for commit 551513b. Configure here.

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

Recreating with clean commit history off staging.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 4, 2026

Greptile Summary

This PR fixes the missing Trace tab in the Log Details sidebar by splitting the shared WorkflowLog type into a typed WorkflowLogSummary (list) and WorkflowLogDetail (with executionData/traceSpans), and migrates the list endpoint from offset pagination to cursor-based pagination with SQL-side sorting. Two P1 regressions need attention before merge:

  • Cross-table cursor tie-breaking bug: The cursor id from the last merged page item (a UUID from one table) is applied as a tie-breaker in the other table's SQL query. For status sort — where many rows share the same value — this systematically drops rows at every page boundary.
  • Stats not invalidated after cancel/retry: logKeys.stats() was removed from useCancelExecution and useRetryExecution's onSettled handlers; with the new 30-second staleTime on useDashboardStats, the dashboard counts stay stale for up to 30 seconds after these user actions when live-mode is off.

Confidence Score: 3/5

Not safe to merge as-is — two P1 regressions in pagination correctness and cache invalidation

Two independent P1 bugs: cursor tie-breaking causes dropped rows with status/cost/duration sorts across page boundaries, and stats invalidation removal causes visible stale data after cancel/retry with live-mode off. The core Trace tab fix and contract split are solid.

apps/sim/app/api/logs/route.ts (cursor tie-breaking logic), apps/sim/hooks/queries/logs.ts (missing stats invalidation in mutation onSettled)

Important Files Changed

Filename Overview
apps/sim/lib/api/contracts/logs.ts Core contract split into WorkflowLogSummary (list) and WorkflowLogDetail (with typed executionData/traceSpans); cursor-based pagination replaces page/offset; new getLogByExecutionIdContract added. Schema definitions are thorough and correct.
apps/sim/app/api/logs/route.ts New cursor pagination with SQL-side sort; in-memory merge for cross-table sort still present. Contains a P1 bug: cursor ID tie-breaking uses a UUID from one table as tie-breaker in the other table's query, causing row omissions at page boundaries when sort values collide (especially for status sort).
apps/sim/app/api/logs/[id]/route.ts Now requires workspaceId query param and asserts it in the WHERE clause; adds pausedExecutions join for pause summary. Logic is clean and correct.
apps/sim/app/api/logs/by-execution/[executionId]/route.ts New route for deep-link resolution; nearly identical to [id]/route.ts with executionId lookup. Code is correct and includes proper workspace-scoped permission checks.
apps/sim/hooks/queries/logs.ts Migrated to cursor-based infinite query; useLogDetail now requires workspaceId; useLogByExecutionId writes through to detail cache. Contains a P1 regression: stats cache is no longer invalidated after cancel/retry mutations.
apps/sim/app/workspace/[workspaceId]/logs/logs.tsx Deep-link handling replaced with useLogByExecutionId; selectedLog/previewLog split into separate queries; client-side sort memo removed. Minor P2: missing isInitialized guard causes a double-fetch on initial load.
apps/sim/app/workspace/[workspaceId]/logs/components/log-details/log-details.tsx Core bug fix: showTraceTab now gates on isLikelyExecution instead of requiring traceSpans to be present. Falls back to 'No trace data available' message when detail hasn't loaded yet.
apps/sim/app/workspace/[workspaceId]/home/components/mothership-view/components/resource-content/resource-content.tsx EmbeddedLog now passes workspaceId to useLogDetail, fixing the missing required parameter introduced by this PR.
apps/sim/app/workspace/[workspaceId]/w/[workflowId]/components/panel/components/copilot/components/user-input/hooks/use-mention-data.ts Removed details: 'full' from mention data fetch; now fetches summary-only, consistent with the new API contract.

Sequence Diagram

sequenceDiagram
    participant UI as Logs Page
    participant QC as React Query Cache
    participant ListAPI as GET /api/logs
    participant DetailAPI as GET /api/logs by id
    participant ByExecAPI as GET /api/logs by executionId

    UI->>ListAPI: workspaceId + filters + cursor + sortBy
    ListAPI-->>UI: data WorkflowLogSummary array + nextCursor
    Note over UI,QC: cache stored under logKeys.list

    UI->>UI: User clicks a row
    UI->>DetailAPI: logId + workspaceId
    DetailAPI-->>QC: WorkflowLogDetail with traceSpans
    Note over UI,QC: cache stored under logKeys.detail

    UI->>UI: deep-link executionId param on mount
    UI->>ByExecAPI: executionId + workspaceId
    ByExecAPI-->>QC: WorkflowLogDetail
    Note over UI,QC: also writes to logKeys.detail for that log
    QC-->>UI: dispatch TOGGLE_LOG with resolved logId
Loading

Comments Outside Diff (2)

  1. apps/sim/hooks/queries/logs.ts, line 341-347 (link)

    P1 Dashboard stats go stale after cancel/retry when live-mode is off

    logKeys.stats() invalidation was removed from both useCancelExecution and useRetryExecution. The useDashboardStats hook now has a 30-second staleTime, so after a user cancels or retries a run with live-mode disabled, the stats panel (execution counts, cost totals, etc.) won't refresh for up to 30 seconds. This is a regression: users will see stale aggregate counts immediately after their action completes.

  2. apps/sim/app/workspace/[workspaceId]/logs/logs.tsx, line 1748-1752 (link)

    P2 Extra network round-trip on initial page load

    Removing the isInitialized guard from the enabled condition means useLogsList fires immediately on mount with default filters, then a second time after initializeFromURL() runs in useEffect([], []) and updates state with URL-derived filters. The old guard prevented this by deferring the first fetch until after URL initialization. With 30-second staleTime added to the list query, this double-fetch wastes a network request on every cold page load.

Reviews (1): Last reviewed commit: "fix(logs): split summary/detail contract..." | Re-trigger Greptile

Comment on lines +105 to +113
const buildCursorCondition = (sortExpr: unknown, idCol: unknown): SQL | undefined => {
if (!cursor) return undefined
const v = cursor.v
const id = cursor.id
const cmp = sortOrder === 'asc' ? sql`>` : sql`<`
if (v === null) {
return sql`(${sortExpr} IS NULL AND ${idCol} ${cmp} ${id})`
}
return sql`((${sortExpr} IS NOT NULL AND ${sortExpr} ${cmp} ${v}) OR (${sortExpr} = ${v} AND ${idCol} ${cmp} ${id}) OR ${sortExpr} IS NULL)`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Cross-table cursor ID tie-breaking breaks pagination

The cursor's id field is derived from the last item of the merged page and then used as a tie-breaker in both the workflow and job table queries. When two rows from different tables share the same sort value, the cursor id (a UUID from one table) is compared against IDs from the other table using lexicographic order. For sorts like status where many rows share the same value, this causes systematic row omissions or duplicates across page boundaries — the next page for job logs will skip all job rows whose UUIDs are lexicographically ≥ the cursor ID (which came from a workflow log), even though they should appear on that page.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 551513b. Configure here.

if (cmp !== 0) return sortOrder === 'asc' ? cmp : -cmp
const idCmp = a.id.localeCompare(b.id)
return sortOrder === 'asc' ? idCmp : -idCmp
})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In-memory null sort contradicts SQL NULLS LAST ordering

Medium Severity

The compareSortValues function pushes nulls to the end (returns 1 when a is null), but the merge sort negates the comparison for descending order (-cmp), which moves nulls to the beginning. The SQL queries use DESC NULLS LAST, which keeps nulls at the end. This mismatch means the merged workflow+job result has incorrect ordering when sorting descending by columns with nullable values (e.g., duration, cost). The cursor derived from this misordered page won't align with the SQL cursor condition, causing rows to be duplicated or skipped across pages.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 551513b. Configure here.

@waleedlatif1 waleedlatif1 deleted the waleedlatif1/logs-traces-loading-bug branch May 4, 2026 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants