@vmlinuzx vmlinuzx commented Jan 16, 2026

Summary

Replace hardcoded 30-minute task timeout with configurable three-phase timing.

Problem

The current TASK_TTL_MS = 30 * 60 * 1000 (30 minutes) is hardcoded and excessive:

  • Most tasks complete in under 8 minutes
  • If a task takes 30 minutes, the task scoping is wrong
  • Orchestrator sits waiting on timed-out tasks for way too long
  • No way to configure fast-fail for quick tasks

Solution

Three-phase configurable timing:

Setting           Default  Purpose
initial_wait_ms   5 min    Wait before first poll (let task spin up)
poll_interval_ms  60 sec   Polling frequency after initial wait
timeout_ms        15 min   Hard timeout - task killed after this

Configuration Levels

1. Global defaults (oh-my-opencode.json):

{
  "background_task": {
    "default_initial_wait_ms": 300000,
    "default_poll_interval_ms": 60000,
    "default_timeout_ms": 900000
  }
}

2. Per-category (quick tasks fail fast):

{
  "categories": {
    "quick": {
      "model": "google/gemini-3-flash",
      "initial_wait_ms": 60000,
      "poll_interval_ms": 30000,
      "timeout_ms": 180000
    }
  }
}

3. Per-task when launching via sisyphus_task.
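
The three levels above could be merged along these lines. This is a minimal sketch, not the code in this PR: `TaskTiming`, `BUILTIN_DEFAULTS`, and `resolveTiming` are hypothetical names, the precedence (per-task over per-category over global) is an assumption the description implies but does not state, and the global `default_*` keys are assumed to have been mapped to the shorter field names before the call.

```typescript
// Hypothetical shape; field names follow the PR description.
interface TaskTiming {
  initial_wait_ms: number
  poll_interval_ms: number
  timeout_ms: number
}

// Defaults taken from the table in the description (5 min, 60 sec, 15 min).
const BUILTIN_DEFAULTS: TaskTiming = {
  initial_wait_ms: 5 * 60 * 1000,
  poll_interval_ms: 60 * 1000,
  timeout_ms: 15 * 60 * 1000,
}

// Assumed precedence: per-task > per-category > global > built-in default.
function resolveTiming(
  globalCfg: Partial<TaskTiming> = {},
  categoryCfg: Partial<TaskTiming> = {},
  taskCfg: Partial<TaskTiming> = {},
): TaskTiming {
  const pick = (key: keyof TaskTiming): number =>
    taskCfg[key] ?? categoryCfg[key] ?? globalCfg[key] ?? BUILTIN_DEFAULTS[key]
  return {
    initial_wait_ms: pick("initial_wait_ms"),
    poll_interval_ms: pick("poll_interval_ms"),
    timeout_ms: pick("timeout_ms"),
  }
}

// Example: the "quick" category from above, with no per-task override.
const quickTiming = resolveTiming(
  {},
  { initial_wait_ms: 60000, poll_interval_ms: 30000, timeout_ms: 180000 },
)
```

Resolving once at launch and storing the result on the task would keep later config edits from changing the timing of tasks that are already running.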

Changes

  • src/config/schema.ts: Added timing fields to CategoryConfigSchema and BackgroundTaskConfigSchema
  • src/features/background-agent/types.ts: Added timing fields to BackgroundTask and LaunchInput
  • src/features/background-agent/manager.ts:
    • Store default timing from config
    • Use per-task timing in pruning
    • Respect initial_wait before polling
    • Use minimum poll interval from running tasks
    • Timeout triggers notification to parent (not silent prune)
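
For illustration, the per-task checks listed for manager.ts might look roughly like the following. This is a sketch under assumptions: `TimedTask` and the three helper names are hypothetical, and timestamps are plain epoch milliseconds rather than the `Date` objects the manager appears to use.

```typescript
// Hypothetical task shape carrying its own resolved timing.
interface TimedTask {
  id: string
  startedAt: number // epoch milliseconds
  initial_wait_ms: number
  poll_interval_ms: number
  timeout_ms: number
}

// A task is only polled once its initial wait has elapsed.
function isPastInitialWait(task: TimedTask, now: number): boolean {
  return now - task.startedAt >= task.initial_wait_ms
}

// Timeout is measured per task, from that task's own start time.
function isTimedOut(task: TimedTask, now: number): boolean {
  return now - task.startedAt >= task.timeout_ms
}

// The shared timer runs at the smallest interval any running task asks for.
function minPollInterval(tasks: TimedTask[], fallbackMs: number): number {
  if (tasks.length === 0) return fallbackMs
  return Math.min(...tasks.map(t => t.poll_interval_ms))
}
```

Because one timer serves all tasks, a task with a longer `poll_interval_ms` would be polled more often than configured unless the tick also checks each task's last poll time.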

Testing

  • All 46 background-agent tests pass
  • Type check passes
  • Build succeeds

Breaking Changes

None - all fields are optional with backward-compatible defaults.


Summary by cubic

Make background-agent task timing configurable. Adds initial wait, poll interval, and timeout controls to help quick tasks fail fast and reduce noisy polling.

  • New Features
    • Adds initial_wait_ms, poll_interval_ms, timeout_ms with defaults: 5m, 60s, 15m.
    • Configurable at three levels: global defaults, per-category, and per-task at launch.
    • Respects initial wait before polling; polling uses the smallest interval across running tasks.
    • Per-task timeouts replace the old global TTL and now notify the parent on timeout.
    • Backward compatible; defaults apply when fields are not set.

Written for commit b285330. Summary will update on new commits.

…interval, timeout)

Replace hardcoded 30-minute task timeout with configurable three-phase timing:

- initial_wait_ms: Wait before first poll (default: 5 min)
- poll_interval_ms: Polling frequency after initial wait (default: 60 sec)
- timeout_ms: Hard timeout, task killed after this (default: 15 min)

Configurable at three levels:
1. Global defaults via background_task config
2. Per-category overrides in categories config
3. Per-task when launching via sisyphus_task

The 30-minute hardcoded timeout was excessive - most tasks complete in
under 8 minutes, and if they don't, the task scoping is wrong. This
allows quick tasks to fail fast while giving complex tasks appropriate
time.

Also improves timeout behavior:
- Timeout now triggers notification to parent (not silent prune)
- Per-task timeout instead of global constant
- Respects initial wait before polling (reduces API spam)
@github-actions
Contributor

Thank you for your contribution! Before we can merge this PR, we need you to sign our Contributor License Agreement (CLA).

To sign the CLA, please comment on this PR with:

I have read the CLA Document and I hereby sign the CLA

This is a one-time requirement. Once signed, all your future contributions will be automatically accepted.


I have read the CLA Document and I hereby sign the CLA


DC does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b285330fb6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 762 to +766
if (task.concurrencyKey) {
  this.concurrencyManager.release(task.concurrencyKey)
}
this.clearNotificationsForTask(taskId)
this.markForNotification(task)
this.notifyParentSession(task).catch(err => {

P2: Avoid double-releasing concurrency on timeout

When a task times out, pruneStaleTasksAndNotifications releases the concurrencyKey and then calls notifyParentSession, which also releases and clears the same key. If there are queued tasks for that key, the second release will dequeue another waiter, effectively granting two concurrency slots for one timed-out task and violating the configured limits. Consider letting notifyParentSession handle the release (or clearing the key before calling it) to keep counts consistent.

Useful? React with 👍 / 👎.
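
One way to make the release idempotent, in the spirit of "clearing the key before calling it", is sketched below. `TaskLike` and `releaseOnce` are hypothetical names, and the `release` callback stands in for whatever the concurrency manager exposes. This is not the fix adopted in the PR.

```typescript
// Hypothetical minimal task shape; only the concurrency key matters here.
interface TaskLike {
  id: string
  concurrencyKey?: string | undefined
}

// Releases the task's key at most once: the key is cleared before the
// release call, so any later code path sees no key and does nothing.
function releaseOnce(task: TaskLike, release: (key: string) => void): boolean {
  const key = task.concurrencyKey
  if (key === undefined) return false
  task.concurrencyKey = undefined
  release(key)
  return true
}
```

If both the pruning path and the notification path went through a helper like this, whichever ran second would be a no-op, so a timed-out task could free only one slot.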

Comment on lines 568 to 572
private startPolling(): void {
  if (this.pollingInterval) return

  const minPollInterval = this.getMinPollInterval()
  this.pollingInterval = setInterval(() => {

P2: Recompute poll interval when new tasks start

startPolling returns immediately if a polling interval already exists, so getMinPollInterval() only runs on the first call. If a later task specifies a smaller poll_interval_ms, the timer keeps the earlier (larger) interval and that task is polled more slowly than configured. This breaks the new per-task timing behavior for “fast” categories unless the interval is recalculated/restarted when tasks are added.

Useful? React with 👍 / 👎.
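
A possible shape for the recalculation is to restart the timer whenever the desired interval changes. The sketch below is illustrative only: `PollScheduler`, `TimerApi`, and `realTimers` are hypothetical names, and the timer functions are injectable so the behaviour can be checked without real timers.

```typescript
// Injectable timer functions (hypothetical) so the scheduler can be tested.
interface TimerApi {
  set(fn: () => void, ms: number): unknown
  clear(handle: unknown): void
}

const realTimers: TimerApi = {
  set: (fn, ms) => setInterval(fn, ms),
  clear: handle => clearInterval(handle as ReturnType<typeof setInterval>),
}

class PollScheduler {
  private readonly tick: () => void
  private readonly timers: TimerApi
  private handle: unknown = undefined
  private running = false
  private currentMs: number | undefined = undefined

  constructor(tick: () => void, timers: TimerApi = realTimers) {
    this.tick = tick
    this.timers = timers
  }

  // Call whenever a task is added or removed, passing the new minimum interval.
  ensure(desiredMs: number): void {
    if (this.running && this.currentMs === desiredMs) return
    this.stop()
    this.handle = this.timers.set(this.tick, desiredMs)
    this.running = true
    this.currentMs = desiredMs
  }

  stop(): void {
    if (!this.running) return
    this.timers.clear(this.handle)
    this.handle = undefined
    this.running = false
    this.currentMs = undefined
  }

  get intervalMs(): number | undefined {
    return this.currentMs
  }
}
```

Calling `ensure` with the current minimum after every launch and every completion would let the interval shrink when a fast task starts and grow again once it finishes.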

@cubic-dev-ai cubic-dev-ai bot left a comment

4 issues found across 4 files

Confidence score: 2/5

  • Timed-out tasks in src/features/background-agent/manager.ts are removed immediately after notification, so background_output cannot report them and users lose visibility into timed-out work—clear regression risk.
  • Fatal errors are all surfaced as “Agent not found,” masking real failures in src/features/background-agent/manager.ts and making diagnosis difficult.
  • Polling interval never recalculates after start in src/features/background-agent/manager.ts, so newly queued fast-poll tasks lag behind their intended cadence.
  • Pay close attention to src/features/background-agent/manager.ts - several timeout, error-reporting, and polling bugs threaten correctness and concurrency guarantees.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/features/background-agent/manager.ts">

<violation number="1" location="src/features/background-agent/manager.ts:213">
P2: Fatal errors are always reported as “Agent not found,” masking unrelated failures</violation>

<violation number="2" location="src/features/background-agent/manager.ts:571">
P2: Polling interval is never recomputed after start, so newly added tasks with shorter poll_interval_ms are polled too slowly until polling stops.</violation>

<violation number="3" location="src/features/background-agent/manager.ts:766">
P2: Concurrency key is released twice on timeout, which can undercount and violate the configured concurrency limit.</violation>

<violation number="4" location="src/features/background-agent/manager.ts:766">
P1: Timed-out tasks are deleted immediately after triggering notification, so background_output cannot retrieve them and the completion summary omits the timed-out task.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

}
this.clearNotificationsForTask(taskId)
this.markForNotification(task)
this.notifyParentSession(task).catch(err => {

P1: Timed-out tasks are deleted immediately after triggering notification, so background_output cannot retrieve them and the completion summary omits the timed-out task.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/background-agent/manager.ts, line 766:

<comment>Timed-out tasks are deleted immediately after triggering notification, so background_output cannot retrieve them and the completion summary omits the timed-out task.</comment>

<file context>
@@ -713,15 +752,20 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
         }
-        this.clearNotificationsForTask(taskId)
+        this.markForNotification(task)
+        this.notifyParentSession(task).catch(err => {
+          log("[background-agent] Failed to notify on timeout:", err)
+        })
</file context>
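
One approach to the retrieval problem is to mark a timed-out task as finished and delete it only after a retention window. The sketch below assumes a hypothetical `StoredTask` shape, a `"timeout"` status, and a `RETENTION_MS` constant, none of which come from the PR.

```typescript
type TaskStatus = "running" | "completed" | "timeout"

// Hypothetical stored-task shape with epoch-millisecond timestamps.
interface StoredTask {
  id: string
  status: TaskStatus
  startedAt: number
  timeout_ms: number
  completedAt?: number | undefined
}

// Hypothetical retention window; the PR does not define one.
const RETENTION_MS = 10 * 60 * 1000

// Marks overdue running tasks as timed out (keeping them retrievable) and
// deletes finished tasks only after the retention window has passed.
// Returns the ids that timed out in this pass so the caller can notify.
function sweep(tasks: Map<string, StoredTask>, now: number): string[] {
  const newlyTimedOut: string[] = []
  for (const [id, task] of tasks) {
    if (task.status === "running" && now - task.startedAt >= task.timeout_ms) {
      task.status = "timeout"
      task.completedAt = now
      newlyTimedOut.push(id)
    } else if (task.completedAt !== undefined && now - task.completedAt >= RETENTION_MS) {
      tasks.delete(id) // deleting the current entry during Map iteration is safe
    }
  }
  return newlyTimedOut
}
```

With this split, a tool such as background_output could still find the task and report its timeout status between the notification and the eventual prune.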

} else {
existingTask.error = errorMessage
}
existingTask.error = `Agent "${input.agent}" not found. Make sure the agent is registered in your opencode.json or provided by a plugin.`

P2: Fatal errors are always reported as “Agent not found,” masking unrelated failures

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/background-agent/manager.ts, line 213:

<comment>Fatal errors are always reported as “Agent not found,” masking unrelated failures</comment>

<file context>
@@ -173,16 +184,33 @@ export class BackgroundManager {
-        } else {
-          existingTask.error = errorMessage
-        }
+        existingTask.error = `Agent "${input.agent}" not found. Make sure the agent is registered in your opencode.json or provided by a plugin.`
         existingTask.completedAt = new Date()
         if (existingTask.concurrencyKey) {
</file context>
Suggested change
- existingTask.error = `Agent "${input.agent}" not found. Make sure the agent is registered in your opencode.json or provided by a plugin.`
+ const fatalErrorMessage = (errorMessage.includes("agent.name") || errorMessage.includes("not registered"))
+   ? `Agent "${input.agent}" not found. Make sure the agent is registered in your opencode.json or provided by a plugin.`
+   : errorMessage
+ existingTask.error = fatalErrorMessage

private startPolling(): void {
if (this.pollingInterval) return

const minPollInterval = this.getMinPollInterval()

P2: Polling interval is never recomputed after start, so newly added tasks with shorter poll_interval_ms are polled too slowly until polling stops.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/background-agent/manager.ts, line 571:

<comment>Polling interval is never recomputed after start, so newly added tasks with shorter poll_interval_ms are polled too slowly until polling stops.</comment>

<file context>
@@ -540,12 +568,23 @@ export class BackgroundManager {
   private startPolling(): void {
     if (this.pollingInterval) return
 
+    const minPollInterval = this.getMinPollInterval()
     this.pollingInterval = setInterval(() => {
       this.pollRunningTasks()
</file context>

}
this.clearNotificationsForTask(taskId)
this.markForNotification(task)
this.notifyParentSession(task).catch(err => {

P2: Concurrency key is released twice on timeout, which can undercount and violate the configured concurrency limit.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/background-agent/manager.ts, line 766:

<comment>Concurrency key is released twice on timeout, which can undercount and violate the configured concurrency limit.</comment>

<file context>
@@ -713,15 +752,20 @@ Use \`background_output(task_id="${task.id}")\` to retrieve this result when rea
         }
-        this.clearNotificationsForTask(taskId)
+        this.markForNotification(task)
+        this.notifyParentSession(task).catch(err => {
+          log("[background-agent] Failed to notify on timeout:", err)
+        })
</file context>
Suggested change
- this.notifyParentSession(task).catch(err => {
+ if (task.concurrencyKey) {
+   task.concurrencyKey = undefined
+ }
+ this.notifyParentSession(task).catch(err => {

@GollyJer
Collaborator

Thanks for the PR, @vmlinuzx. I like the idea, but I feel this adds more configuration complexity than necessary.
I would like to get others' opinions.

Please work through the cubic issues if you want this to have a better chance of being merged. Thanks!
