fix: kill OpenHands background daemons after exit to prevent Daytona session hang
OpenHands spawns persistent background processes (tmux, jupyter kernel
gateway, ipykernel, action execution server) that outlive the main
process. These orphans prevent Daytona's session-command from reporting
an exit code, causing Harbor's `_poll_response` loop to hang
indefinitely. The sandbox is left orphaned in `started` state with no
result collection, verifier run, or teardown.
Claude Code runs are unaffected because they don't spawn persistent
background services.
The fix wraps the upstream OpenHands command in a group and appends
pkill cleanup of known daemon processes, preserving the original exit
code so Harbor proceeds normally through verification and finalization.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
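The wrapping described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual `_CLEANUP_SUFFIX`: the function names, the subshell form of the "group", and the `pkill -f` patterns are all assumptions made for the example. The bracket rewrite (`tmux` becomes `[t]mux`) is an added safeguard so that `pkill -f` does not match the wrapping shell, whose own command line contains the pattern text.

```python
import shlex
import subprocess

# Hypothetical patterns for illustration only; the real cleanup suffix may
# target different process names.
DAEMON_PATTERNS = [
    "tmux",
    "jupyter-kernelgateway",
    "ipykernel",
    "action_execution_server",
]


def _self_safe(pattern: str) -> str:
    """Bracket the first character so the regex still matches the daemon but
    the literal pattern text in the shell's command line does not match itself.
    Assumes the pattern starts with a plain (non-regex) character."""
    return f"[{pattern[0]}]{pattern[1:]}"


def wrap_with_cleanup(command: str, patterns: list[str]) -> str:
    """Return a shell string that runs `command` in a subshell, kills leftover
    daemons, and then exits with the original command's exit code."""
    kills = "".join(
        # `|| true` keeps a failed or missing pkill from changing the outcome.
        f"pkill -f {shlex.quote(_self_safe(p))} 2>/dev/null || true; "
        for p in patterns
    )
    return f"( {command} ); rc=$?; {kills}exit $rc"


if __name__ == "__main__":
    # Print the wrapped command rather than running pkill on a real machine.
    print(wrap_with_cleanup("run-openhands --task demo", DAEMON_PATTERNS))
```

Saving `$?` before the cleanup runs is what keeps the exit code intact: without it, the session would report the status of the last `pkill` instead of the agent's.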
docs/ops/TROUBLESHOOTING.md: 11 additions & 0 deletions
@@ -24,3 +24,14 @@
 3. Check whether failure matches known fingerprint.
 4. Classify as infra / verifier / task / agent behavior.
 5. Choose isolated rerun or fix path.
+
+## Daytona / OpenHands Notes
+
+- Do not classify a trial as a Daytona image-build stall from `trial.log` alone. Some orphaned or crashed trials leave `trial.log` at `Building environment from ...` even after agent setup succeeded.
+- Before calling it a remote build issue, check for:
+  - `agent/setup/return-code.txt`
+  - `agent/instruction.txt`
+  - `agent/command-0/command.txt`
+- If those files exist, the environment build already progressed past Docker build and the failure is later in launcher orchestration or agent startup handoff.
+- For MCP harness triage, inspect `agent/instruction.txt` first and confirm it names the expected `github.com/sg-evals/...` mirror. A generic repo target such as `github.com/the codebase` indicates prompt wiring drift, not task difficulty.
+- **OpenHands orphaned sandbox / hung harness**: OpenHands spawns persistent background daemons (tmux, jupyter kernel gateway, ipykernel, action execution server) that outlive the main process. These orphans prevent Daytona's session-command from reporting an exit code, causing Harbor's `_poll_response` loop to hang indefinitely. The `OpenHandsHarnessAgent` in `agents/harnesses/openhands/agent.py` includes a `_CLEANUP_SUFFIX` that kills these daemons after the main pipeline exits. If you see a sandbox stuck in `started` state with no harness process running, this is the likely cause. Claude Code runs are unaffected because they don't spawn persistent background services.
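The file checks in the notes above lend themselves to a small script. The following is a minimal sketch, assuming a local trial directory that contains the `agent/...` paths named in the notes; the function names and the default `github.com/sg-evals/` prefix argument are hypothetical helpers for illustration, not part of the repository.

```python
from pathlib import Path

# Files whose presence indicates agent setup got past the environment build.
SETUP_MARKERS = (
    "agent/setup/return-code.txt",
    "agent/instruction.txt",
    "agent/command-0/command.txt",
)


def missing_markers(trial_dir) -> list[str]:
    """Return the marker files that are absent from the trial directory."""
    root = Path(trial_dir)
    return [m for m in SETUP_MARKERS if not (root / m).is_file()]


def build_progressed(trial_dir) -> bool:
    """True when every marker exists, i.e. the failure is later than the
    Docker build even if trial.log still shows the build step."""
    return not missing_markers(trial_dir)


def instruction_names_mirror(trial_dir, prefix: str = "github.com/sg-evals/") -> bool:
    """True when agent/instruction.txt exists and mentions the mirror prefix."""
    path = Path(trial_dir) / "agent/instruction.txt"
    return path.is_file() and prefix in path.read_text(errors="replace")
```

If `build_progressed` returns `True` while `trial.log` still ends at the build step, the notes above point to launcher orchestration or agent startup handoff rather than a remote build stall.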