Pr/sandbox streaming tools#584
Conversation
…ow up to 1h - Remove hardcoded 60s timeout cap in _execute_code, _execute_code_legacy, and subprocess_backend - Clamp timeout by sandbox_config.max_timeout (configurable up to 3600s) - Update tool_seeder config_schema max values from 300 to 3600 - Update SandboxConfig Pydantic validation to allow up to 3600s - Remove 'no network access commands' from execute_code tool description - Guide agent to pip install missing packages instead of limiting to stdlib
… (default 60, max 3600)
… times out - Modified local subprocess sandbox to capture and decode stdout/stderr after proc.kill() - Legacy fallback also modified to append output before returning timeout error - This provides agents with visibility into what ran successfully before the 60s/3600s cap
- Added explicit instructions to the timeout error message telling the agent it can retry with a higher 'timeout' parameter up to 3600s. - This empowers the agent to proactively recover from timeouts for long-running scripts instead of halting.
- asyncio.create_subprocess_exec with proc.communicate() loses output when cancelled by asyncio.wait_for() - Switched to manual async reading of proc.stdout/stderr streams to ensure partial output is retained after proc.kill()
- Add on_output callback chain: WebSocket → caller → execute_tool → subprocess - Each stdout/stderr chunk is pushed as agentbay_live event (env: code) - Frontend concatenates streaming chunks, auto-opens live panel - Add Clear button to code output panel - Remove redundant code live_preview from tool_call done events
…user bubbles AgentDetail.tsx had no handler for 'agentbay_live' WebSocket events, causing real-time code execution output to fall through to the catch-all else branch, which created spurious user message bubbles. Now properly handles code/desktop/browser streaming events with live panel auto-focus, matching the Chat.tsx behavior.
- Track explicitly disabled tools (AgentTool.enabled=False) to prevent _always_tools bypass from re-adding them (core/feishu/channel tools) - Add agentbay tools to SYNC_IS_DEFAULT_TOOL_NAMES so seeder corrects stale is_default=True from older deployments - Add diagnostic logging: final tool list, assignment count, disabled count, and default_fallback count for debugging tool visibility
When an agent has any AgentTool assignments (tool panel has been configured), only include tools with an explicit AgentTool(enabled=True) record. Tools without any AgentTool record are no longer auto-included via is_default fallback — they are only provided by _always_tools if they are core system tools. For brand-new agents with zero assignments, the old is_default behavior is preserved so they get a reasonable starting tool set. This fixes the issue where 32+ tools were being sent to the LLM despite the user having configured the tool panel, because those tools had no AgentTool records and fell through to is_default=True.
When a configured agent (has existing AgentTool assignments) loads its tool panel, automatically create AgentTool records for any visible tool that doesn't have one yet. Uses is_default as the initial enabled value. This ensures the UI state and get_agent_tools_for_llm stay in sync: both now rely on explicit AgentTool records instead of the implicit is_default fallback. Without this, tools like send_message_to_agent show as enabled in the UI but are skipped by the LLM tool loader.
flush() only sends SQL to the DB but doesn't persist the transaction. The FastAPI get_db session doesn't auto-commit, so backfilled records were being rolled back when the session closed.
Quick models (GPT-4o-mini etc.) often output plain text instead of calling finish() in A2A consult mode. The existing A2A injection said 'Reply concisely' without mentioning finish(), causing the model to skip it. Added explicit finish() mandate with rejection warning right in the A2A injection block.
Completed focus items were reinforcing stale workflow patterns (e.g. old send_message_to_agent-only flow) over updated soul.md instructions. Agents can still query focus via list_focus_items tool on demand.
When send_file_to_agent copies a file to the target's inbox, it now also inserts a ChatMessage into the A2A session. This means when send_message_to_agent is called next, the target agent's conversation history already contains the file delivery info with the exact path to read_file, solving the problem where target agents had no awareness that a file was delivered.
source_agent.creator_id was accessed after the ORM session closed, causing the file delivery message injection to silently fail. Now extracted as source_creator_id local variable inside the active session.
…y injection ChatMessage is in app.models.audit, ChatSession is in app.models.chat_session. The previous import from app.models.chat caused ModuleNotFoundError which was silently caught, preventing file delivery messages from being injected.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f91571e426
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| # Configured agent: require explicit AgentTool record | ||
| if at is None: | ||
| # No assignment → not included (unless _always_tools adds it) | ||
| default_included_names.append(t.name) | ||
| continue |
There was a problem hiding this comment.
Preserve default tools when assignments are partial
This logic treats len(assignments) > 0 as “fully configured” and drops every tool without an AgentTool row. In practice assignments are often partial (the UI toggles one tool at a time via PUT /tools/agents/{id}), and the /with-config loader path does not backfill missing rows, so after a single toggle many previously available default tools disappear from the LLM tool payload unexpectedly. That changes agent behavior immediately and can break existing workflows that rely on still-default tools.
Useful? React with 👍 / 👎.
| except asyncio.TimeoutError: | ||
| proc.kill() | ||
| await proc.communicate() | ||
| is_timeout = True | ||
|
|
||
| await asyncio.gather(task1, task2) |
There was a problem hiding this comment.
Reap timed-out subprocesses after kill
On timeout this path calls proc.kill() and then returns without awaiting process termination (wait/communicate). Under repeated timeouts, unreaped children can accumulate as zombies and degrade long-running workers. The previous implementation awaited communicate() after kill; the streaming version still needs an explicit reap step before returning timeout results.
Useful? React with 👍 / 👎.
当前分支 (pr/sandbox-streaming-tools) 的修改非常纯粹,核心是从之前混合了多种功能的大分支中抽离出的 3 个独立特性模块。根据 commit 记录,这 16 个 commit 总结如下:
实时流式输出:在前端代码面板 (AgentBayLivePanel.tsx 和 Chat.tsx) 中实现了 execute_code 执行过程的实时 stdout/stderr 渲染。
动态超时配置:移除了沙盒硬编码的 60 秒超时限制,改为从配置读取 (config.py),支持最高 1 小时 (3600s) 的运行时间,并同步更新了工具的描述提示。
超时截断保护:即使代码执行超时,现在也能正确捕获并返回超时前已输出的截断 stdout/stderr,并在报错信息中增加了重试提示 (retry hint)。
屏蔽已禁用工具:在向大模型发送 payload 时,严格尊重用户在界面上禁用的工具状态 (respect user-disabled tools in LLM payload)。
自动回填记录:当用户打开工具面板时,自动回填缺失的 AgentTool 记录,并将数据库写入操作从 flush() 改为更安全的 commit()。
过滤冗余工具:对于已配置的智能体,自动跳过数据库中没有 AgentTool 关联记录的冗余工具。
文件注入会话:实现了将系统底层的文件传递 (File Delivery) 消息直接注入到 A2A 聊天会话中的功能。
修复注入 Bug:修复了在触发文件传递消息时导致的 DetachedInstanceError 报错,并修正了相关的 ChatMessage 和 ChatSession 导入路径。
完善协议:在 A2A 注入阶段强化了 finish() 通信协议,并移除了系统提示词中不再需要的 Focus 注入机制。
修复前端重影:在 AgentDetail.tsx 中正确处理了 agentbay_live 事件,防止聊天界面出现虚假的“幽灵用户气泡” (ghost user bubbles)。
这个分支目前状态很干净,仅涉及上述三个模块的后端逻辑 (services/sandbox, services/llm, api/tools.py 等) 及对应的前端展示组件,没有任何对其他功能(如 SkillsTab)的多余改动,已经可以直接用于提 PR 了!