Skip to content

Pr/sandbox streaming tools#584

Open
qet2211pm wants to merge 17 commits into
mainfrom
pr/sandbox-streaming-tools
Open

Pr/sandbox streaming tools#584
qet2211pm wants to merge 17 commits into
mainfrom
pr/sandbox-streaming-tools

Conversation

@qet2211pm
Copy link
Copy Markdown

当前分支 (pr/sandbox-streaming-tools) 的修改非常纯粹,核心是从之前混合了多种功能的大分支中抽离出的 3 个独立特性模块。根据 commit 记录,这 16 个 commit 总结如下:

  1. 沙盒代码执行 (execute_code) 的超时与流式输出优化 (Sandbox & Streaming)
    实时流式输出:在前端代码面板 (AgentBayLivePanel.tsx 和 Chat.tsx) 中实现了 execute_code 执行过程的实时 stdout/stderr 渲染。
    动态超时配置:移除了沙盒硬编码的 60 秒超时限制,改为从配置读取 (config.py),支持最高 1 小时 (3600s) 的运行时间,并同步更新了工具的描述提示。
    超时截断保护:即使代码执行超时,现在也能正确捕获并返回超时前已输出的截断 stdout/stderr,并在报错信息中增加了重试提示 (retry hint)。
  2. 工具管理与状态一致性 (Tool Management)
    屏蔽已禁用工具:在向大模型发送 payload 时,严格尊重用户在界面上禁用的工具状态 (respect user-disabled tools in LLM payload)。
    自动回填记录:当用户打开工具面板时,自动回填缺失的 AgentTool 记录,并将数据库写入操作从 flush() 改为更安全的 commit()。
    过滤冗余工具:对于已配置的智能体,自动跳过数据库中没有 AgentTool 关联记录的冗余工具。
  3. A2A (Agent-to-Agent) 对话及文件传递修复
    文件注入会话:实现了将系统底层的文件传递 (File Delivery) 消息直接注入到 A2A 聊天会话中的功能。
    修复注入 Bug:修复了在触发文件传递消息时导致的 DetachedInstanceError 报错,并修正了相关的 ChatMessage 和 ChatSession 导入路径。
    完善协议:在 A2A 注入阶段强化了 finish() 通信协议,并移除了系统提示词中不再需要的 Focus 注入机制。
    修复前端重影:在 AgentDetail.tsx 中正确处理了 agentbay_live 事件,防止聊天界面出现虚假的“幽灵用户气泡” (ghost user bubbles)。
    这个分支目前状态很干净,仅涉及上述三个模块的后端逻辑 (services/sandbox, services/llm, api/tools.py 等) 及对应的前端展示组件,没有任何对其他功能(如 SkillsTab)的多余改动,已经可以直接用于提 PR 了!

xiejiayu added 17 commits May 18, 2026 11:23
…ow up to 1h

- Remove hardcoded 60s timeout cap in _execute_code, _execute_code_legacy, and subprocess_backend
- Clamp timeout by sandbox_config.max_timeout (configurable up to 3600s)
- Update tool_seeder config_schema max values from 300 to 3600
- Update SandboxConfig Pydantic validation to allow up to 3600s
- Remove 'no network access commands' from execute_code tool description
- Guide agent to pip install missing packages instead of limiting to stdlib
… times out

- Modified local subprocess sandbox to capture and decode stdout/stderr after proc.kill()
- Legacy fallback also modified to append output before returning timeout error
- This provides agents with visibility into what ran successfully before the 60s/3600s cap
- Added explicit instructions to the timeout error message telling the agent it can retry with a higher 'timeout' parameter up to 3600s.
- This empowers the agent to proactively recover from timeouts for long-running scripts instead of halting.
- asyncio.create_subprocess_exec with proc.communicate() loses output when cancelled by asyncio.wait_for()
- Switched to manual async reading of proc.stdout/stderr streams to ensure partial output is retained after proc.kill()
- Add on_output callback chain: WebSocket → caller → execute_tool → subprocess
- Each stdout/stderr chunk is pushed as agentbay_live event (env: code)
- Frontend concatenates streaming chunks, auto-opens live panel
- Add Clear button to code output panel
- Remove redundant code live_preview from tool_call done events
…user bubbles

AgentDetail.tsx had no handler for 'agentbay_live' WebSocket events,
causing real-time code execution output to fall through to the catch-all
else branch, which created spurious user message bubbles.

Now properly handles code/desktop/browser streaming events with
live panel auto-focus, matching the Chat.tsx behavior.
- Track explicitly disabled tools (AgentTool.enabled=False) to prevent
  _always_tools bypass from re-adding them (core/feishu/channel tools)
- Add agentbay tools to SYNC_IS_DEFAULT_TOOL_NAMES so seeder corrects
  stale is_default=True from older deployments
- Add diagnostic logging: final tool list, assignment count, disabled
  count, and default_fallback count for debugging tool visibility
When an agent has any AgentTool assignments (tool panel has been
configured), only include tools with an explicit AgentTool(enabled=True)
record. Tools without any AgentTool record are no longer auto-included
via is_default fallback — they are only provided by _always_tools if
they are core system tools.

For brand-new agents with zero assignments, the old is_default behavior
is preserved so they get a reasonable starting tool set.

This fixes the issue where 32+ tools were being sent to the LLM despite
the user having configured the tool panel, because those tools had no
AgentTool records and fell through to is_default=True.
When a configured agent (has existing AgentTool assignments) loads its
tool panel, automatically create AgentTool records for any visible tool
that doesn't have one yet. Uses is_default as the initial enabled value.

This ensures the UI state and get_agent_tools_for_llm stay in sync:
both now rely on explicit AgentTool records instead of the implicit
is_default fallback. Without this, tools like send_message_to_agent
show as enabled in the UI but are skipped by the LLM tool loader.
flush() only sends SQL to the DB but doesn't persist the transaction.
The FastAPI get_db session doesn't auto-commit, so backfilled records
were being rolled back when the session closed.
Quick models (GPT-4o-mini etc.) often output plain text instead of
calling finish() in A2A consult mode. The existing A2A injection said
'Reply concisely' without mentioning finish(), causing the model to
skip it. Added explicit finish() mandate with rejection warning right
in the A2A injection block.
Completed focus items were reinforcing stale workflow patterns (e.g.
old send_message_to_agent-only flow) over updated soul.md instructions.
Agents can still query focus via list_focus_items tool on demand.
When send_file_to_agent copies a file to the target's inbox, it now
also inserts a ChatMessage into the A2A session. This means when
send_message_to_agent is called next, the target agent's conversation
history already contains the file delivery info with the exact path
to read_file, solving the problem where target agents had no awareness
that a file was delivered.
source_agent.creator_id was accessed after the ORM session closed,
causing the file delivery message injection to silently fail. Now
extracted as source_creator_id local variable inside the active session.
…y injection

ChatMessage is in app.models.audit, ChatSession is in app.models.chat_session.
The previous import from app.models.chat caused ModuleNotFoundError which
was silently caught, preventing file delivery messages from being injected.
@qet2211pm qet2211pm requested review from wisdomqin and yaojin3616 May 18, 2026 12:29
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f91571e426

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +2215 to +2219
# Configured agent: require explicit AgentTool record
if at is None:
# No assignment → not included (unless _always_tools adds it)
default_included_names.append(t.name)
continue
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve default tools when assignments are partial

This logic treats len(assignments) > 0 as “fully configured” and drops every tool without an AgentTool row. In practice assignments are often partial (the UI toggles one tool at a time via PUT /tools/agents/{id}), and the /with-config loader path does not backfill missing rows, so after a single toggle many previously available default tools disappear from the LLM tool payload unexpectedly. That changes agent behavior immediately and can break existing workflows that rely on still-default tools.

Useful? React with 👍 / 👎.

Comment on lines 417 to +421
except asyncio.TimeoutError:
proc.kill()
await proc.communicate()
is_timeout = True

await asyncio.gather(task1, task2)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reap timed-out subprocesses after kill

On timeout this path calls proc.kill() and then returns without awaiting process termination (wait/communicate). Under repeated timeouts, unreaped children can accumulate as zombies and degrade long-running workers. The previous implementation awaited communicate() after kill; the streaming version still needs an explicit reap step before returning timeout results.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant