Feat/agent api calling #579

Open
qet2211pm wants to merge 31 commits into main from feat/agent-api-calling

@qet2211pm

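A complete summary of the project changes has been generated; see recent_changes_summary.md for details.

Overview: since v1.9.3, roughly 40 commits across 68 files with a net gain of more than 5,500 lines, concentrated in 7 areas:

Agent API authentication: a new token_key authentication scheme that lets external systems call an agent directly
A2A collaboration: file deliveries are automatically injected into the chat history, and the finish protocol closes the loop
Sandbox engine: configurable timeout (up to 1h); output produced before a timeout is still returned
Real-time streaming: code execution output is pushed live to the right-hand panel of the frontend
Onboarding + OAuth: a complete user onboarding flow, OAuth login, and a personal assistant template
Tool management: fixes the bug where tools were lost during onboarding, and respects the user's enable/disable settings
Security and permissions: file deletion is restricted to administrators; heredoc debugging is prohibited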

xiejiayu added 30 commits May 12, 2026 14:48
- Add POST /api/v1/agent/chat synchronous endpoint
- Per-agent TokenKey (clw_xxx) generation on creation
- Relationship enforcement: caller must have AgentAgentRelationship with target
- Activity logging for API calls (no ChatMessage persistence)
- System prompt injection of TokenKey + API usage instructions
- Frontend Settings tab: masked key display, copy, regenerate
- Token Key management endpoints: GET/POST token-key, regenerate
- DB migration: add token_key + token_key_suffix to agents table
- 30 unit tests covering auth, relationships, error handling, schemas
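
The per-agent key scheme described above (a `clw_` prefixed key plus a stored suffix for masked display) could look roughly like the sketch below. Only the `clw_` prefix and the idea of a separate suffix column come from the commit message; the function names, the suffix length, and the mask format are assumptions made for illustration, not the PR's actual code.

```python
import secrets

TOKEN_PREFIX = "clw_"  # prefix named in the commit message
SUFFIX_LEN = 4         # assumption: how many trailing characters are kept for display


def generate_token_key() -> tuple[str, str]:
    """Return (full_key, suffix); hypothetical helper, not the PR's function."""
    key = TOKEN_PREFIX + secrets.token_urlsafe(32)
    return key, key[-SUFFIX_LEN:]


def mask_token_key(suffix: str) -> str:
    """Build a masked form for the Settings tab using only the stored suffix."""
    return f"{TOKEN_PREFIX}{'*' * 8}{suffix}"
```

Keeping the suffix separately means the UI can show a recognizable masked key without ever reading the full secret back.
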
- Remove Copy button and get_token_key frontend call
- Regenerate endpoint no longer returns full key (only suffix)
- Add Generate Key button for agents without token_key
- Update description: key is auto-injected via System Prompt
- New skill: teaches agents how to find target IDs and write API calling code
- relationships.md now includes Agent ID for each digital employee colleague
- Enables agents to programmatically invoke peers via execute_code
Auto-pushed to all agent workspaces on startup, same as mcp-installer.
SKILL.md loaded from agent_template/skills/ at runtime.
Added [LLM-Debug] level debug logs for tool definitions sent to:
- OpenAICompatibleClient
- OpenAIResponsesClient
- AnthropicClient

Set LOG_LEVEL to DEBUG in logging_config.py to see full prompt + tools payloads.
…odels

When reasoning models (o3, o4-mini) exhaust their token budget on
internal thinking, the Responses API returns 'model output must contain
either output text or tool calls, these cannot both be empty'.

This fix:
1. OpenAIResponsesClient.complete() now retries once with doubled
   max_output_tokens (min 32768) on this specific error.
2. failover.py classifies 'cannot both be empty' as RETRYABLE so
   the fallback model can also be attempted.
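
A minimal sketch of the retry-once behavior described in this fix follows. It assumes "doubled max_output_tokens (min 32768)" means the larger of the doubled budget and 32768; `complete_with_retry` and the `call` parameter are hypothetical names, and the real `OpenAIResponsesClient.complete()` is not reproduced here.

```python
EMPTY_OUTPUT_MARKER = "cannot both be empty"  # substring of the error quoted above
MIN_RETRY_TOKENS = 32768


def next_token_budget(current: int) -> int:
    """Double the budget, but never retry with less than MIN_RETRY_TOKENS."""
    return max(current * 2, MIN_RETRY_TOKENS)


def complete_with_retry(call, max_output_tokens: int):
    """Call `call(budget)`; retry exactly once if the empty-output error is raised."""
    try:
        return call(max_output_tokens)
    except Exception as exc:
        if EMPTY_OUTPUT_MARKER not in str(exc):
            raise  # unrelated errors are not retried here
        return call(next_token_budget(max_output_tokens))
```

If the second attempt fails as well, the exception propagates, which is where a failover layer that treats the error as retryable could try the fallback model.
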
Root cause: call_llm uses an internal _buffer_chunk (no-op) instead of the
websocket's stream_to_ws callback. When the greeting turn model only calls
finish, neither on_chunk nor on_tool_call fires, so
maybe_mark_onboarding_progress is never invoked and the onboarding phase
stays at its initial state. Every subsequent turn then re-triggers the
greeting flow with skip_tools=True, leaving the agent with only the finish
tool.

Fix: call maybe_mark_onboarding_progress unconditionally after the LLM task
returns. The internal onboarding_mark_done guard prevents double execution.

Also:
- Downgrade _get_tool_config missing-config log from ERROR to DEBUG (expected
  for agents without AgentBay configured)
- Add INFO-level tool count logging in caller.py for easier debugging
…ow up to 1h

- Remove hardcoded 60s timeout cap in _execute_code, _execute_code_legacy, and subprocess_backend
- Clamp timeout by sandbox_config.max_timeout (configurable up to 3600s)
- Update tool_seeder config_schema max values from 300 to 3600
- Update SandboxConfig Pydantic validation to allow up to 3600s
- Remove 'no network access commands' from execute_code tool description
- Guide agent to pip install missing packages instead of limiting to stdlib
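
The clamping rule in this commit can be sketched as below. The 3600s ceiling comes from the commit message; the function name, the lower bound of 1 second, and the argument names are assumptions for illustration.

```python
ABSOLUTE_MAX_TIMEOUT = 3600  # upper limit stated in the commit message (1h)


def clamp_timeout(requested: int, max_timeout: int) -> int:
    """Clamp a requested timeout to the configured sandbox maximum.

    `max_timeout` stands in for sandbox_config.max_timeout; the floor of
    1 second is an assumption, not something the commit specifies.
    """
    ceiling = min(max_timeout, ABSOLUTE_MAX_TIMEOUT)
    return max(1, min(requested, ceiling))
```

For example, `clamp_timeout(7200, 3600)` yields 3600, while `clamp_timeout(120, 300)` leaves the request at 120.
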
… times out

- Modified local subprocess sandbox to capture and decode stdout/stderr after proc.kill()
- Legacy fallback also modified to append output before returning timeout error
- This provides agents with visibility into what ran successfully before the 60s/3600s cap
- Added explicit instructions to the timeout error message telling the agent it can retry with a higher 'timeout' parameter up to 3600s.
- This empowers the agent to proactively recover from timeouts for long-running scripts instead of halting.
- asyncio.create_subprocess_exec with proc.communicate() loses output when cancelled by asyncio.wait_for()
- Switched to manual async reading of proc.stdout/stderr streams to ensure partial output is retained after proc.kill()
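
The switch from `proc.communicate()` to manual stream reading could be sketched as follows. This is an illustration under stated assumptions rather than the PR's code: `run_with_partial_output` and `_drain` are hypothetical names, and stderr is merged into stdout here for brevity even though the commit reads both streams.

```python
import asyncio
import sys


async def _drain(stream: asyncio.StreamReader, sink: list) -> None:
    """Append chunks to `sink` until the stream reaches EOF."""
    while True:
        chunk = await stream.read(4096)
        if not chunk:
            break
        sink.append(chunk)


async def run_with_partial_output(cmd: list, timeout: float) -> tuple:
    """Run `cmd`; return (output, timed_out), keeping output written before a kill."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    chunks: list = []
    # The reader runs as its own task, so cancelling the wait below
    # does not discard what has already been read.
    reader = asyncio.create_task(_drain(proc.stdout, chunks))
    timed_out = False
    try:
        await asyncio.wait_for(proc.wait(), timeout)
    except asyncio.TimeoutError:
        timed_out = True
        proc.kill()
        await proc.wait()
    await reader  # finishes once the pipe is closed
    return b"".join(chunks).decode(errors="replace"), timed_out
```

A call such as `asyncio.run(run_with_partial_output([sys.executable, "-u", "script.py"], 60))` would return whatever the script printed before the deadline, together with a flag the caller can use to append the timeout hint.
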
- Added a specific rule forcing the agent to modify the actual python files to fix bugs, rather than extracting code and running it inline via a heredoc in execute_code.
- Adapted the robust tool constraints from Auto-ml tools.py into Clawith's agent_tools.py.
- write_file: Added explicit warning to use edit_file for partial modifications and bugs.
- edit_file: Emphasized that old_string MUST exactly match existing file content including whitespace.
- read_file/edit_file: Added CRITICAL RULE instructing the agent to always use read_file first before editing to ensure precise matching.
- Add on_output callback chain: WebSocket → caller → execute_tool → subprocess
- Each stdout/stderr chunk is pushed as agentbay_live event (env: code)
- Frontend concatenates streaming chunks, auto-opens live panel
- Add Clear button to code output panel
- Remove redundant code live_preview from tool_call done events
…user bubbles

AgentDetail.tsx had no handler for 'agentbay_live' WebSocket events,
causing real-time code execution output to fall through to the catch-all
else branch, which created spurious user message bubbles.

Now properly handles code/desktop/browser streaming events with
live panel auto-focus, matching the Chat.tsx behavior.
- Track explicitly disabled tools (AgentTool.enabled=False) to prevent
  _always_tools bypass from re-adding them (core/feishu/channel tools)
- Add agentbay tools to SYNC_IS_DEFAULT_TOOL_NAMES so seeder corrects
  stale is_default=True from older deployments
- Add diagnostic logging: final tool list, assignment count, disabled
  count, and default_fallback count for debugging tool visibility
When an agent has any AgentTool assignments (tool panel has been
configured), only include tools with an explicit AgentTool(enabled=True)
record. Tools without any AgentTool record are no longer auto-included
via is_default fallback — they are only provided by _always_tools if
they are core system tools.

For brand-new agents with zero assignments, the old is_default behavior
is preserved so they get a reasonable starting tool set.

This fixes the issue where 32+ tools were being sent to the LLM despite
the user having configured the tool panel, because those tools had no
AgentTool records and fell through to is_default=True.
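
The selection rules from this commit and the previous one can be condensed into the sketch below. It is a simplified model, not the project's `get_agent_tools_for_llm`: the `Tool` dataclass, the `assignments` mapping (tool name to `AgentTool.enabled`), and the function name are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    is_default: bool = False


def select_tools(all_tools, assignments: dict, always_tools: set) -> list:
    """Return the tool names to send to the LLM.

    `assignments` maps tool name -> enabled flag; a missing key means
    the agent has no AgentTool record for that tool.
    """
    selected = []
    for tool in all_tools:
        enabled = assignments.get(tool.name)
        if enabled is False:
            continue  # an explicit disable wins, even over always-tools
        if enabled is True or tool.name in always_tools:
            selected.append(tool.name)
        elif not assignments and tool.is_default:
            # Brand-new agent with zero assignments: keep the is_default fallback.
            selected.append(tool.name)
    return selected
```

With this shape, a configured agent only receives tools it explicitly enabled plus core system tools, while an unconfigured agent still gets the default starting set.
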
When a configured agent (has existing AgentTool assignments) loads its
tool panel, automatically create AgentTool records for any visible tool
that doesn't have one yet. Uses is_default as the initial enabled value.

This ensures the UI state and get_agent_tools_for_llm stay in sync:
both now rely on explicit AgentTool records instead of the implicit
is_default fallback. Without this, tools like send_message_to_agent
show as enabled in the UI but are skipped by the LLM tool loader.
flush() only sends SQL to the DB but doesn't persist the transaction.
The FastAPI get_db session doesn't auto-commit, so backfilled records
were being rolled back when the session closed.
…nd A2A flows

Both flows now emit DEBUG-level logs after each LLM round:
- [LLM] Round N response: content, tool_calls names, reasoning (truncated)
- [A2A] Round N response for <agent>: same fields

Content truncated to 500 chars, reasoning to 300 chars.
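
As a rough illustration of the truncated per-round log line, assuming tool calls are dicts with a `name` key and that truncated text is marked with a trailing `...` (both assumptions; the helper names are hypothetical as well):

```python
from typing import Optional

CONTENT_LIMIT = 500    # limits stated in the commit message
REASONING_LIMIT = 300


def truncate(text: Optional[str], limit: int) -> str:
    """Return `text` cut to `limit` characters, with a marker if it was cut."""
    text = text or ""
    return text if len(text) <= limit else text[:limit] + "..."


def format_round_log(tag: str, round_no: int, content, tool_calls, reasoning) -> str:
    """Build one debug line in the style '[LLM] Round N response: ...'."""
    names = [call["name"] for call in tool_calls]
    return (
        f"[{tag}] Round {round_no} response: "
        f"content={truncate(content, CONTENT_LIMIT)!r}, "
        f"tool_calls={names}, "
        f"reasoning={truncate(reasoning, REASONING_LIMIT)!r}"
    )
```
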
Quick models (GPT-4o-mini etc.) often output plain text instead of
calling finish() in A2A consult mode. The existing A2A injection said
'Reply concisely' without mentioning finish(), causing the model to
skip it. Added explicit finish() mandate with rejection warning right
in the A2A injection block.
Completed focus items were reinforcing stale workflow patterns (e.g.
old send_message_to_agent-only flow) over updated soul.md instructions.
Agents can still query focus via list_focus_items tool on demand.
When send_file_to_agent copies a file to the target's inbox, it now
also inserts a ChatMessage into the A2A session. This means when
send_message_to_agent is called next, the target agent's conversation
history already contains the file delivery info with the exact path
to read_file, solving the problem where target agents had no awareness
that a file was delivered.
source_agent.creator_id was accessed after the ORM session closed,
causing the file delivery message injection to silently fail. Now
extracted as source_creator_id local variable inside the active session.
…y injection

ChatMessage is in app.models.audit, ChatSession is in app.models.chat_session.
The previous import from app.models.chat caused ModuleNotFoundError which
was silently caught, preventing file delivery messages from being injected.
qet2211pm requested a review from yaojin3616 on May 15, 2026 05:58

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 446d635df7

# revision identifiers
revision = "add_agent_token_key"
down_revision = None # Will be set by Alembic chain

P1: Chain new Alembic revision to existing migration head

Set down_revision to the current head instead of None; this file currently creates a second base revision, so environments already stamped to the existing chain will not have a valid upgrade path to this migration. In practice this can block alembic upgrade in deployed databases until the revision graph is manually repaired.
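
A sketch of what the suggested change might look like in the migration file is shown here. The head ID is a placeholder, not a real revision from this repository; the actual value would be whatever `alembic heads` reports before this migration is added.

```python
# revision identifiers
revision = "add_agent_token_key"
# Placeholder value (hypothetical): replace with the ID printed by `alembic heads`
# so this migration extends the existing chain instead of starting a second base.
down_revision = "<current_head_revision_id>"
```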

messages=messages,
agent_name=target.name,
role_description=target.role_description or "",
agent_id=target.id,

P1: Charge Agent API token usage to the caller agent

This endpoint documents that API usage should be billed to the token owner, but it calls call_llm with agent_id=target.id, and call_llm records usage by agent_id. As a result, cross-agent API calls debit the callee's quota/usage instead of the caller's, allowing callers to consume another agent's budget and skewing usage accounting.
