
Governance hooks: policy enforcement and audit trails for tool calls #587

@imran-siddique

Description


Proposal: Governance Hooks for Claude Agent SDK

Summary

Add optional governance hooks to the Claude Agent SDK that enable policy enforcement, threat detection, and audit trails for tool calls — similar to what guardrails do for model outputs, but at the tool execution layer.

Problem

When building multi-agent systems with the Claude Agent SDK, there is no built-in mechanism to:

  1. Enforce tool-level policies (which tools can be called, with what arguments, how often)
  2. Detect threat patterns in tool arguments before execution (data exfiltration, privilege escalation)
  3. Score trust levels between agents for safe delegation
  4. Generate immutable audit trails of all tool calls and policy decisions

Proposed Design

from claude_agent_sdk import Agent, GovernancePolicy

# Define governance policy
policy = GovernancePolicy(
    name="production-safe",
    allowed_tools=["search", "read_file", "write_file"],
    blocked_tools=["execute_shell", "delete_file"],
    max_tool_calls=50,
    content_filters=["no_pii", "no_secrets"],
    threat_detection=True,
    audit_trail=True,
)

# Apply to agent
agent = Agent(
    name="research-agent",
    tools=[search, read_file, write_file],
    governance=policy,  # Governance hooks intercept tool calls
)

Hook Points

  1. before_tool_call — Validate tool name and arguments against policy. Block if disallowed.
  2. after_tool_call — Audit the result, check for sensitive data leakage.
  3. on_delegation — When Agent A delegates to Agent B, verify trust score thresholds.
  4. on_policy_violation — Callback when a policy rule is triggered (for alerting/monitoring).
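As an illustration, the four hook points could wrap a tool call roughly as follows. The hook names mirror the proposal above; every class, signature, and threshold here is an assumption, not SDK API.

```python
# Sketch of the four proposed hook points around a tool call.
# All classes and signatures are hypothetical, for illustration only.

class GovernanceHooks:
    def __init__(self):
        self.audit_log = []
        self.violations = []

    def before_tool_call(self, tool_name, args, allowed_tools):
        """Validate the tool name/arguments against policy; block if disallowed."""
        if tool_name not in allowed_tools:
            self.on_policy_violation("blocked_tool", tool_name)
            return False
        return True

    def after_tool_call(self, tool_name, result):
        """Audit the result; a real hook would also scan for leaked secrets/PII."""
        self.audit_log.append({"tool": tool_name, "result": result})

    def on_delegation(self, from_agent, to_agent, trust_scores, threshold=0.7):
        """Verify the trust score before Agent A delegates to Agent B."""
        return trust_scores.get((from_agent, to_agent), 0.0) >= threshold

    def on_policy_violation(self, rule, detail):
        """Callback when a policy rule fires (alerting/monitoring)."""
        self.violations.append((rule, detail))

hooks = GovernanceHooks()
if hooks.before_tool_call("read_file", {"path": "notes.txt"}, ["read_file"]):
    hooks.after_tool_call("read_file", "file contents")
hooks.before_tool_call("execute_shell", {}, ["read_file"])  # records a violation
```

The key ordering property is that `before_tool_call` runs before any tool executes, so a blocked call never reaches the tool at all, and `on_policy_violation` fires as a side effect of the block.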

Why Not External Middleware?

An external governance proxy or middleware layer works, but it loses context that only the SDK has:

  • SDK-level hooks have access to the full agent context (conversation history, tool results, delegation chain)
  • Hook execution order can be guaranteed (governance before business logic)
  • Audit trails can capture the complete agent decision trace, not just tool I/O
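The "immutable audit trail" in the proposal could be as simple as a hash-chained log, where each entry's hash covers the previous entry's hash so any retroactive edit breaks the chain. This is a hypothetical sketch, not proposed SDK code:

```python
import hashlib
import json

# Tamper-evident audit trail sketch: each entry's hash is computed over
# the previous entry's hash plus the serialized event, forming a chain.

class AuditTrail:
    def __init__(self):
        self.entries = []

    def append(self, event: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self.entries.append({"event": event, "prev": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        prev = "0" * 64
        for e in self.entries:
            payload = json.dumps(e["event"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

trail = AuditTrail()
trail.append({"hook": "before_tool_call", "tool": "search", "decision": "allow"})
trail.append({"hook": "after_tool_call", "tool": "search"})
print(trail.verify())  # True

trail.entries[0]["event"]["decision"] = "deny"  # tamper with the log
print(trail.verify())  # False
```

Because the hooks run inside the SDK, such a trail can record hook decisions and delegation events alongside tool I/O, which is the point of the bullet above.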

Prior Art

  • OpenAI Agents SDK has guardrails (input/output guardrails on model responses)
  • Google ADK has BasePlugin with before_tool_callback / after_tool_callback
  • PydanticAI has middleware proposals (#2885)
  • We've built governance integrations for all of these: agentmesh-integrations

Context

We maintain Agent-OS and have filed similar governance proposals across the agent ecosystem.
