HITL Tool Execution Gap: Tools Declared in Planning Phase Not Invoked During Execution
Problem Summary
The ART framework has a fundamental architectural gap between how it instructs the LLM during planning and how it expects the LLM to behave during execution. The core issue is a prompt-behavior disconnect: when the planning phase creates a todo list that includes steps like "Execute webSearch tool to find data", the execution phase has no enforcement mechanism to ensure that tool is actually called.
Symptoms
- Steps like "Execute webSearch" complete successfully without any tools being invoked
- HITL blocking tools (e.g., displayConfirmation) never trigger because the LLM describes what it would do instead of calling the tool
- The agent marks steps as COMPLETED regardless of whether declared tools were actually invoked
Root Cause
- Planning Phase: LLM writes natural language descriptions but doesn't declare tool requirements
- Execution Phase: LLM "executes" steps by optionally calling tools, with no validation
- Completion: Items are marked COMPLETED when LLM returns output, not when tools are called
Proposed Solution: Tool-Aware Execution Framework (TAEF)
Key Changes
- Step Type Classification: Explicitly classify steps as tool (requires external invocation) or reasoning (LLM analysis only)
- Required Tools Declaration: Planning phase declares which tools MUST be called for each tool step:
    {
      "id": "step_1",
      "description": "Search for weather data",
      "stepType": "tool",
      "requiredTools": ["webSearch"],
      "expectedOutcome": "Retrieved weather statistics"
    }
- Conditional Validation: Only enforce tool invocation on steps that declare requiredTools; reasoning steps skip validation
- Retry Mechanism: For strict validation mode, re-prompt the LLM if required tools weren't invoked
- Backward Compatibility: Items without stepType default to reasoning behavior (no validation)
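The validation and retry rules above could be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: the field names come from this issue, but the type shapes, the `"lenient"` mode name, the `executeStep` callback, and both function names are assumptions made for the example.

```typescript
// Hypothetical types: field names come from the issue, the shapes are assumed.
type StepType = "tool" | "reasoning";
type ToolValidationMode = "strict" | "lenient"; // "lenient" is an assumed name

interface TodoItem {
  id: string;
  description: string;
  stepType?: StepType;
  requiredTools?: string[];
  toolValidationMode?: ToolValidationMode;
}

interface ValidationResult {
  passed: boolean;
  missingTools: string[];
}

// Conditional validation: only tool steps that declare requiredTools are checked.
function validateToolCalls(item: TodoItem, actualToolCalls: string[]): ValidationResult {
  const stepType = item.stepType ?? "reasoning"; // backward-compatible default
  const required = item.requiredTools ?? [];
  if (stepType !== "tool" || required.length === 0) {
    return { passed: true, missingTools: [] };
  }
  const called = new Set(actualToolCalls);
  const missingTools = required.filter((name) => !called.has(name));
  return { passed: missingTools.length === 0, missingTools };
}

// Assumed callback: runs one execution attempt and resolves to the names of
// the tools that were actually invoked during that attempt.
type ExecuteStep = (item: TodoItem, reminder?: string) => Promise<string[]>;

// Retry mechanism: in strict mode, re-prompt until the required tools are
// called or maxRetries is exhausted. Each attempt is validated on its own.
async function executeWithValidation(
  item: TodoItem,
  executeStep: ExecuteStep,
  maxRetries = 2,
): Promise<ValidationResult> {
  let result = validateToolCalls(item, await executeStep(item));
  let attempts = 0;
  while (!result.passed && item.toolValidationMode === "strict" && attempts < maxRetries) {
    attempts += 1;
    const reminder = `You must call these tools: ${result.missingTools.join(", ")}`;
    result = validateToolCalls(item, await executeStep(item, reminder));
  }
  return result;
}
```

In this sketch a step would only be marked COMPLETED when the returned `passed` flag is true, so a tool step that merely describes the call fails validation instead of completing silently.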
Files Affected
| File | Changes |
| --- | --- |
| src/types/pes-types.ts | Add stepType, requiredTools, expectedOutcome, toolValidationMode, validationStatus, actualToolCalls to TodoItem |
| src/systems/reasoning/OutputParser.ts | Parse new fields from planning output |
| src/core/agents/pes-agent.ts | Enhanced planning prompt, step-type-aware execution prompts, validation logic |
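To illustrate the parsing change, here is a hedged sketch of how new fields might be read from planning output while keeping legacy items valid. The function name `parseTodoItem` and the `ParsedTodoItem` shape are hypothetical and are not taken from the existing OutputParser.ts; only the field names come from this issue.

```typescript
// Hypothetical result shape; only the field names are taken from the issue.
interface ParsedTodoItem {
  id: string;
  description: string;
  stepType: "tool" | "reasoning";
  requiredTools: string[];
  expectedOutcome?: string;
}

// Parse one todo item from already-decoded planning output (assumed to be JSON).
function parseTodoItem(raw: unknown): ParsedTodoItem {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Todo item must be a JSON object");
  }
  const obj = raw as Record<string, unknown>;
  if (typeof obj.id !== "string" || typeof obj.description !== "string") {
    throw new Error("Todo item needs string 'id' and 'description' fields");
  }

  // Backward compatibility: anything other than "tool" falls back to "reasoning".
  const stepType: "tool" | "reasoning" = obj.stepType === "tool" ? "tool" : "reasoning";

  // Keep only string entries; non-array or missing values become an empty list.
  const declaredTools = Array.isArray(obj.requiredTools)
    ? obj.requiredTools.filter((t: unknown): t is string => typeof t === "string")
    : [];

  const item: ParsedTodoItem = {
    id: obj.id,
    description: obj.description,
    stepType,
    // Reasoning steps never carry tool requirements in this sketch.
    requiredTools: stepType === "tool" ? declaredTools : [],
  };
  if (typeof obj.expectedOutcome === "string") {
    item.expectedOutcome = obj.expectedOutcome;
  }
  return item;
}
```

Defaulting unknown or missing stepType values to reasoning means plans produced by the older prompt would parse unchanged and skip validation, which matches the backward-compatibility goal stated above.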
Related Documents
Acceptance Criteria
Labels
enhancement hitl agent-governance priority:high