feat(#2489): humanApproval — sealed HumanDecision + audit events on t…#65

Merged: Skobeltsyn merged 1 commit into main from feat/2489-human-approval on May 30, 2026
…op of #2488 interrupt

#2489 — second child of the HITL epic (#2487). Promotes the typed
approval pattern from the #1918 demo to a runtime feature, layered on
the #2488 interrupt primitive.

```kotlin
tool("approve_deploy") { args ->
    humanApproval {
        title = "Deploy to production?"
        body = deploymentPlan          // typed, @Generable or anything
        timeout = 30.minutes
        defaultOnTimeout = HumanDecision.Rejected
    }
    // throws AgentInterruptException carrying ApprovalRequest; resume
    // with one of the four HumanDecision variants
}
```

Implementation:

- core/HumanApproval.kt — new file. `ApprovalRequest(title, body,
  timeout, defaultOnTimeout)`. Sealed `HumanDecision { Approved,
  Rejected, Edited(payload), Responded(payload) }`. `ApprovalBuilder`
  with fail-fast on blank title. Free function `humanApproval { } :
  Nothing` builds the request and calls `interrupt(payload = request)`.
- core/PipelineEvent.kt — two new variants:
  * `ApprovalRequested(title, hasBody, timeoutMs, ...)` — fires
    BEFORE the throw when the runtime detects an `ApprovalRequest`
    payload. Field-only: title + body-presence + advisory timeout. No
    body in the audit row (high-volume / PII-sensitive).
  * `ApprovalDecided(decision, hasPayload, ...)` — fires on resume
    when `resumeWith` is a `HumanDecision`. `decision` is the variant
    name; `hasPayload` flags whether Edited/Responded carried one.
    Payload itself stays off the audit row.
  Both are wired through `Agent.observe { }` so the JSONL audit and the
  OTel / LangSmith / Langfuse bridges pick them up. The bridge `when`
  blocks now handle both variants (field-only, mirroring the source).
- core/Agent.kt — new `approvalRequestedListener` +
  `approvalDecidedListener` listener slots, with `onApprovalRequested`
  / `onApprovalDecided` public DSL setters. These mirror the existing
  `onToolHallucinated` pattern.
- model/AgenticLoop.kt:
  * In the `PendingInterruptSignal` catch (#2488), if the payload is
    an `ApprovalRequest`, fire `approvalRequestedListener` under the
    runtime context.
  * In the resume entry (#2488), if `resumeWith is HumanDecision`,
    fire `approvalDecidedListener` with the variant name + payload
    presence before synthesising the tool result.
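
A minimal sketch of the shapes described for core/HumanApproval.kt, for orientation only. It is not the merged code: `SketchInterruptException` and this `interrupt` are hypothetical stand-ins for the #2488 primitive, the `Any?` payload types are an assumption, and `captureDeployRequest` is an illustrative helper. Only the names `ApprovalRequest`, `HumanDecision`, `ApprovalBuilder`, and `humanApproval` come from the description above.

```kotlin
import kotlin.time.Duration
import kotlin.time.Duration.Companion.minutes

// The four decision variants named in the description; payload types are assumed.
sealed interface HumanDecision {
    object Approved : HumanDecision
    object Rejected : HumanDecision
    data class Edited(val payload: Any?) : HumanDecision
    data class Responded(val payload: Any?) : HumanDecision
}

data class ApprovalRequest(
    val title: String,
    val body: Any?,
    val timeout: Duration?,
    val defaultOnTimeout: HumanDecision,
)

// Hypothetical stand-in for the exception thrown by the #2488 interrupt primitive.
class SketchInterruptException(val payload: Any?) : RuntimeException("agent interrupted")

// Hypothetical stand-in for the #2488 interrupt function.
fun interrupt(payload: Any?): Nothing = throw SketchInterruptException(payload)

class ApprovalBuilder {
    var title: String = ""
    var body: Any? = null
    var timeout: Duration? = null
    // Assumed default: fail closed when nobody answers in time.
    var defaultOnTimeout: HumanDecision = HumanDecision.Rejected

    fun build(): ApprovalRequest {
        // Fail fast: a blank title is rejected before any interrupt is thrown.
        require(title.isNotBlank()) { "humanApproval requires a non-blank title" }
        return ApprovalRequest(title, body, timeout, defaultOnTimeout)
    }
}

// Sugar for interrupt(ApprovalRequest(...)); it never returns normally.
fun humanApproval(configure: ApprovalBuilder.() -> Unit): Nothing =
    interrupt(payload = ApprovalBuilder().apply(configure).build())

// Usage: catch the stand-in exception and read back the typed request.
fun captureDeployRequest(): ApprovalRequest? = try {
    humanApproval {
        title = "Deploy to production?"
        timeout = 30.minutes
    }
} catch (e: SketchInterruptException) {
    e.payload as? ApprovalRequest
}
```

Because `build()` runs before `interrupt`, a blank title surfaces as an `IllegalArgumentException` rather than as an interrupt, which is the ordering the last test case in this commit pins.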

Composition:
- Builds entirely on #2488 interrupt — no new state, no new exception
  type. `humanApproval` is sugar for `interrupt(ApprovalRequest(...))`.
- Manifest-hash restore guard (#2754) applies — pinned by a dedicated
  test.
- Resume path uses the existing `resumeWith` -> `toLlmInput` ->
  synthesised tool message pipeline (#2488).
- Timeout is advisory; the caller honors it (the human reply happens
  between catch and the next `invokeSuspendResuming` call, outside
  any runtime suspension). `defaultOnTimeout = Rejected` is the
  fail-closed default for a regulated runtime.
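
Since the timeout is advisory, enforcing it falls to the caller. The sketch below shows one way a caller might do that between the catch and the resume. `decideOrDefault` and the use of `CompletableFuture` as the reply channel are assumptions made for illustration, and the types are simplified stand-ins rather than the real declarations. The actual resume call (`invokeSuspendResuming`) is left out because its signature is not shown here.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.TimeUnit
import java.util.concurrent.TimeoutException
import kotlin.time.Duration
import kotlin.time.Duration.Companion.milliseconds

// Simplified stand-ins for the types in core/HumanApproval.kt.
sealed interface HumanDecision {
    object Approved : HumanDecision
    object Rejected : HumanDecision
    data class Edited(val payload: Any?) : HumanDecision
    data class Responded(val payload: Any?) : HumanDecision
}

data class ApprovalRequest(
    val title: String,
    val body: Any?,
    val timeout: Duration?,
    val defaultOnTimeout: HumanDecision,
)

// Hypothetical caller-side helper: wait for the human reply, but no longer
// than the advisory timeout; fall back to defaultOnTimeout if it elapses.
fun decideOrDefault(
    request: ApprovalRequest,
    reply: CompletableFuture<HumanDecision>,
): HumanDecision {
    // No timeout configured: block until the human answers.
    val timeout = request.timeout ?: return reply.get()
    return try {
        reply.get(timeout.inWholeMilliseconds, TimeUnit.MILLISECONDS)
    } catch (e: TimeoutException) {
        request.defaultOnTimeout
    }
}

// Usage: nobody answers within 50 ms, so the fail-closed default applies.
fun timedOutDecision(): HumanDecision = decideOrDefault(
    ApprovalRequest("Deploy to production?", null, 50.milliseconds, HumanDecision.Rejected),
    CompletableFuture<HumanDecision>(),
)
```

The returned decision is what the caller would then pass as `resumeWith` on the next resume.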

Tests (HumanApprovalTest.kt — 10 cases):
- ApprovalRequest payload round-trips on AgentInterruptException
- HumanDecision.Approved resumes to text completion
- HumanDecision.Rejected — synthesised tool message reflects the decision
- HumanDecision.Edited carries a typed @Generable payload
- HumanDecision.Responded carries a free-form payload
- ApprovalRequested PipelineEvent fires with field-only audit row
- ApprovalDecided PipelineEvent fires on resume with HumanDecision
- ApprovalDecided does NOT fire when resumeWith is a raw value
  (gating is type-driven; ApprovalRequested also gated on payload type)
- Manifest-hash mismatch refuses to resume the approval snapshot
- Blank title fails fast at the builder before interrupt is thrown
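
The field-only and type-driven-gating cases can be illustrated with a small projection sketch. The event classes here carry only the fields named in this commit message (the elided `...` fields are omitted), and `approvalRequestedOrNull` / `approvalDecidedOrNull` are hypothetical helper names. The real runtime fires listeners instead of returning values.

```kotlin
import kotlin.time.Duration
import kotlin.time.Duration.Companion.minutes

// Simplified stand-ins for the types in core/HumanApproval.kt.
sealed interface HumanDecision {
    object Approved : HumanDecision
    object Rejected : HumanDecision
    data class Edited(val payload: Any?) : HumanDecision
    data class Responded(val payload: Any?) : HumanDecision
}

data class ApprovalRequest(
    val title: String,
    val body: Any?,
    val timeout: Duration?,
    val defaultOnTimeout: HumanDecision,
)

// Audit rows: presence flags only, never the body or payload itself.
data class ApprovalRequested(val title: String, val hasBody: Boolean, val timeoutMs: Long?)
data class ApprovalDecided(val decision: String, val hasPayload: Boolean)

// Gated on payload type: any non-ApprovalRequest interrupt yields no event.
fun approvalRequestedOrNull(payload: Any?): ApprovalRequested? =
    (payload as? ApprovalRequest)?.let {
        ApprovalRequested(
            title = it.title,
            hasBody = it.body != null,
            timeoutMs = it.timeout?.inWholeMilliseconds,
        )
    }

// Gated on resume type: a raw resume value yields no event.
fun approvalDecidedOrNull(resumeWith: Any?): ApprovalDecided? = when (resumeWith) {
    is HumanDecision.Approved -> ApprovalDecided("Approved", hasPayload = false)
    is HumanDecision.Rejected -> ApprovalDecided("Rejected", hasPayload = false)
    is HumanDecision.Edited -> ApprovalDecided("Edited", resumeWith.payload != null)
    is HumanDecision.Responded -> ApprovalDecided("Responded", resumeWith.payload != null)
    else -> null
}

// Usage: the deployment plan is reduced to a presence flag in the audit row.
fun sampleAuditRow(): ApprovalRequested? = approvalRequestedOrNull(
    ApprovalRequest("Deploy to production?", "plan text", 30.minutes, HumanDecision.Rejected),
)
```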

Full suite: 1757 tests across 7 modules, 0 failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Skobeltsyn merged commit 5c66808 into main on May 30, 2026
3 checks passed