feat(cdk): integ-tests Phase 1 — core lifecycle E2E #317

@ayushtr-aws

Component

CDK / infrastructure, API or orchestration, Tooling / CI

Describe the feature

Phase 1 (Core lifecycle) of the deploy-then-verify integration-test effort started in #236. Phase 0 (foundation: @aws-cdk/integ-tests-alpha, cdk/test/integ/ smoke test, mise //cdk:integ entry point) was delivered in #295. This issue covers the core task-lifecycle scenarios on a live stack.

Parent: #236 (Phase 0). Follow-up: Phase 2 (channels & guardrails).

Use case

  • Confidence before merge: validate the full submit→terminal lifecycle on a deployed stack (orchestration, admission, hydration, agent session lifecycle), exercising behavior that mocks cannot surface.
  • Encode the Cedar HITL contract: turn the manual E2E matrix (scenarios A–E in docs/design/CEDAR_HITL_GATES.md §15.3) into repeatable assertions.

Proposed solution

  • Integ scenarios aligned with the manual Cedar HITL E2E matrix where feasible:
    • submit → run → complete
    • submit → run → fail (terminal error path)
    • submit → run → await approval → approve → terminal
    • submit → run → await approval → deny → terminal
  • Use waitForAssertions() for long-running agent/orchestrator paths with conservative timeouts.
  • Assert task record fields and event shapes at each terminal state (task_id, user_id, status, timestamps, approval metadata).
  • Ensure teardown (destroy --force) runs on both success and failure.
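
The terminal-state field assertions could be expressed as a small pure check that each scenario reuses once polling has reached a terminal status. The following is a minimal sketch, not the project's implementation: only task_id, user_id, and status are named in this issue, so the created_at, updated_at, and approval field names, the status values, and the checkTerminalRecord helper itself are hypothetical.

```typescript
// Sketch only. Field names other than task_id, user_id, and status are
// assumptions made for illustration; adjust them to the real task schema.
type Json = Record<string, unknown>;

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.length > 0;
}

// Returns a list of human-readable problems; an empty list means the record
// looks like a well-formed terminal task record for the given scenario.
function checkTerminalRecord(
  record: Json,
  expectedStatus: string,
  expectApproval: boolean,
): string[] {
  const problems: string[] = [];

  // Identity fields must be present and non-empty.
  for (const field of ["task_id", "user_id"]) {
    if (!isNonEmptyString(record[field])) {
      problems.push(`${field} missing or empty`);
    }
  }

  // The record must have reached the status this scenario expects.
  if (record["status"] !== expectedStatus) {
    problems.push(`status is ${String(record["status"])}, expected ${expectedStatus}`);
  }

  // Timestamps (assumed names) must parse and be ordered.
  const createdRaw = record["created_at"];
  const updatedRaw = record["updated_at"];
  const created = typeof createdRaw === "string" ? Date.parse(createdRaw) : NaN;
  const updated = typeof updatedRaw === "string" ? Date.parse(updatedRaw) : NaN;
  if (Number.isNaN(created)) problems.push("created_at is not a parseable timestamp");
  if (Number.isNaN(updated)) problems.push("updated_at is not a parseable timestamp");
  if (!Number.isNaN(created) && !Number.isNaN(updated) && updated < created) {
    problems.push("updated_at precedes created_at");
  }

  // Approval scenarios must carry approval metadata (assumed to be an object).
  const approval = record["approval"];
  const hasApproval = typeof approval === "object" && approval !== null;
  if (expectApproval && !hasApproval) {
    problems.push("approval metadata missing");
  }

  return problems;
}
```

Returning a list of problems rather than throwing on the first mismatch keeps a failed run's log useful, since every missing or malformed field is reported in one pass.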

Design constraints

Run policy follows the model established by Phase 0 (.github/workflows/integ.yml):

  • When it runs: per-PR via workflow_run (triggered after a successful build), path-filtered to PRs touching cdk/** or agent/**; plus on-demand workflow_dispatch (restricted to main). Nightly schedule was intentionally dropped — per-PR + manual dispatch is the agreed coverage.
  • Why workflow_run: lets fork PRs run against the shared account (a fork pull_request job gets no secrets/OIDC). Mitigated by a build-success guard, path filter, the integ environment approval gate (an admin reviews fork test code before it runs with the privileged role), a least-privilege role, and status-only tokens per job.
  • Gate / required check: an admin approves the integ environment, then deploy→assert→destroy runs and posts an integ-smoke commit status back to the PR head as a required check that blocks merge. Docs/CLI-only PRs get an immediate green (skipped) status so the required check never deadlocks.
  • Concurrency: single cdk-integ group, cancel-in-progress: false — only one run at a time against the shared account (the hardcoded backgroundagent-integ stack name would otherwise collide).
  • Stack isolation: dedicated backgroundagent-integ stack name (separate from developer backgroundagent-dev stacks); integ apps kept separate from production synth (cdk.out isolation); assertion stacks use DeployAssert.
  • Local dev path: unchanged — run mise //cdk:integ with your own AWS creds.
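
The path-filter and "immediate green" behavior described above amounts to a small decision over the list of changed files. The sketch below illustrates that decision only; the actual workflow presumably implements it in GitHub Actions configuration, and the decideIntegRun function, its types, and the description string are hypothetical. The cdk/ and agent/ prefixes and the integ-smoke status context come from this issue.

```typescript
// Sketch only: models the run/skip decision, not the real workflow code.
const INTEG_PATH_PREFIXES = ["cdk/", "agent/"];
const STATUS_CONTEXT = "integ-smoke";

type IntegDecision =
  | { run: true }
  | { run: false; context: string; state: "success"; description: string };

// Decide whether a PR needs the deploy→assert→destroy run, or whether the
// required check can be satisfied immediately with a skipped (green) status.
function decideIntegRun(changedPaths: string[]): IntegDecision {
  const touchesInfra = changedPaths.some((path) =>
    INTEG_PATH_PREFIXES.some((prefix) => path.startsWith(prefix)),
  );
  if (touchesInfra) {
    return { run: true };
  }
  return {
    run: false,
    context: STATUS_CONTEXT,
    state: "success",
    description: "skipped: no cdk/ or agent/ changes",
  };
}
```

Posting a success status on the skip path, rather than posting nothing, is what keeps a required check from blocking docs-only or CLI-only PRs indefinitely.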

Other information

Acknowledgements

  • I may be able to implement this feature
  • This might be a breaking change

Metadata

Labels

  • P1: Priority 1, high priority
  • approved: When an issue has been approved and ready
  • enhancement: New feature or request
  • infra-cdk: CDK stacks/constructs, bootstrap, deploy topology, tags, IAM wiring, teardown
  • validation-loop: Tasks related to improve the validation loop for ABCA's codebase
