14 changes: 14 additions & 0 deletions .github/workflows/integration-tests.yml
@@ -56,6 +56,20 @@ jobs:
echo "Set variable: $var_name"
done

# Integration service URLs from secrets/variables
ADMS_URL=$(echo '${{ toJSON(secrets) }}' | jq -r '.CLOUD_SDK_ADMS_INTEGRATION_URL // empty')
if [ -z "$ADMS_URL" ]; then
  ADMS_URL=$(echo '${{ toJSON(vars) }}' | jq -r '.CLOUD_SDK_ADMS_INTEGRATION_URL // empty')
fi
if [ -n "$ADMS_URL" ]; then
  echo "CLOUD_SDK_ADMS_INTEGRATION_URL=$ADMS_URL" >> "$GITHUB_ENV"
  echo "Set: CLOUD_SDK_ADMS_INTEGRATION_URL"
else
  # Skip ADMS integration tests when no ADMS service URL is configured
  echo "CLOUD_SDK_ADMS_SKIP_IF_UNAVAILABLE=true" >> "$GITHUB_ENV"
  echo "ADMS service URL not configured - ADMS integration tests will be skipped"
fi

echo "Environment setup complete - automatically configured all CLOUD_SDK_CFG_* environment variables and secrets"

- name: Run integration tests
10 changes: 9 additions & 1 deletion .gitignore
@@ -38,4 +38,12 @@ mocks/

# Generated files
PULL_REQUEST.md
RELEASE.md

# macOS metadata
.DS_Store

# UCL provisioning artefacts (separate repo concern)
.ucl-provision/
src/sap_cloud_sdk/adms/ucl/
RELEASE.md
.env.adms
140 changes: 140 additions & 0 deletions docs/INTEGRATION_TESTS_ADMS.md
@@ -0,0 +1,140 @@
# ADMS Integration Tests

End-to-end tests that verify the `sap_cloud_sdk.adms` module is correctly wired to a running **SAP Advanced Document Management (ADM / HDM)** server.

## Two modes

| Mode | When to use | What runs |
|---|---|---|
| **Local auto-start** | Day-to-day development | Starts `hdm/srv` via `mvn spring-boot:run` with H2 + security disabled |
| **External / BTP** | CI pipelines, acceptance tests | Points to a deployed ADM instance using real IAS credentials |

---

## Prerequisites

### Local mode
- Java 21 and Maven 3.9+ on `PATH`
- The `hdm` repo checked out at the same level as `cloud-sdk-python` (i.e. `../hdm`), **or** `CLOUD_SDK_HDM_DIR` set to its path
- No external services needed — H2 in-memory DB, mocked storage & virus scanner

### External / BTP mode
- A provisioned ADM instance
- IAS service binding credentials

---

## Running the tests

### Local mode (auto-starts HDM)

```bash
cd /path/to/cloud-sdk-python

# Run all integration tests — HDM will start automatically
.venv/bin/python -m pytest tests/adms/integration/ -m integration -v

# Skip if HDM can't start (e.g. Java not available in this env)
CLOUD_SDK_ADMS_SKIP_IF_UNAVAILABLE=true \
.venv/bin/python -m pytest tests/adms/integration/ -m integration -v
```

HDM startup takes roughly 30–60 seconds on first run. The server stays up for the entire pytest session and is terminated at session teardown.

### External / BTP mode

```bash
export CLOUD_SDK_ADMS_INTEGRATION_URL=https://your-adm.cfapps.eu20.hana.ondemand.com
export CLOUD_SDK_CFG_ADMS_DEFAULT_SERVICE_URL=$CLOUD_SDK_ADMS_INTEGRATION_URL
export CLOUD_SDK_CFG_ADMS_DEFAULT_IAS_URL=https://your-tenant.accounts.ondemand.com
export CLOUD_SDK_CFG_ADMS_DEFAULT_CLIENT_ID=...
export CLOUD_SDK_CFG_ADMS_DEFAULT_CLIENT_SECRET=...

.venv/bin/python -m pytest tests/adms/integration/ -m integration -v
```

### Run a specific test file

```bash
# Document lifecycle only
.venv/bin/python -m pytest tests/adms/integration/test_e2e_document_flow.py -m integration -v

# Async client only
.venv/bin/python -m pytest tests/adms/integration/test_e2e_async_flow.py -m integration -v

# SPII handler (no server needed — runs SpiiHandler logic directly)
.venv/bin/python -m pytest tests/adms/integration/test_e2e_spii_flow.py -m integration -v
```

### Run unit tests only (no server)

```bash
.venv/bin/python -m pytest tests/adms/unit/ -v
```

---

## Environment variables reference

| Variable | Default | Description |
|---|---|---|
| `CLOUD_SDK_ADMS_INTEGRATION_URL` | _(unset)_ | External ADM URL; if set, skips local HDM auto-start |
| `CLOUD_SDK_HDM_DIR` | `../hdm` | Path to the HDM repo root (local mode) |
| `CLOUD_SDK_HDM_PORT` | `18080` | Port for the locally started HDM server |
| `CLOUD_SDK_ADMS_SKIP_IF_UNAVAILABLE` | `false` | Skip (not fail) if the server cannot be reached |
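
The table above can be read as a small settings loader. The following is a minimal sketch, not the actual `conftest.py` code: the `AdmsTestSettings` class and `load_settings` function are hypothetical names, and treating only the string `true` (case-insensitive) as enabling the skip flag is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AdmsTestSettings:
    """Hypothetical container for the variables listed in the table."""

    integration_url: Optional[str]
    hdm_dir: str
    hdm_port: int
    skip_if_unavailable: bool

    @property
    def use_external_server(self) -> bool:
        # An explicit URL means the local HDM auto-start is skipped.
        return self.integration_url is not None


def load_settings(env: Mapping[str, str] = os.environ) -> AdmsTestSettings:
    """Read the documented variables, applying the documented defaults."""
    url = env.get("CLOUD_SDK_ADMS_INTEGRATION_URL", "").strip() or None
    # Assumption: only "true" (any case) turns the skip behaviour on.
    skip = env.get("CLOUD_SDK_ADMS_SKIP_IF_UNAVAILABLE", "false").strip().lower()
    return AdmsTestSettings(
        integration_url=url,
        hdm_dir=env.get("CLOUD_SDK_HDM_DIR", "../hdm"),
        hdm_port=int(env.get("CLOUD_SDK_HDM_PORT", "18080")),
        skip_if_unavailable=(skip == "true"),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to exercise in unit tests without touching the real environment.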

---

## Test files

| File | What it tests |
|---|---|
| [conftest.py](../tests/adms/integration/conftest.py) | Session fixtures: start HDM, `AdmsClient`, `AsyncAdmsClient`, `bo_type_id` |
| [test_e2e_document_flow.py](../tests/adms/integration/test_e2e_document_flow.py) | Sync client: create → query → get → update → draft lifecycle → delete |
| [test_e2e_async_flow.py](../tests/adms/integration/test_e2e_async_flow.py) | Async client: same operations + concurrent creates |
| [test_e2e_spii_flow.py](../tests/adms/integration/test_e2e_spii_flow.py) | SPII handler: CONFIG_PENDING, READY, unassign, cert gate, validation |

---

## How the local HDM server is started

The `hdm_base_url` fixture in `conftest.py`:

1. Checks if `CLOUD_SDK_ADMS_INTEGRATION_URL` is set → use it directly
2. Checks if the HDM port (18080 by default, see `CLOUD_SDK_HDM_PORT`) is already open → re-use the running server
3. Otherwise runs:
   ```bash
   mvn -pl srv spring-boot:run -q \
     -Dserver.port=18080 \
     -Dspring.security.enabled=false \
     -Dadm.redis.enabled=false
   ```
4. Polls `/actuator/health` every 3 seconds, up to 120 seconds
5. At session teardown, sends `SIGTERM` to the process group
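
The port check and health polling in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the fixture's real code: `is_port_open`, `health_ok`, and `wait_until_healthy` are hypothetical helper names, and only the `/actuator/health` path, the 3-second interval, and the 120-second limit come from this document.

```python
import socket
import time
import urllib.error
import urllib.request
from typing import Callable


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def health_ok(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/actuator/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/actuator/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    interval: float = 3.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() every `interval` seconds until it succeeds or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A fixture could call `wait_until_healthy(lambda: health_ok("http://localhost:18080"))` after launching Maven. The injectable `sleep` and `clock` parameters exist only so the loop can be tested without real waiting.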

**Why `spring.security.enabled=false`**: HDM's integration tests use `MockMvc`, which bypasses Spring Security. Real HTTP calls from Python do pass through the security filter chain, so security must be disabled or mocked. In the default/H2 profile without IAS/XSUAA bindings, this is safe and consistent with the existing Java IT approach.

---

## What the tests verify

### `test_e2e_document_flow.py`
1. `CreateDocumentWithRelation` returns a valid `DocumentRelation` with embedded `Document`
2. `get_all()` with `$filter` returns the created relation
3. `get()` by primary key returns correct fields
4. Newly created document has `DocumentState = PENDING` (or `CLEAN` in fast-scan environments)
5. `get_download_url()` raises `ScanNotCleanError` when state is `PENDING`
6. `PATCH /Document(...)` updates name correctly
7. Draft flow: `create_draft → validate_draft → activate_draft` produces active entities
8. Draft discard: `create_draft → discard_draft` leaves no active entities
9. `delete()` + subsequent `get()` raises `DocumentNotFoundError`

### `test_e2e_async_flow.py`
- All of the above but via `AsyncAdmsClient` (httpx-based)
- Plus: 3 concurrent `create()` calls via `asyncio.gather` — verifies connection pooling and async correctness
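
The concurrent-create check can be illustrated with a small, self-contained sketch. It does not use `AsyncAdmsClient`: `fake_create` is a hypothetical stand-in for an async `create()` call, so only the `asyncio.gather` fan-out pattern is shown here.

```python
import asyncio
import uuid
from typing import Dict, List


async def fake_create(name: str) -> Dict[str, str]:
    """Hypothetical stand-in for an async create() call against the server."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"id": str(uuid.uuid4()), "name": name}


async def create_concurrently(names: List[str]) -> List[Dict[str, str]]:
    """Start all creates at once; gather returns results in input order."""
    return await asyncio.gather(*(fake_create(n) for n in names))


results = asyncio.run(create_concurrently(["a.pdf", "b.pdf", "c.pdf"]))
```

A test built on this pattern would typically assert that every call returned a distinct identifier and that no call raised, which is what gives some confidence in connection pooling under concurrency.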

### `test_e2e_spii_flow.py`
- `SpiiHandler` is exercised directly (no HTTP server needed)
- Full CONFIG_PENDING → READY → UNASSIGN tenant lifecycle
- Certificate verification gate blocks wrong CN
- Validation rejects malformed notification payloads
102 changes: 102 additions & 0 deletions docs/adms/patterns/delete_user_data_pattern.yaml
@@ -0,0 +1,102 @@
name: delete_user_data_pattern
version: "1.0"
description: >
  Start a DELETE_USER_DATA job in SAP ADM for GDPR erasure compliance.
  Replaces all audit-field references (created_by, changed_by) to the specified
  user across all Document and DocumentRelation records.
  Routes to AdminService, which requires system-user (client_credentials) auth,
  NOT user-OBO auth. This pattern must never be triggered by end-user interaction
  without a confirmed deletion request workflow.

intent_keywords:
  - delete user data
  - gdpr erasure
  - right to be forgotten
  - anonymize user
  - erase personal data
  - delete user from documents
  - gdpr request

required_apis:
  - step: 1
    id: confirm_deletion
    api: "workflow_gate"
    description: >
      MANDATORY human-in-the-loop confirmation gate before starting erasure.
      Never auto-trigger DELETE_USER_DATA without explicit user confirmation.
      Log the confirmation event with timestamp and approver identity.
    security: CRITICAL
    depends_on: []

  - step: 2
    id: start_delete_job
    api: "client.jobs.start_delete_user_data"
    description: >
      Submit a DELETE_USER_DATA job to **AdminService** (not DocumentService).
      Must use service-to-service credentials (client_credentials grant;
      do NOT use user_jwt for this call).
    input_schema:
      type: DeleteUserDataJobParameters
      required_fields:
        - user_id
    output: "JobOutput (job_id, job_status=RUNNING)"
    service_path: odata/v4/AdminService
    auth_note: >
      Use create_client("default"), NOT create_client("default", user_jwt=...).
      AdminService enforces system-level authorization.
    depends_on: [confirm_deletion]

  - step: 3
    id: poll_job_status
    api: "client.jobs.get_status"
    description: >
      Poll using the job_id from step 2 until job_status.is_terminal() is True.
    poll_interval_seconds: 15
    max_polls: 20
    terminal_check: "output.job_status.is_terminal()"
    terminal_states:
      - COMPLETED
      - FAILED
      - CANCELLED
    depends_on: [start_delete_job]

  - step: 4
    id: audit_log_completion
    api: "workflow_gate"
    description: >
      Write an audit log entry recording the completion (or failure) of the
      erasure job, including job_id, user_id, timestamp, and final status.
      Required for GDPR Article 17 compliance evidence.
    depends_on: [poll_job_status]

validation_rules:
  - rule: "user_id must be a non-empty string matching the IAS user principal"
    field: user_id
  - rule: "NEVER trigger this pattern without explicit confirmation from an authorized approver"
    security: CRITICAL
  - rule: "NEVER use user_jwt; AdminService requires system auth (client_credentials)"
    security: true
  - rule: "job completion MUST be audit-logged for GDPR Art. 17 compliance"
  - rule: "This pattern must only be available to GDPR officers / admins in AMS policy"

error_handling:
  - error: "job_status == FAILED"
    action: >
      Log failure with all details. Escalate to platform admin.
      Do not fail silently; GDPR erasure failures are compliance incidents.
  - error: HttpError on start
    action: Do not retry automatically; log and escalate.
  - error: "max_polls exceeded"
    action: >
      Log that the job is still running. Return job_id for manual follow-up.
      Do not assume the erasure has completed.

use_cases:
  - "GDPR Right to Erasure request workflow for a departing employee"
  - "Data subject erasure request: delete user from all ADM document audit fields"
  - "LangGraph workflow: GDPR deletion pipeline triggered by HR offboarding event"

compliance:
  regulation: GDPR Article 17 (Right to Erasure)
  evidence_required: true
  requires_human_approval: true
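
The polling step of this pattern (15-second interval, at most 20 polls, three terminal states) could be driven by a loop like the one below. This is a minimal sketch under assumptions: `poll_job` is a hypothetical helper, and `get_status` is a caller-supplied callable that returns the status as a plain string. The real `client.jobs.get_status` return type is not shown in this diff.

```python
import time
from typing import Callable, Optional

# Terminal states as listed in the pattern's poll_job_status step.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}


def poll_job(
    get_status: Callable[[str], str],
    job_id: str,
    poll_interval_seconds: float = 15.0,
    max_polls: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the terminal status, or None if max_polls is exhausted.

    None means "still running": the caller must not assume the erasure
    completed and should hand the job_id over for manual follow-up.
    """
    for attempt in range(max_polls):
        status = get_status(job_id)
        if status in TERMINAL_STATES:
            return status
        if attempt < max_polls - 1:
            sleep(poll_interval_seconds)
    return None
```

Returning `None` instead of raising keeps the "max_polls exceeded" case distinct from `FAILED`, which matches the pattern's separate error-handling entries for the two situations.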
79 changes: 79 additions & 0 deletions docs/adms/patterns/document_download_pattern.yaml
@@ -0,0 +1,79 @@
name: document_download_pattern
version: "1.0"
description: >
  Download a document from SAP ADM via a secure time-limited presigned URL.
  Enforces the virus scan gate: downloads are only permitted for CLEAN documents.
  The presigned URL must NOT be cached and must be consumed immediately.

intent_keywords:
  - download document
  - get file
  - retrieve attachment
  - export document
  - fetch document content
  - open document

required_apis:
  - step: 1
    id: check_scan_state
    api: "client.documents.get"
    description: >
      Fetch document metadata to inspect DocumentState before attempting download.
      Abort if state is not CLEAN (PENDING / INFECTED / SCAN_FAILED are blocked).
    input_schema:
      required_fields:
        - document_id
      optional_fields:
        - is_active_entity
    output: Document (contains DocumentState)
    depends_on: []

  - step: 2
    id: get_download_url
    api: "client.documents.get_download_url"
    description: >
      Obtain a time-limited presigned download URL. This method enforces the
      ScanStatus.CLEAN gate internally and raises ScanNotCleanError if not ready.
    input_schema:
      required_fields:
        - document_relation_id
        - doc_content_version_id
      optional_fields:
        - is_active_entity
    output: "str, a presigned URL (valid for a short time, do not cache)"
    depends_on: [check_scan_state]

  - step: 3
    id: stream_to_caller
    api: "external_http_get"
    description: >
      Stream the file bytes from the presigned URL to the caller.
      Use streaming GET to avoid buffering large files in memory.
      The SDK does not buffer the download; use requests.get(url, stream=True)
      or httpx.AsyncClient.stream().
    depends_on: [get_download_url]

validation_rules:
  - rule: "DocumentState MUST equal CLEAN before presenting a download URL to the user"
    security: true
  - rule: "Presigned URL must NOT be stored in logs, databases, or chat history"
    security: true
  - rule: "doc_content_version_id must be a non-empty string"
    field: doc_content_version_id
  - rule: "Do not retry get_download_url; each call generates a new presigned URL, so use only the most recent one"

error_handling:
  - error: ScanNotCleanError
    action: >
      Inform user that the document is not yet available for download.
      It may still be under virus scan (PENDING) or blocked (INFECTED / SCAN_FAILED).
  - error: DocumentNotFoundError
    action: Surface to user; the document was deleted or the ID is wrong.
  - error: HttpError
    action: Log and retry once; surface persistent failures to user.

use_cases:
  - "User requests to open an invoice attached to a Purchase Order"
  - "Batch export of all documents linked to a Contract"
  - "LangGraph node: retrieve document content for further AI processing"
  - "Streaming large CAD drawings from ADM to the browser"
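
The scan gate in steps 1 and 2 of this pattern can be sketched as a guard that never requests a URL for a document that is not `CLEAN`. This is an illustration only: `download_url_if_clean` is a hypothetical helper, the two callables stand in for `client.documents.get` and `client.documents.get_download_url`, and the local `ScanNotCleanError` class is a placeholder for the SDK exception of the same name.

```python
from typing import Callable


class ScanNotCleanError(Exception):
    """Local placeholder for the SDK error of the same name (assumption)."""


def download_url_if_clean(
    get_state: Callable[[], str],
    get_url: Callable[[], str],
) -> str:
    """Return a presigned URL only when the document state is CLEAN.

    get_url is not called at all for PENDING / INFECTED / SCAN_FAILED,
    so no URL is ever generated for a blocked document.
    """
    state = get_state()
    if state != "CLEAN":
        raise ScanNotCleanError(f"document state is {state}; download blocked")
    # The URL is returned to the caller and deliberately not logged or stored.
    return get_url()
```

Checking the state before requesting the URL duplicates the gate that `get_download_url` is described as enforcing internally. The point of the explicit check is to give the caller a chance to show a "still scanning" message without first generating a URL.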