Merged
27 commits
63d7597
PDX-485: chore(mcp) — introduce shared warningCodes.ts enum for cross…
mrdailey99 May 19, 2026
d033519
PDX-493: feat(mcp) — emit valueClass=date|datetime|boolean|integer vi…
mrdailey99 May 19, 2026
2b67b0a
PDX-486: feat(mcp) — strict validator unknown-key detection (SCHEMA-0…
mrdailey99 May 19, 2026
e78683a
PDX-490: feat(mcp) — add error_category and retryable fields to test-…
mrdailey99 May 19, 2026
67f3653
PDX-486: feat(mcp) — testrun zero-tests guard (RUN-001 warning)
mrdailey99 May 19, 2026
583a1d9
PDX-489: feat(mcp) — warn on dataTable used in direct testCase-mode (…
mrdailey99 May 19, 2026
46f691e
PDX-485: test(mcp) — add exact key/value set assertions to WARNING_CO…
mrdailey99 May 19, 2026
9a6b784
PDX-492: feat(mcp) — add provar_org_describe tool (reads workspace .m…
mrdailey99 May 19, 2026
a8c05fc
PDX-486: docs(mcp) — clarify that WARNING [code] shape applies only t…
mrdailey99 May 20, 2026
669b517
PDX-490: docs(mcp) — clarify error_category scope (rca mode only) + f…
mrdailey99 May 20, 2026
8def6af
PDX-493: fix(mcp) — align numeric valueClass with canonical reference…
mrdailey99 May 20, 2026
e46b7a9
PDX-489: fix(mcp) — enforce assertPathAllowed on propertiesFilePathOv…
mrdailey99 May 20, 2026
3ca9379
PDX-486: fix(mcp) — gate RUN-001 on parsedAny to avoid false positive…
mrdailey99 May 20, 2026
f04eeff
PDX-492: fix(mcp) — orgDescribe path-policy hardening, XML required-f…
mrdailey99 May 20, 2026
dbe4f37
PDX-489: fix(test) — realpath tmp root in plan-mode test to match sym…
mrdailey99 May 20, 2026
4700560
Merge pull request #182 from ProvarTesting/chore/PDX-485-warning-code…
mrdailey99 May 20, 2026
dd4335f
Merge pull request #184 from ProvarTesting/fix/PDX-486-validate-typo-…
mrdailey99 May 20, 2026
c08bc72
Merge pull request #186 from ProvarTesting/feature/PDX-490-error-cate…
mrdailey99 May 20, 2026
00423b9
Merge pull request #187 from ProvarTesting/feature/PDX-489-datatable-…
mrdailey99 May 20, 2026
d3436f6
Merge branch 'develop' into fix/PDX-486-validate-typo-b-zero-tests-guard
mrdailey99 May 20, 2026
3e4ef29
Merge pull request #185 from ProvarTesting/fix/PDX-486-validate-typo-…
mrdailey99 May 20, 2026
7e44a2c
Merge pull request #183 from ProvarTesting/feature/PDX-493-date-datet…
mrdailey99 May 20, 2026
cf8e35e
Merge branch 'develop' into feature/PDX-492-org-describe-tool
mrdailey99 May 20, 2026
39c3b92
Merge pull request #188 from ProvarTesting/feature/PDX-492-org-descri…
mrdailey99 May 20, 2026
ad85347
PDX-0: chore(release) — bump version to 1.5.2-beta.1 on develop
mrdailey99 May 20, 2026
38d5504
Merge pull request #189 from ProvarTesting/chore/version-bump-1.5.2-b…
mrdailey99 May 20, 2026
94e1c8a
PDX-0: chore(release) — bump version to 1.5.2 for release
mrdailey99 May 20, 2026
31 changes: 31 additions & 0 deletions docs/mcp-pilot-guide.md
@@ -483,6 +483,37 @@ If any FAIL indicator appears, report it to the Provar team with the prompt and

---

### Scenario 13: Org-Aware Test Case Generation

The `provar_org_describe` tool surfaces cached Salesforce describe data from the Provar IDE workspace so the agent knows which fields on each object are required and what their types are — without making a live Salesforce API call. Use it as a hint source before generating data-heavy steps.

**Prerequisite:** the project must have been opened in Provar IDE at least once with the named connection loaded, so the `.metadata/<connection_name>/` directory is populated.

**Try this prompt:**

> "Before generating a test case that creates an Account, call `provar_org_describe` against my project at `/path/to/MyProject` for connection `MyOrg` and the `Account` object only. Use the required-field list to populate the create form."

The tool returns the discovered workspace path, a cache age, and per-object required-field metadata. Example call:

```jsonc
{
  "project_path": "/Users/you/git/MyProject",
  "connection_name": "MyOrg",
  "objects": ["Account"],
  "field_filter": "required"
}
```

**What to look for (PASS):**

- Response includes `workspace_path` resolved to a real `workspace-*` directory.
- `objects[0].required_fields` contains at least one field with `nillable: false`.
- The follow-up `provar_testcase_generate` call uses field names from the response.

**Cache-miss behaviour (also PASS):** if the cache directory does not exist, the tool returns a `details.suggestion` that tells the agent how to recover: either open the project in Provar IDE to populate the cache, or pass field-type hints inline.
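
A pilot script could check these indicators mechanically. The sketch below is illustrative only: `workspace_path`, `required_fields`, `nillable`, and `details.suggestion` are the names used in this scenario, while the `OrgDescribeResponse` type, the `name` properties, the `Verdict` labels, and the `checkOrgDescribe` helper are assumptions and not the tool's published schema.

```typescript
// Assumed response shape. Only workspace_path, required_fields, nillable and
// details.suggestion are named in this guide; the rest is hypothetical.
interface RequiredField {
  name: string;
  nillable: boolean;
}

interface OrgDescribeResponse {
  workspace_path?: string;
  objects?: Array<{ name: string; required_fields: RequiredField[] }>;
  details?: { suggestion?: string };
}

type Verdict = 'PASS' | 'PASS_CACHE_MISS' | 'FAIL';

// Hypothetical helper: maps a parsed response onto the indicators above.
function checkOrgDescribe(res: OrgDescribeResponse): Verdict {
  const first = res.objects?.[0];
  if (first === undefined) {
    // Cache miss is acceptable only when a recovery suggestion is present.
    return res.details?.suggestion ? 'PASS_CACHE_MISS' : 'FAIL';
  }
  // The workspace path should point at a workspace-* directory.
  const resolved = /workspace-/.test(res.workspace_path ?? '');
  // At least one required field must be reported as non-nillable.
  const hasRequired = first.required_fields.some((f) => f.nillable === false);
  return resolved && hasRequired ? 'PASS' : 'FAIL';
}
```

The third indicator (field names reused by the follow-up `provar_testcase_generate` call) depends on the agent transcript and is left to manual review in this sketch.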

---

## Security Model

### What the server does
283 changes: 249 additions & 34 deletions docs/mcp.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion package.json
@@ -1,7 +1,7 @@
{
"name": "@provartesting/provardx-cli",
"description": "A plugin for the Salesforce CLI to orchestrate testing activities and report quality metrics to Provar Quality Hub",
- "version": "1.5.1",
+ "version": "1.5.2",
"mcpName": "io.github.ProvarTesting/provar",
"license": "BSD-3-Clause",
"plugins": [
30 changes: 30 additions & 0 deletions scripts/mcp-smoke.cjs
@@ -461,6 +461,36 @@ async function runTests() {
test_item_id: '1',
});

// ── 54. provar_org_describe — cache miss ─────────────────────────────────
// TMP has no workspace at all → cache-miss response with details.suggestion
if (inGroup('inspect'))
await callTool('provar_org_describe', {
project_path: TMP,
connection_name: 'SmokeOrg',
objects: ['Account'],
});

// ── 55. provar_org_describe — happy path ─────────────────────────────────
// Set up a sibling workspace + .metadata/<connection> with one fake object.
if (inGroup('inspect')) {
const fs = require('fs');
const orgProject = path.join(TMP, 'org-describe-smoke-project');
fs.mkdirSync(orgProject, { recursive: true });
const cxnDir = path.join(TMP, 'workspace-org-describe-smoke-project', '.metadata', 'SmokeOrg');
fs.mkdirSync(cxnDir, { recursive: true });
fs.writeFileSync(
path.join(cxnDir, 'Account.json'),
JSON.stringify({
name: 'Account',
fields: [{ name: 'Name', type: 'string', defaultValue: null, nillable: false }],
})
);
await callTool('provar_org_describe', {
project_path: orgProject,
connection_name: 'SmokeOrg',
});
}

server.stdin.end();
}
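
The happy-path fixture above caches a single `Account` describe with one non-nillable field. As a rough sketch of how such a cached describe could be reduced to a required-field list: `orgDescribeTools.ts` is not shown in this diff, so the `requiredFields` helper, the type names, the extra `Description` field, and the rule "required means `nillable: false`" are all assumptions made for illustration.

```typescript
// Field shape taken from the smoke fixture; type names are hypothetical.
interface CachedField {
  name: string;
  type: string;
  defaultValue: unknown;
  nillable: boolean;
}

interface CachedDescribe {
  name: string;
  fields: CachedField[];
}

// Assumption: a field counts as required when it is not nillable.
function requiredFields(describe: CachedDescribe): CachedField[] {
  return describe.fields.filter((f) => f.nillable === false);
}

// Same payload as the smoke fixture, plus one hypothetical nillable field
// so the filter has something to exclude.
const fixtureJson = JSON.stringify({
  name: 'Account',
  fields: [
    { name: 'Name', type: 'string', defaultValue: null, nillable: false },
    { name: 'Description', type: 'textarea', defaultValue: null, nillable: true },
  ],
});

const parsedFixture = JSON.parse(fixtureJson) as CachedDescribe;
const requiredNames = requiredFields(parsedFixture).map((f) => f.name);
```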

4 changes: 2 additions & 2 deletions server.json
@@ -14,12 +14,12 @@
"url": "https://github.com/ProvarTesting/provardx-cli",
"source": "github"
},
- "version": "1.5.1",
+ "version": "1.5.2",
"packages": [
{
"registryType": "npm",
"identifier": "@provartesting/provardx-cli",
- "version": "1.5.1",
+ "version": "1.5.2",
"transport": {
"type": "stdio"
},
3 changes: 2 additions & 1 deletion src/mcp/server.ts
@@ -34,6 +34,7 @@ import { registerAllTestPlanTools } from './tools/testPlanTools.js';
import { registerAllNitroXTools } from './tools/nitroXTools.js';
import { registerAllTestCaseStepTools } from './tools/testCaseStepTools.js';
import { registerAllConnectionTools } from './tools/connectionTools.js';
+ import { registerAllOrgDescribeTools } from './tools/orgDescribeTools.js';
import { registerAllPrompts } from './prompts/index.js';
import {
createDepthGuardState,
@@ -64,7 +65,7 @@ const TOOL_GROUPS: Record<string, Array<(server: McpServer, config: ServerConfig
registerAllTestCaseStepTools,
registerAllTestPlanTools,
],
- inspect: [registerProjectInspect],
+ inspect: [registerProjectInspect, registerAllOrgDescribeTools],
connection: [registerAllConnectionTools],
rca: [registerAllRcaTools],
};
59 changes: 56 additions & 3 deletions src/mcp/tools/antTools.ts
@@ -979,16 +979,57 @@ function finalizeAnt(

// ── JUnit XML step parsing ────────────────────────────────────────────────────

export type JUnitErrorCategory = 'INFRASTRUCTURE' | 'ASSERTION' | 'LOCATOR' | 'TIMEOUT' | 'OTHER';

export interface JUnitStepResult {
testItemId: string;
title: string;
status: 'pass' | 'fail' | 'skip';
errorMessage?: string;
error_category?: JUnitErrorCategory;
retryable?: boolean;
}

/**
* Classify a failure message into a coarse-grained category used for retry decisions.
* Mirrors the classifier in rcaTools.ts (PDX-490) so a downstream consumer sees the
* same labelling whether they consume `provar_automation_testrun.steps[]` or
* `provar_testrun_rca.failures[]`.
*
* Returns `undefined` when no pattern matches.
*/
export function classifyStepErrorCategory(errorText: string): JUnitErrorCategory | undefined {
if (/Connection reset|Failed to read client socket message|socket hang up|ECONNRESET/i.test(errorText)) {
return 'INFRASTRUCTURE';
}
if (/NoSuchElementException/i.test(errorText)) return 'LOCATOR';
if (/TimeoutException/i.test(errorText)) return 'TIMEOUT';
if (/AssertionException/i.test(errorText)) return 'ASSERTION';
if (
/SessionNotCreatedException|WebDriverException|ClassNotFoundException|LicenseException|InvalidPasswordException/i.test(
errorText
)
) {
return 'OTHER';
}
return undefined;
}

/** Only transient categories (INFRASTRUCTURE, TIMEOUT) are retryable. */
export function isStepRetryable(category: JUnitErrorCategory | undefined): boolean | undefined {
if (category === undefined) return undefined;
return category === 'INFRASTRUCTURE' || category === 'TIMEOUT';
}

export interface JUnitParseResult {
steps: JUnitStepResult[];
warning?: string;
/**
* True iff at least one JUnit XML file was located AND parsed without throwing.
* Distinguishes "we have data and the test selector matched zero cases" (legit RUN-001 signal)
* from "we have no data because nothing parsed" (insufficient info — must stay silent).
*/
parsedAny: boolean;
}

function extractFailureText(el: unknown): string | undefined {
@@ -1043,7 +1084,13 @@ function extractStepsFromJUnit(parsed: Record<string, unknown>): JUnitStepResult

const errorMessage = extractFailureText(tc['failure'] ?? tc['error']);
const step: JUnitStepResult = { testItemId: String(idx), title, status };
- if (errorMessage) step.errorMessage = errorMessage;
+ if (errorMessage) {
+ step.errorMessage = errorMessage;
+ const error_category = classifyStepErrorCategory(errorMessage);
+ const retryable = isStepRetryable(error_category);
+ if (error_category !== undefined) step.error_category = error_category;
+ if (retryable !== undefined) step.retryable = retryable;
+ }
steps.push(step);
}
}
@@ -1071,14 +1118,15 @@ function findXmlFiles(dir: string): string[] {
*/
export function parseJUnitResults(resultsDir: string): JUnitParseResult {
if (!fs.existsSync(resultsDir)) {
- return { steps: [], warning: `Results directory not found: ${resultsDir}` };
+ return { steps: [], warning: `Results directory not found: ${resultsDir}`, parsedAny: false };
}

const xmlFiles = findXmlFiles(resultsDir);
if (xmlFiles.length === 0) {
return {
steps: [],
warning: 'No JUnit XML files found in results directory — structured step output unavailable.',
parsedAny: false,
};
}

@@ -1111,20 +1159,25 @@ export function parseJUnitResults(resultsDir: string): JUnitParseResult {
return {
steps: [],
warning: 'JUnit XML files found but could not be parsed — structured step output unavailable.',
parsedAny: false,
};
}
if (allSteps.length === 0) {
// We did parse at least one file; the file just had zero <testcase> entries (or none we could
// recognise as steps). This is the legitimate "selector matched nothing" signal that RUN-001
// is built to catch.
return {
steps: [],
warning: 'JUnit XML found but no test steps could be extracted — files may not be standard JUnit format.',
parsedAny: true,
};
}

const warning =
parseFailures > 0
? `${parseFailures} JUnit XML file(s) could not be parsed — step data may be incomplete.`
: undefined;
- return { steps: allSteps, warning };
+ return { steps: allSteps, warning, parsedAny: true };
}

// ── Registration ──────────────────────────────────────────────────────────────
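
The `error_category` and `retryable` fields added in this file are meant to drive retry decisions on the consumer side. A minimal sketch of such a consumer follows, assuming a policy of "rerun only when every failure is transient". The two classifier functions restate the patterns from the hunk above so the block is self-contained; the `FailedStep` type and the `shouldRerun` helper are hypothetical and not part of this PR.

```typescript
type JUnitErrorCategory = 'INFRASTRUCTURE' | 'ASSERTION' | 'LOCATOR' | 'TIMEOUT' | 'OTHER';

// Hypothetical minimal view of a failed step.
interface FailedStep {
  testItemId: string;
  errorMessage: string;
}

// Same patterns as classifyStepErrorCategory in antTools.ts, restated here.
function classifyStepErrorCategory(errorText: string): JUnitErrorCategory | undefined {
  if (/Connection reset|Failed to read client socket message|socket hang up|ECONNRESET/i.test(errorText)) {
    return 'INFRASTRUCTURE';
  }
  if (/NoSuchElementException/i.test(errorText)) return 'LOCATOR';
  if (/TimeoutException/i.test(errorText)) return 'TIMEOUT';
  if (/AssertionException/i.test(errorText)) return 'ASSERTION';
  if (
    /SessionNotCreatedException|WebDriverException|ClassNotFoundException|LicenseException|InvalidPasswordException/i.test(
      errorText
    )
  ) {
    return 'OTHER';
  }
  return undefined;
}

// Only transient categories (INFRASTRUCTURE, TIMEOUT) are retryable.
function isStepRetryable(category: JUnitErrorCategory | undefined): boolean | undefined {
  if (category === undefined) return undefined;
  return category === 'INFRASTRUCTURE' || category === 'TIMEOUT';
}

// Hypothetical consumer policy: rerun only when there is at least one failure
// and every failure is classified as transient. Unclassified failures
// (retryable === undefined) block the rerun.
function shouldRerun(failures: FailedStep[]): boolean {
  if (failures.length === 0) return false;
  return failures.every((f) => isStepRetryable(classifyStepErrorCategory(f.errorMessage)) === true);
}
```

Treating an unclassified failure as non-retryable is a conservative choice for this sketch; a consumer could equally decide to retry unknowns once.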
76 changes: 65 additions & 11 deletions src/mcp/tools/automationTools.ts
@@ -15,6 +15,7 @@ import { makeError, makeRequestId } from '../schemas/common.js';
import { log } from '../logging/logger.js';
import type { ServerConfig } from '../server.js';
import { assertPathAllowed, PathPolicyError } from '../security/pathPolicy.js';
import { WARNING_CODES, formatWarning } from '../utils/warningCodes.js';
import { parseJUnitResults } from './antTools.js';
import { runSfCommand } from './sfSpawn.js';
import { desc } from './descHelper.js';
@@ -226,6 +227,38 @@ function readResultsPathFromSfConfig(config: ServerConfig): string | null {

// ── Tool: provar_automation_testrun ───────────────────────────────────────────

/**
* JUnit introspection for the testrun response. Returns enough structure that
* downstream warning emitters (RUN-001 zero-tests, future JUNIT-001 expected-vs-
* actual mismatch) can read a single object instead of re-parsing.
*/
type JUnitIntrospection = {
steps: ReturnType<typeof parseJUnitResults>['steps'];
stepCount: number;
parseWarning: string | undefined;
resultsPathResolved: boolean;
/**
* True iff at least one JUnit XML file was located AND parsed without throwing.
* Gates RUN-001: a `stepCount === 0` only means "zero tests executed" when we know we
* actually have parseable data. With `parsedAny === false` the count is "we don't know",
* which must stay silent (details.warning already covers it).
*/
parsedAny: boolean;
};

function introspectJUnit(config: ServerConfig): JUnitIntrospection {
const resultsPath = readResultsPathFromSfConfig(config);
if (!resultsPath) {
return { steps: [], stepCount: 0, parseWarning: undefined, resultsPathResolved: false, parsedAny: false };
}
const { steps, warning, parsedAny } = parseJUnitResults(resultsPath);
return { steps, stepCount: steps.length, parseWarning: warning, resultsPathResolved: true, parsedAny };
}

const ZERO_TESTS_MESSAGE =
'Test run exited successfully but zero tests were executed. ' +
'Check the testCase / testCases (note spelling) field in provardx-properties.json.';

export function registerAutomationTestRun(server: McpServer, config: ServerConfig): void {
server.registerTool(
'provar_automation_testrun',
@@ -240,9 +273,12 @@ export function registerAutomationTestRun(server: McpServer, config: ServerConfi
'For grid/CI execution via Provar Quality Hub instead of running locally, use provar_qualityhub_testrun.',
'Output buffer: a 50 MB maxBuffer is set so ENOBUFS on verbose Provar runs is now rare.',
'If ENOBUFS still occurs (extremely verbose logging), run `sf provar automation test run --json` directly in the terminal and pipe or tail the output instead of retrying this tool.',
'Zero-tests guard: if the sf exit code is 0, the results directory was located, and at least one JUnit XML file parsed successfully but contains zero executed tests, a RUN-001 warning is added to `warnings[]` — usually a typo such as `testCase` vs `testCases` in provardx-properties.json. When no JUnit data is available (dir missing or all XML unparseable), `details.warning` is set instead and RUN-001 stays silent.',
'Typical local AI loop: config.load → compile → testrun → inspect results.',
'Each failed step in `steps[]` may include optional error_category (INFRASTRUCTURE|ASSERTION|LOCATOR|TIMEOUT|OTHER)',
'and retryable (boolean) fields when the failure text matches a known pattern — use these to drive automated retry policy.',
].join(' '),
- 'Run local Provar tests via sf CLI; requires config_load first.'
+ 'Run local Provar tests via sf CLI; requires config_load first. Surfaces RUN-001 on zero-tests-executed.'
),
inputSchema: {
flags: z
@@ -274,11 +310,10 @@ export function registerAutomationTestRun(server: McpServer, config: ServerConfi
const result = runSfCommand(['provar', 'automation', 'test', 'run', ...flags], sf_path);
const { filtered, suppressed } = filterTestRunOutput(result.stdout);

- // Attempt to enrich the response with structured step data from JUnit XML
- const resultsPath = readResultsPathFromSfConfig(config);
- const { steps, warning: junitWarning } = resultsPath
- ? parseJUnitResults(resultsPath)
- : { steps: [], warning: undefined };
+ // Enrich the response with structured step data + warning hooks from JUnit XML.
+ // Single introspection call keeps the wiring extensible (e.g. future JUNIT-001
+ // expected-vs-actual mismatch can read stepCount from the same struct).
+ const junit = introspectJUnit(config);

if (result.exitCode !== 0) {
const { filtered: filteredErr, suppressed: suppressedErr } = filterTestRunOutput(
@@ -288,11 +323,11 @@ export function registerAutomationTestRun(server: McpServer, config: ServerConfi
...makeError('AUTOMATION_TESTRUN_FAILED', filteredErr, requestId),
...(suppressedErr > 0 ? { output_lines_suppressed: suppressedErr } : {}),
};
- if (steps.length > 0) errBody['steps'] = steps;
- if (!resultsPath || junitWarning) {
+ if (junit.steps.length > 0) errBody['steps'] = junit.steps;
+ if (!junit.resultsPathResolved || junit.parseWarning) {
errBody['details'] = {
warning:
- junitWarning ??
+ junit.parseWarning ??
'Could not locate results directory — step-level output unavailable. Run provar_automation_config_load first.',
};
}
@@ -306,8 +341,27 @@ export function registerAutomationTestRun(server: McpServer, config: ServerConfi
stderr: result.stderr,
};
if (suppressed > 0) response['output_lines_suppressed'] = suppressed;
- if (steps.length > 0) response['steps'] = steps;
- if (junitWarning) response['details'] = { warning: junitWarning };
+ if (junit.steps.length > 0) response['steps'] = junit.steps;
+ if (junit.parseWarning) response['details'] = { warning: junit.parseWarning };

// RUN-001: sf reported success but zero tests actually executed.
// Almost always a typo in the testCase / testCases field of provardx-properties.json.
// Only fires when:
// 1. The results dir was located (resultsPathResolved), AND
// 2. At least one JUnit XML file was successfully parsed (parsedAny).
// Without (2) `stepCount === 0` just means "we don't have parseable data" — not
// "zero tests ran" — and the agent would be misdirected toward a typo when the
// real issue is a missing/unparseable results dir. That case is already surfaced
// via `details.warning` from the parse layer. With parsedAny === true and zero
// extracted steps, we know the selector genuinely matched nothing.
if (junit.resultsPathResolved && junit.parsedAny && junit.stepCount === 0) {
const warningStr = formatWarning(WARNING_CODES.RUN_001, ZERO_TESTS_MESSAGE);
// Append rather than overwrite so future warning emitters (e.g. JUNIT-001 mismatch
// in PDX-491) can coexist on the same response without stepping on each other.
const existing = response['warnings'] as string[] | undefined;
response['warnings'] = existing ? existing.concat(warningStr) : [warningStr];
}

return { content: [{ type: 'text' as const, text: JSON.stringify(response) }], structuredContent: response };
} catch (err) {
return handleSpawnError(err, requestId, 'provar_automation_testrun');