Debug Salesforce deployment failures in seconds using AI-powered root cause analysis with Claude. When a deployment fails, engineers waste time switching between logs, coverage reports, and static analysis tools to manually piece together what went wrong — this tool does that correlation automatically and tells you exactly what to fix first.
Salesforce's CLI tools produce structured JSON output that no open source tool currently cross-correlates into a ranked diagnosis. This project fills that gap — there is no equivalent in the Salesforce ecosystem today.
When a Salesforce deployment fails, the information you need is split across three tools that have no awareness of each other:
- The deployment log tells you `OpportunityService` threw a `NullPointerException` — but not why
- The coverage report shows 62% — but has no connection to the exception
- PMD flags 120 violations — but doesn't tell you which ones caused the failure
An engineer must open all three, read each output, reason about the connections under pressure, and decide what to fix first. That process is slow, error-prone, and entirely manual.
This tool accepts your deployment metrics as a single JSON payload, sends them to Claude with a strict schema contract, and returns a structured diagnosis:
- Risks — each issue ranked by severity with the specific component named
- Root causes — the technical reason per component, not a restatement of the error
- Recommendations — ordered P0 → P3, with exact fixes ready to act on
Unlike traditional DevOps tools that surface logs and metrics and stop there, this system reasons across all three signals simultaneously with an LLM and returns a ranked, actionable fix list.

| | Traditional tools | This project |
|---|---|---|
| Output | Raw logs, metrics, violation counts | Root cause, ranked recommendations |
| Signal handling | One tool per signal, no cross-referencing | All three signals correlated in one pass |
| Next step | Engineer manually interprets and prioritises | Fix order is explicit — P0 first |
Salesforce CLI, PMD, and coverage tools each report one signal in isolation — they show what happened, not why, and have no awareness of each other.
Claude reads all three signals together: when a NullPointerException in OpportunityService coincides with low coverage and critical PMD violations, it identifies a single @TestSetup gap as the common cause — not three separate problems requiring three separate fixes.
The result is a specific root cause per component and a P0-ranked fix list — without opening a single log file.
No API key needed — run a preset scenario instantly:
```shell
python main.py 1   # Failure — risk score 7 🔴
python main.py 2   # Medium — risk score 3 🟡
python main.py 3   # Healthy — risk score 0 🟢
```

Live mode — send your own data to Claude:
```shell
pip install -r requirements-live.txt

# Windows
set ANTHROPIC_API_KEY=your_key_here
# Mac / Linux
export ANTHROPIC_API_KEY=your_key_here

python main.py --input mydata.json --live
```

End-to-end with a real Salesforce org:
```shell
sf project deploy start --json > deploy_result.json
sf scanner run --json > pmd_result.json
python parse_deployment.py --deploy deploy_result.json --pmd pmd_result.json --out input.json
python main.py --input input.json --live
```

Input — three fields from your Salesforce org:

| Field | Type | Description |
|---|---|---|
| `code_coverage` | float | Org-level Apex code coverage percentage |
| `failed_deployments` | list | Each item: `component`, `error`, `failed_tests` count |
| `code_quality_issues` | dict | `pmd_violations` (total) and `critical` (severity 1–2) |
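A payload matching this schema can be sanity-checked before it is sent. The sketch below is illustrative only — the project's real validators live in `tests/test_validation.py` and may differ:

```python
def validate_input(payload: dict) -> dict:
    """Illustrative check of the three-field input contract (not the shipped validator)."""
    assert isinstance(payload["code_coverage"], (int, float)), "code_coverage must be numeric"
    for item in payload["failed_deployments"]:
        # Every failure entry names a component, an error, and a failed-test count.
        assert {"component", "error", "failed_tests"} <= item.keys()
    # Quality metrics carry a total violation count and a critical subset.
    assert {"pmd_violations", "critical"} <= payload["code_quality_issues"].keys()
    return payload

payload = validate_input({
    "code_coverage": 62,
    "failed_deployments": [
        {"component": "OpportunityService", "error": "NullPointerException", "failed_tests": 3}
    ],
    "code_quality_issues": {"pmd_violations": 120, "critical": 15},
})
```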
Output — returned by Claude:

| Field | Description |
|---|---|
| `risk_score` | 0–10 integer — overall deployment health |
| `risk_level` | Low, Medium, or High |
| `risks[]` | Specific issues with severity and component |
| `root_causes[]` | Technical explanation per component |
| `recommendations[]` | Ordered fix actions, P0 → P3 |
Risk is derived from three dimensions that reflect Salesforce deployment reality:

| Dimension | Max Points | Why it matters |
|---|---|---|
| Code coverage | +4 | Below 75% is a hard platform blocker — no override possible |
| Deployment failures | +3 | Runtime errors (+2) and failed tests (+1) stack — pipeline stops immediately |
| Critical PMD violations | +2 | SOQL in loops, hardcoded IDs — governor limit failures under production load |

| Score | Risk Level |
|---|---|
| 0–2 | 🟢 Low — deployment can proceed |
| 3–5 | 🟡 Medium — quality issues, no blockers |
| 6–10 | 🔴 High — at least one active blocker |
Coverage carries the highest weight because it is the only signal that unconditionally blocks deployment with no workaround.
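As a rough illustration, the rubric above can be applied deterministically. This is a sketch only — in the tool itself Claude assigns the score with judgment, so live scores can differ from this arithmetic:

```python
def risk_score(metrics: dict) -> int:
    """Naive application of the scoring rubric (illustrative, not the shipped logic)."""
    score = 0
    # Coverage below the 75% platform threshold is a hard blocker (max +4).
    if metrics["code_coverage"] < 75:
        score += 4
    # Deployment failures: runtime errors (+2) and failed tests (+1) stack (max +3).
    failures = metrics["failed_deployments"]
    if any(f["error"] for f in failures):
        score += 2
    if any(f["failed_tests"] > 0 for f in failures):
        score += 1
    # Critical PMD violations (max +2).
    if metrics["code_quality_issues"]["critical"] > 0:
        score += 2
    return min(score, 10)

def risk_level(score: int) -> str:
    # Thresholds from the score table: 0–2 Low, 3–5 Medium, 6–10 High.
    return "Low" if score <= 2 else "Medium" if score <= 5 else "High"
```

Note that this naive sum gives 9 for preset scenario 1, while the sample output scores it 7 — the rubric guides the model's score rather than determining it.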
A deployment fails mid-sprint. The engineer has three data points — coverage below threshold, a NullPointerException in OpportunityService, and 120 PMD violations — with no obvious connection between them.
```json
{
  "code_coverage": 62,
  "failed_deployments": [
    {
      "component": "OpportunityService",
      "error": "NullPointerException",
      "failed_tests": 3
    }
  ],
  "code_quality_issues": {
    "pmd_violations": 120,
    "critical": 15
  }
}
```

Claude's diagnosis:

```json
{
  "risk_score": 7,
  "risk_level": "High",
  "risks": [
    {
      "issue": "Code coverage at 62% — 13 points below the 75% deployment threshold, hard blocker",
      "severity": "Critical",
      "component": "Overall Org"
    },
    {
      "issue": "NullPointerException in OpportunityService — 3 test methods failing, pipeline cannot proceed",
      "severity": "Critical",
      "component": "OpportunityService"
    },
    {
      "issue": "15 critical PMD violations — SOQL-in-loop and hardcoded record IDs confirmed, governor limit failures will occur under bulk record processing",
      "severity": "High",
      "component": "OpportunityService"
    }
  ],
  "root_causes": [
    {
      "cause": "OpportunityService accesses opp.Account.Name without a null-guard — Account is null because @TestSetup inserts Opportunity records without a parent Account, so every method that traverses the relationship throws at runtime",
      "component": "OpportunityService"
    },
    {
      "cause": "The @TestSetup gap suppresses execution of all methods that depend on related Account data — coverage dropped to 62% as a direct consequence of the test failures, not as a separate problem",
      "component": "Overall Org"
    }
  ],
  "recommendations": [
    {
      "action": "In OpportunityService, replace opp.Account.Name with opp.Account?.Name — the safe navigation operator prevents the NullPointerException when Account is not populated on the record",
      "priority": "P0 - Immediate"
    },
    {
      "action": "In OpportunityServiceTest @TestSetup, insert an Account record first and assign its Id to AccountId on each Opportunity — this unblocks all 3 failing tests and recovers coverage above 75% automatically",
      "priority": "P0 - Immediate"
    },
    {
      "action": "Run 'sf scanner run --category Performance --target force-app/main/default/classes/OpportunityService.cls' — move all SOQL calls above loop bodies into a single typed query: Map<Id, Account> accountMap = new Map<Id, Account>([SELECT Id, Name FROM Account WHERE Id IN :accountIds])",
      "priority": "P1 - High"
    },
    {
      "action": "Add two CI gates on pull requests: (1) 'sf scanner run --severity-threshold 2 --target force-app' to block on critical PMD violations; (2) parse 'sf project deploy start --test-level RunLocalTests --json' output and fail the build if result.numberTestsCompleted / result.numberTestsTotal falls below 0.80",
      "priority": "P2 - Medium"
    }
  ]
}
```

- Without this tool: An engineer opens the deployment log, sees a `NullPointerException`, then separately checks coverage (62% — why?), then runs PMD and gets 120 violations with no clear connection to the runtime error. Correlation is manual, slow, and easy to get wrong under release pressure.
- With this tool: One command surfaces that all three issues share a single root cause — a missing `Account` insert in `@TestSetup`. The fix is two lines of code, not three separate investigations.
- What Claude identified beyond the raw input: The input only states `"error": "NullPointerException"` and `"failed_tests": 3`. Nothing in the payload mentions `@TestSetup`, parent records, or the Account relationship. The causal link between the exception and the coverage drop was inferred — not extracted from the data.
A trigger hitting SOQL limits under load, with two components affected simultaneously.
```json
{
  "code_coverage": 71,
  "failed_deployments": [
    {
      "component": "AccountTrigger",
      "error": "System.LimitException: Too many SOQL queries: 101",
      "failed_tests": 5
    },
    {
      "component": "ContactRollupService",
      "error": "System.NullPointerException",
      "failed_tests": 2
    }
  ],
  "code_quality_issues": {
    "pmd_violations": 34,
    "critical": 8
  }
}
```

Claude's diagnosis:

```json
{
  "risk_score": 8,
  "risk_level": "High",
  "risks": [
    {
      "issue": "Code coverage at 71% — 4 points below the 75% Salesforce deployment threshold",
      "severity": "Critical",
      "component": "Overall Org"
    },
    {
      "issue": "AccountTrigger hitting SOQL governor limit (101 queries) — deployment blocked",
      "severity": "Critical",
      "component": "AccountTrigger"
    },
    {
      "issue": "ContactRollupService NullPointerException — 2 test failures indicate missing test data setup",
      "severity": "High",
      "component": "ContactRollupService"
    },
    {
      "issue": "8 critical PMD violations — SOQL-in-loop pattern likely the direct cause of AccountTrigger governor limit",
      "severity": "High",
      "component": "AccountTrigger"
    }
  ],
  "root_causes": [
    {
      "cause": "AccountTrigger executes a SOQL query inside a loop iterating over trigger records — each record fires a separate query, exceeding the 100-query governor limit when bulk records are processed",
      "component": "AccountTrigger"
    },
    {
      "cause": "ContactRollupService test class does not establish required parent Account records before creating Contact records — rollup logic encounters null parent references at runtime",
      "component": "ContactRollupService"
    },
    {
      "cause": "Coverage at 71% is suppressed by the combined test failures in AccountTrigger and ContactRollupService — fixing both components should recover coverage above threshold",
      "component": "Overall Org"
    }
  ],
  "recommendations": [
    {
      "action": "Refactor AccountTrigger to bulkify SOQL — move the query outside the loop, collect all required IDs first, then query once: Map<Id, Account> accountMap = new Map<Id, Account>([SELECT Id FROM Account WHERE Id IN :idSet])",
      "priority": "P0 - Immediate"
    },
    {
      "action": "Fix ContactRollupService test @TestSetup — insert Account records before Contact records so rollup fields have valid parent references during test execution",
      "priority": "P0 - Immediate"
    },
    {
      "action": "Run 'sf apex run test --class-names AccountTriggerTest,ContactRollupServiceTest --result-format json' and confirm both classes pass before re-deploying — verify combined coverage is above 75% in the result JSON before triggering the pipeline",
      "priority": "P1 - High"
    },
    {
      "action": "Run 'sf scanner run --category Performance --target force-app/main/default/triggers/AccountTrigger.trigger' — resolve all 8 critical PMD violations in AccountTrigger to prevent governor limit recurrence on the next bulk data load",
      "priority": "P1 - High"
    }
  ]
}
```

Claude connected the LimitException to the SOQL-in-loop PMD violations — identifying the critical violations as the cause of the runtime error, not a separate issue. The two failures independently suppress coverage, so both must be fixed before coverage recovers.
Designed for integration with Salesforce DX and CI/CD pipelines — the input schema maps directly to the JSON output of sf project deploy and sf scanner run, so real deployment logs can be piped in automatically rather than collected manually.
Native pipeline integration (GitHub Actions, Jenkins, Copado) is on the roadmap — the current CLI interface is the foundation for that.
```
salesforce-devops-ai-assistant/
├── main.py                 # Analyser — mocked and live Claude modes
├── parse_deployment.py     # Converts sf CLI + PMD JSON to input schema
├── requirements.txt        # Empty — mocked mode needs no packages
├── requirements-live.txt   # anthropic>=0.20.0 — for --live mode
├── tests/
│   └── test_validation.py  # Unit tests for input/output schema validators
├── sample_data/            # Three preset input scenarios
└── sample_outputs/         # Pre-generated Claude outputs for each scenario
```
In --live mode, main.py sends a structured prompt to claude-sonnet-4-6 with the DevOps metrics and a strict JSON schema. The response is validated against the schema before display — if Claude returns a malformed response, the tool exits with a clear error rather than silently surfacing bad output.
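That validation step might look like the following sketch. Field names come from the output table above; the checks are illustrative and may differ from the real validator in `tests/test_validation.py`:

```python
import json
import sys

# Output contract: the five top-level fields Claude must return.
REQUIRED_FIELDS = {"risk_score", "risk_level", "risks", "root_causes", "recommendations"}

def validate_diagnosis(raw: str) -> dict:
    """Parse the model's reply and enforce the output contract before display."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        sys.exit(f"Malformed JSON from model: {exc}")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        sys.exit(f"Response missing required fields: {sorted(missing)}")
    if not (isinstance(data["risk_score"], int) and 0 <= data["risk_score"] <= 10):
        sys.exit("risk_score must be an integer between 0 and 10")
    if data["risk_level"] not in {"Low", "Medium", "High"}:
        sys.exit("risk_level must be Low, Medium, or High")
    return data
```

Exiting with a clear message on any contract violation is what keeps a malformed model reply from silently reaching the display layer.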
The preset scenarios ship with pre-generated outputs for reproducible demos and offline testing. Mocked and live modes share an identical input/output contract; switching requires only the --live flag.
Contributions are welcome. See CONTRIBUTING.md for how to add scenarios, extend the schema, or improve the prompt.
```shell
pip install pytest
python -m pytest tests/ -v
```

Near-term

- `--output` flag to write JSON results to a file for CI pipeline integration
- `--quiet` flag for machine-readable output (JSON only, no display formatting)
- GitHub Actions example workflow for end-to-end org analysis on deployment failure
Longer-term
- Apex stack trace parsing — accept raw `sf project deploy --json` exception stacks for line-level diagnosis
- Historical diffing — compare risk scores across deployments to surface regressions early
- Native CI/CD integration — GitHub Actions, Jenkins, and Copado pipeline hooks
- Slack / Teams alerts — push P0 recommendations to engineering channels on detection
MIT — see LICENSE.