Debug Salesforce deployment failures in seconds using AI-powered root cause analysis with Claude. When a deployment fails, engineers waste time switching between logs, coverage reports, and static analysis tools to manually piece together what went wrong — this tool does that correlation automatically and tells you exactly what to fix first.
Salesforce's CLI tools produce structured JSON output that no open source tool currently cross-correlates into a ranked diagnosis. This project fills that gap — there is no equivalent in the Salesforce ecosystem today.
When a Salesforce deployment fails, the information you need is split across three tools that have no awareness of each other:
- The deployment log tells you `OpportunityService` threw a `NullPointerException` — but not why
- The coverage report shows 62% — but has no connection to the exception
- PMD flags 120 violations — but doesn't tell you which ones caused the failure
An engineer must open all three, read each output, reason about the connections under pressure, and decide what to fix first. That process is slow, error-prone, and entirely manual.
This tool accepts your deployment metrics as a single JSON payload, sends them to Claude with a strict schema contract, and returns a structured diagnosis:
- Risks — each issue ranked by severity with the specific component named
- Root causes — the technical reason per component, not a restatement of the error
- Recommendations — ordered P0 → P3, with exact fixes ready to act on
Unlike traditional DevOps tools that surface logs and metrics and stop there, this system reasons across all three signals simultaneously with an LLM and returns a ranked, actionable fix list.

| | Traditional tools | This project |
|---|---|---|
| Output | Raw logs, metrics, violation counts | Root cause, ranked recommendations |
| Signal handling | One tool per signal, no cross-referencing | All three signals correlated in one pass |
| Next step | Engineer manually interprets and prioritises | Fix order is explicit — P0 first |
Salesforce CLI, PMD, and coverage tools each report one signal in isolation — they show what happened, not why, and have no awareness of each other.
Claude reads all three signals together: when a NullPointerException in OpportunityService coincides with low coverage and critical PMD violations, it identifies a single @TestSetup gap as the common cause — not three separate problems requiring three separate fixes.
The result is a specific root cause per component and a P0-ranked fix list — without opening a single log file.
No API key needed — run a preset scenario instantly:
```shell
python main.py 1   # Failure — risk score 7 🔴
python main.py 2   # Medium — risk score 3 🟡
python main.py 3   # Healthy — risk score 0 🟢
```

Live mode — send your own data to Claude:
```shell
pip install -r requirements-live.txt

# Windows
set ANTHROPIC_API_KEY=your_key_here
# Mac / Linux
export ANTHROPIC_API_KEY=your_key_here

python main.py --input mydata.json --live
```

End-to-end with a real Salesforce org:
```shell
sf project deploy start --json > deploy_result.json
sf scanner run --json > pmd_result.json
python parse_deployment.py --deploy deploy_result.json --pmd pmd_result.json --out input.json
python main.py --input input.json --live
```

Input — three fields from your Salesforce org:

| Field | Type | Description |
|---|---|---|
| `code_coverage` | float | Org-level Apex code coverage percentage |
| `failed_deployments` | list | Each item: `component`, `error`, `failed_tests` count |
| `code_quality_issues` | dict | `pmd_violations` (total) and `critical` (severity 1–2) |
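A payload matching this schema can be sanity-checked before it is sent. The sketch below is illustrative only — the project's real validators live in `tests/test_validation.py` and may differ:

```python
def validate_input(payload: dict) -> dict:
    """Illustrative check of the three-field input contract (not the shipped validator)."""
    assert isinstance(payload["code_coverage"], (int, float)), "code_coverage must be numeric"
    for item in payload["failed_deployments"]:
        # Every failure entry names a component, an error, and a failed-test count.
        assert {"component", "error", "failed_tests"} <= item.keys()
    # Quality metrics carry a total violation count and a critical subset.
    assert {"pmd_violations", "critical"} <= payload["code_quality_issues"].keys()
    return payload

payload = validate_input({
    "code_coverage": 62,
    "failed_deployments": [
        {"component": "OpportunityService", "error": "NullPointerException", "failed_tests": 3}
    ],
    "code_quality_issues": {"pmd_violations": 120, "critical": 15},
})
```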
Output — returned by Claude:

| Field | Description |
|---|---|
| `risk_score` | 0–10 integer — overall deployment health |
| `risk_level` | Low, Medium, or High |
| `risks[]` | Specific issues with severity and component |
| `root_causes[]` | Technical explanation per component |
| `recommendations[]` | Ordered fix actions, P0 → P3 |
Risk is derived from three dimensions that reflect Salesforce deployment reality:

| Dimension | Max Points | Why it matters |
|---|---|---|
| Code coverage | +4 | Below 75% is a hard platform blocker — no override possible |
| Deployment failures | +3 | Runtime errors (+2) and failed tests (+1) stack — pipeline stops immediately |
| Critical PMD violations | +2 | SOQL in loops, hardcoded IDs — governor limit failures under production load |

| Score | Risk Level |
|---|---|
| 0–2 | 🟢 Low — deployment can proceed |
| 3–5 | 🟡 Medium — quality issues, no blockers |
| 6–10 | 🔴 High — at least one active blocker |
Coverage carries the highest weight because it is the only signal that unconditionally blocks deployment with no workaround.
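As a rough illustration, the rubric above can be applied deterministically. This is a sketch only — in the tool itself Claude assigns the score with judgment, so live scores can differ from this arithmetic:

```python
def risk_score(metrics: dict) -> int:
    """Naive application of the scoring rubric (illustrative, not the shipped logic)."""
    score = 0
    # Coverage below the 75% platform threshold is a hard blocker (max +4).
    if metrics["code_coverage"] < 75:
        score += 4
    # Deployment failures: runtime errors (+2) and failed tests (+1) stack (max +3).
    failures = metrics["failed_deployments"]
    if any(f["error"] for f in failures):
        score += 2
    if any(f["failed_tests"] > 0 for f in failures):
        score += 1
    # Critical PMD violations (max +2).
    if metrics["code_quality_issues"]["critical"] > 0:
        score += 2
    return min(score, 10)

def risk_level(score: int) -> str:
    # Thresholds from the score table: 0–2 Low, 3–5 Medium, 6–10 High.
    return "Low" if score <= 2 else "Medium" if score <= 5 else "High"
```

Note that this naive sum gives 9 for preset scenario 1, while the sample output scores it 7 — the rubric guides the model's score rather than determining it.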
A deployment fails mid-sprint. The engineer has three data points — coverage below threshold, a NullPointerException in OpportunityService, and 120 PMD violations — with no obvious connection between them.
```json
{
  "code_coverage": 62,
  "failed_deployments": [
    {
      "component": "OpportunityService",
      "error": "NullPointerException",
      "failed_tests": 3
    }
  ],
  "code_quality_issues": {
    "pmd_violations": 120,
    "critical": 15
  }
}
```

Claude's diagnosis:

```json
{
  "risk_score": 7,
  "risk_level": "High",
  "risks": [
    {
      "issue": "Code coverage at 62% — 13 points below the 75% deployment threshold, hard blocker",
      "severity": "Critical",
      "component": "Overall Org"
    },
    {
      "issue": "NullPointerException in OpportunityService — 3 test methods failing, pipeline cannot proceed",
      "severity": "Critical",
      "component": "OpportunityService"
    },
    {
      "issue": "15 critical PMD violations — SOQL-in-loop and hardcoded record IDs confirmed, governor limit failures will occur under bulk record processing",
      "severity": "High",
      "component": "OpportunityService"
    }
  ],
  "root_causes": [
    {
      "cause": "OpportunityService accesses opp.Account.Name without a null-guard — Account is null because @TestSetup inserts Opportunity records without a parent Account, so every method that traverses the relationship throws at runtime",
      "component": "OpportunityService"
    },
    {
      "cause": "The @TestSetup gap suppresses execution of all methods that depend on related Account data — coverage dropped to 62% as a direct consequence of the test failures, not as a separate problem",
      "component": "Overall Org"
    }
  ],
  "recommendations": [
    {
      "action": "In OpportunityService, replace opp.Account.Name with opp.Account?.Name — the safe navigation operator prevents the NullPointerException when Account is not populated on the record",
      "priority": "P0 - Immediate"
    },
    {
      "action": "In OpportunityServiceTest @TestSetup, insert an Account record first and assign its Id to AccountId on each Opportunity — this unblocks all 3 failing tests and recovers coverage above 75% automatically",
      "priority": "P0 - Immediate"
    },
    {
      "action": "Run 'sf scanner run --category Performance --target force-app/main/default/classes/OpportunityService.cls' — move all SOQL calls above loop bodies into a single typed query: Map<Id, Account> accountMap = new Map<Id, Account>([SELECT Id, Name FROM Account WHERE Id IN :accountIds])",
      "priority": "P1 - High"
    },
    {
      "action": "Add two CI gates on pull requests: (1) 'sf scanner run --severity-threshold 2 --target force-app' to block on critical PMD violations; (2) parse 'sf project deploy start --test-level RunLocalTests --json' output and fail the build if result.numberTestsCompleted / result.numberTestsTotal falls below 0.80",
      "priority": "P2 - Medium"
    }
  ]
}
```

- Without this tool: An engineer opens the deployment log, sees a `NullPointerException`, then separately checks coverage (62% — why?), then runs PMD and gets 120 violations with no clear connection to the runtime error. Correlation is manual, slow, and easy to get wrong under release pressure.
- With this tool: One command surfaces that all three issues share a single root cause — a missing `Account` insert in `@TestSetup`. The fix is two lines of code, not three separate investigations.
- What Claude identified beyond the raw input: The input only states `"error": "NullPointerException"` and `"failed_tests": 3`. Nothing in the payload mentions `@TestSetup`, parent records, or the Account relationship. The causal link between the exception and the coverage drop was inferred — not extracted from the data.
A trigger hitting SOQL limits under load, with two components affected simultaneously.
```json
{
  "code_coverage": 71,
  "failed_deployments": [
    {
      "component": "AccountTrigger",
      "error": "System.LimitException: Too many SOQL queries: 101",
      "failed_tests": 5
    },
    {
      "component": "ContactRollupService",
      "error": "System.NullPointerException",
      "failed_tests": 2
    }
  ],
  "code_quality_issues": {
    "pmd_violations": 34,
    "critical": 8
  }
}
```

Claude's diagnosis:

```json
{
  "risk_score": 8,
  "risk_level": "High",
  "risks": [
    {
      "issue": "Code coverage at 71% — 4 points below the 75% Salesforce deployment threshold",
      "severity": "Critical",
      "component": "Overall Org"
    },
    {
      "issue": "AccountTrigger hitting SOQL governor limit (101 queries) — deployment blocked",
      "severity": "Critical",
      "component": "AccountTrigger"
    },
    {
      "issue": "ContactRollupService NullPointerException — 2 test failures indicate missing test data setup",
      "severity": "High",
      "component": "ContactRollupService"
    },
    {
      "issue": "8 critical PMD violations — SOQL-in-loop pattern likely the direct cause of AccountTrigger governor limit",
      "severity": "High",
      "component": "AccountTrigger"
    }
  ],
  "root_causes": [
    {
      "cause": "AccountTrigger executes a SOQL query inside a loop iterating over trigger records — each record fires a separate query, exceeding the 100-query governor limit when bulk records are processed",
      "component": "AccountTrigger"
    },
    {
      "cause": "ContactRollupService test class does not establish required parent Account records before creating Contact records — rollup logic encounters null parent references at runtime",
      "component": "ContactRollupService"
    },
    {
      "cause": "Coverage at 71% is suppressed by the combined test failures in AccountTrigger and ContactRollupService — fixing both components should recover coverage above threshold",
      "component": "Overall Org"
    }
  ],
  "recommendations": [
    {
      "action": "Refactor AccountTrigger to bulkify SOQL — move the query outside the loop, collect all required IDs first, then query once: Map<Id, Account> accountMap = new Map<Id, Account>([SELECT Id FROM Account WHERE Id IN :idSet])",
      "priority": "P0 - Immediate"
    },
    {
      "action": "Fix ContactRollupService test @TestSetup — insert Account records before Contact records so rollup fields have valid parent references during test execution",
      "priority": "P0 - Immediate"
    },
    {
      "action": "Run 'sf apex run test --class-names AccountTriggerTest,ContactRollupServiceTest --result-format json' and confirm both classes pass before re-deploying — verify combined coverage is above 75% in the result JSON before triggering the pipeline",
      "priority": "P1 - High"
    },
    {
      "action": "Run 'sf scanner run --category Performance --target force-app/main/default/triggers/AccountTrigger.trigger' — resolve all 8 critical PMD violations in AccountTrigger to prevent governor limit recurrence on the next bulk data load",
      "priority": "P1 - High"
    }
  ]
}
```

Claude connected the LimitException to the SOQL-in-loop PMD violations — identifying the critical violations as the cause of the runtime error, not a separate issue. The two failures independently suppress coverage, so both must be fixed before coverage recovers.
Designed for integration with Salesforce DX and CI/CD pipelines — the input schema maps directly to the JSON output of sf project deploy and sf scanner run, so real deployment logs can be piped in automatically rather than collected manually.
Native pipeline integration (GitHub Actions, Jenkins, Copado) is on the roadmap — the current CLI interface is the foundation for that.
```
salesforce-devops-ai-assistant/
├── main.py                 # Analyser — mocked and live Claude modes
├── parse_deployment.py     # Converts sf CLI + PMD JSON to input schema
├── requirements.txt        # Empty — mocked mode needs no packages
├── requirements-live.txt   # anthropic>=0.20.0 — for --live mode
├── tests/
│   └── test_validation.py  # Unit tests for input/output schema validators
├── sample_data/            # Three preset input scenarios
└── sample_outputs/         # Pre-generated Claude outputs for each scenario
```
In --live mode, main.py sends a structured prompt to claude-sonnet-4-6 with the DevOps metrics and a strict JSON schema. The response is validated against the schema before display — if Claude returns a malformed response, the tool exits with a clear error rather than silently surfacing bad output.
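That validation step might look like the following sketch. Field names come from the output table above; the checks are illustrative and may differ from the real validator in `tests/test_validation.py`:

```python
import json
import sys

# Output contract: the five top-level fields Claude must return.
REQUIRED_FIELDS = {"risk_score", "risk_level", "risks", "root_causes", "recommendations"}

def validate_diagnosis(raw: str) -> dict:
    """Parse the model's reply and enforce the output contract before display."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        sys.exit(f"Malformed JSON from model: {exc}")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        sys.exit(f"Response missing required fields: {sorted(missing)}")
    if not (isinstance(data["risk_score"], int) and 0 <= data["risk_score"] <= 10):
        sys.exit("risk_score must be an integer between 0 and 10")
    if data["risk_level"] not in {"Low", "Medium", "High"}:
        sys.exit("risk_level must be Low, Medium, or High")
    return data
```

Exiting with a clear message on any contract violation is what keeps a malformed model reply from silently reaching the display layer.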
The preset scenarios ship with pre-generated outputs for reproducible demos and offline testing. Mocked and live modes share an identical input/output contract; switching requires only the --live flag.
Contributions are welcome. See CONTRIBUTING.md for how to add scenarios, extend the schema, or improve the prompt.
```shell
pip install pytest
python -m pytest tests/ -v
```

Near-term

- `--output` flag to write JSON results to a file for CI pipeline integration
- `--quiet` flag for machine-readable output (JSON only, no display formatting)
- GitHub Actions example workflow for end-to-end org analysis on deployment failure
Longer-term
- Apex stack trace parsing — accept raw `sf project deploy --json` exception stacks for line-level diagnosis
- Historical diffing — compare risk scores across deployments to surface regressions early
- Native CI/CD integration — GitHub Actions, Jenkins, and Copado pipeline hooks
- Slack / Teams alerts — push P0 recommendations to engineering channels on detection
MIT — see LICENSE.