feat: add direct OPA evaluator with shared base infrastructure by joejstuart · Pull Request #3276 · conforma/cli

joejstuart · 2026-05-06T23:07:00Z

Add a new opaEvaluator that evaluates policies directly via OPA's rego API instead of going through conftest's runner. This eliminates the conftest runner overhead while reusing conftest's Engine for policy compilation and data loading.

Extract shared infrastructure into basePolicyEvaluator to eliminate ~400 lines of duplication between the two evaluators. Both now share policy download/inspection, data directory preparation, capabilities management, post-processing, and success computation.

https://redhat.atlassian.net/browse/EC-1806

Key changes:

New opaEvaluator with direct rego.Eval() queries matching conftest's engine.Check() semantics (same regexes, exception handling, success counting)
basePolicyEvaluator embedded struct with 12 shared methods
sync.Once lazy initialization on both evaluators to prevent data races from concurrent worker goroutines
BuildInput() on ApplicationSnapshotImage for in-memory OPA input delivery without disk I/O
ParsedInput field on EvaluationTarget for passing pre-parsed input
EC_USE_OPA=1 environment variable gate for selecting the OPA evaluator

coderabbitai · 2026-05-06T23:07:14Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 6f1d8493-73be-496d-85a5-08dbe16cd473

📥 Commits

Reviewing files that changed from the base of the PR and between 09561ba and 39af44b.

📒 Files selected for processing (5)

acceptance/cli/cli.go
internal/evaluation_target/application_snapshot_image/application_snapshot_image_test.go
internal/evaluation_target/input/input.go
internal/evaluator/opa_evaluator.go
internal/evaluator/opa_evaluator_test.go

📝 Walkthrough

Walkthrough

Adds a full OPA/conftest evaluator, in-memory image input construction, evaluator selection/wiring (OPA vs Conftest), conftest success-time filtering, unit/integration/comparison tests, and BDD scenarios for OPA-enabled validation.

Changes

OPA Evaluator Implementation

Layer / File(s)	Summary
Data Shape `internal/evaluator/evaluator.go`	`EvaluationTarget` gains `ParsedInput map[string]any` to carry parsed in-memory input alongside file paths.
Input Assembly `internal/evaluation_target/application_snapshot_image/...`	New `BuildInput(ctx)` returns OPA-compatible `map[string]any` plus marshaled JSON bytes without writing to disk; `WriteInputFile(ctx)` preserved as legacy path.
Core Evaluator Framework `internal/evaluator/base_evaluator.go`	Introduces `basePolicyEvaluator` behaviors: include/exclude resolver initialization, per-run workspace (`policy`,`data`) and `capabilities.json` creation, policy source download/inspection with rule metadata extraction, data dir discovery, namespace resolution, and unified post-processing (metadata enrichment, filtering, success synthesis).
OPA Evaluator `internal/evaluator/opa_evaluator.go`	Replaces stub with full OPA/conftest-backed evaluator: constructor now accepts context, policy sources, config provider, source group and namespace; lazy init on first Evaluate; engine compilation, namespace enumeration, exception-first handling, per-rule queries, tracing/print capture, and conversion of rego results to Outcome/Result objects.
Evaluation Wiring `cmd/validate/image.go`, `internal/image/validate.go`, `internal/validate/vsa/fallback.go`	Evaluator construction calls updated to pass context, policy sources, config provider, source group and namespace. `ValidateImage` uses `BuildInput` to obtain `ParsedInput` and serialized JSON; `EvaluationTarget` populated with both.
Conftest Integration `internal/evaluator/conftest_evaluator.go`	computeSuccesses updated to accept effectiveTime and use it for unified filtering when synthesizing success results.
Tests — Unit & Lifecycle `internal/evaluator/opa_evaluator_test.go`, `internal/evaluator/conftest_evaluator_unit_*`	Tests adjusted to new lifecycles and signatures: Destroy/CapabilitiesPath tests updated, computeSuccesses tests updated to pass effectiveTime; comment-only adjustments in one data test.
Tests — Integration & Comparison `internal/evaluator/opa_evaluator_integration_test.go`, `internal/evaluator/evaluator_comparison_test.go`	New integration suites (OPA evaluator) and cross-evaluator comparison tests validating parity across many rule/input scenarios and component-name filtering.
BDD / Acceptance `features/validate_image_opa.feature`, `features/validate_input_opa.feature`, `acceptance/cli/cli.go`	Feature files exercising `ec validate image` and `ec validate input` with `EC_USE_OPA=1`; test runner hook preserves process `EC_USE_OPA` into scenarios.
Deps `acceptance/go.mod`	Adds `github.com/sigstore/sigstore-go v1.1.4` to acceptance module.

sequenceDiagram
    actor User
    participant CLI as CLI
    participant ValidateCmd as cmd/validate/image.go
    participant AppSnapshot as ApplicationSnapshotImage
    participant OPAEvaluator as NewOPAEvaluator
    participant BaseEval as basePolicyEvaluator
    participant OPAEngine as conftest.Engine

    User->>CLI: ec validate image --policy <policy>
    CLI->>ValidateCmd: invoke validate
    ValidateCmd->>AppSnapshot: BuildInput(ctx)
    AppSnapshot-->>ValidateCmd: ParsedInput + input JSON
    ValidateCmd->>OPAEvaluator: NewOPAEvaluator(ctx, policySources, ...)
    OPAEvaluator->>BaseEval: init resolver & workspace
    ValidateCmd->>OPAEvaluator: Evaluate(ctx, EvaluationTarget{ParsedInput, Inputs})
    OPAEvaluator->>OPAEvaluator: ensureInitialized (download/inspect)
    OPAEvaluator->>OPAEngine: compile engine with policies & data
    OPAEvaluator->>OPAEngine: evaluate namespace queries (exceptions first)
    OPAEngine-->>OPAEvaluator: query results
    OPAEvaluator->>BaseEval: postProcessResults (enrich, filter, synthesize successes)
    BaseEval-->>ValidateCmd: Outcomes (failures/warnings/passes)
    ValidateCmd-->>User: JSON report + exit code

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 7.14% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately reflects the main change: introducing a new OPA evaluator with shared base infrastructure extracted into a common layer.
Description check	✅ Passed	The description clearly explains the purpose, key changes, and benefits of the pull request, relating directly to the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

internal/image/validate.go (1)
109-119: ⚡ Quick win

Avoid building the same policy input twice per image.

BuildInput already marshals/unmarshals the full payload, and WriteInputFile then marshals it again immediately afterward. On the conftest path this is pure overhead in a hot loop, and it also creates two places that must stay semantically identical. Prefer a single shared builder, or only populate ParsedInput when an OPA evaluator is actually present.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/image/validate.go` around lines 109 - 119, The code currently calls
BuildInput and then WriteInputFile, causing duplicate marshaling; instead call
BuildInput once and reuse its result when writing files or populating
ParsedInput. Modify WriteInputFile (or add an overload) to accept the
already-built inputJSON/inputMap from BuildInput so it does not re-marshal, and
only populate a.ParsedInput (or run the OPA/conftest path) when an OPA
evaluator/conftest is actually present; update callers to pass the BuildInput
output into WriteInputFile or skip WriteInputFile when no evaluator is used
(referencing BuildInput, WriteInputFile, and ParsedInput in your changes).
internal/evaluator/opa_evaluator.go (1)
243-267: ⚡ Quick win

Cache rule discovery per namespace after compilation.

This rescans every compiled module for every (namespace, input) pair, then does an O(n²) dedupe on top. With many inputs, rule discovery becomes part of the hot path even though o.engine.Modules() is immutable after compileEngine(). Precomputing a map[string][]string once during initialization would remove that repeated AST walk and simplify queryNamespace.

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/evaluator/opa_evaluator.go` around lines 243 - 267, Precompute and
cache discovery of OPA failure/warning rule names per namespace during engine
initialization (e.g., in compileEngine or immediately after it) instead of
recomputing inside queryNamespace by walking o.engine.Modules() each time; build
a map[string][]string (namespace -> ruleNames) using the same filtering logic
(isOPAFailure/isOPAWarning and dedupe) and store it on the evaluator struct,
update usages of ruleNames/ruleCount in queryNamespace to read from that cache,
and ensure cache is populated once after compileEngine() so o.engine.Modules()
is not rescanned for every (namespace, input) pair.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/evaluator/base_evaluator.go`:
- Around line 332-333: computeSuccesses is calling FilterResults with time.Now()
and discarding its returned missingIncludes; change those calls to pass the
evaluator's effective time (use b.effectiveTime or the evaluator's effectiveTime
field) instead of time.Now(), and capture/propagate the missingIncludes return
value so result.Successes is computed with the same effective time and the
missingIncludes set is not lost; apply the same fix to the other FilterResults
call in the same file (the block around the alternate call at ~406-412) so both
successes and the other filter use the evaluator effective time and preserve
missingIncludes.
- Around line 307-325: The loop always constructs
NewUnifiedPostEvaluationFilter, which overrides any custom filter set by
NewConftestEvaluatorWithPostEvaluationFilter; change the code to use the
evaluator's configured post-evaluation filter (e.g., b.postEvaluationFilter)
when present, falling back to NewUnifiedPostEvaluationFilter(b.policyResolver)
only if b.postEvaluationFilter is nil, and then call FilterResults and
CategorizeResults on that chosen filter instance so custom filters are preserved
(apply this to the variables used around FilterResults/CategorizeResults).

---

Nitpick comments:
In `@internal/evaluator/opa_evaluator.go`:
- Around line 243-267: Precompute and cache discovery of OPA failure/warning
rule names per namespace during engine initialization (e.g., in compileEngine or
immediately after it) instead of recomputing inside queryNamespace by walking
o.engine.Modules() each time; build a map[string][]string (namespace ->
ruleNames) using the same filtering logic (isOPAFailure/isOPAWarning and dedupe)
and store it on the evaluator struct, update usages of ruleNames/ruleCount in
queryNamespace to read from that cache, and ensure cache is populated once after
compileEngine() so o.engine.Modules() is not rescanned for every (namespace,
input) pair.

In `@internal/image/validate.go`:
- Around line 109-119: The code currently calls BuildInput and then
WriteInputFile, causing duplicate marshaling; instead call BuildInput once and
reuse its result when writing files or populating ParsedInput. Modify
WriteInputFile (or add an overload) to accept the already-built
inputJSON/inputMap from BuildInput so it does not re-marshal, and only populate
a.ParsedInput (or run the OPA/conftest path) when an OPA evaluator/conftest is
actually present; update callers to pass the BuildInput output into
WriteInputFile or skip WriteInputFile when no evaluator is used (referencing
BuildInput, WriteInputFile, and ParsedInput in your changes).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 46c4d173-fd5a-48f1-94ce-2a52bf4b8246

📥 Commits

Reviewing files that changed from the base of the PR and between 68bb316 and 8989ef7.

📒 Files selected for processing (9)

cmd/validate/image.go
internal/evaluation_target/application_snapshot_image/application_snapshot_image.go
internal/evaluator/base_evaluator.go
internal/evaluator/conftest_evaluator.go
internal/evaluator/conftest_evaluator_unit_data_test.go
internal/evaluator/evaluator.go
internal/evaluator/opa_evaluator.go
internal/evaluator/opa_evaluator_test.go
internal/image/validate.go

coderabbitai · 2026-05-06T23:12:55Z

+	for _, result := range runResults {
+		unifiedFilter := NewUnifiedPostEvaluationFilter(b.policyResolver)
+
+		allResults := []Result{}
+		allResults = append(allResults, result.Warnings...)
+		allResults = append(allResults, result.Failures...)
+		allResults = append(allResults, result.Exceptions...)
+		allResults = append(allResults, result.Skipped...)
+
+		for j := range allResults {
+			addRuleMetadata(ctx, &allResults[j], b.rules)
+		}
+
+		filteredResults, updatedMissingIncludes := unifiedFilter.FilterResults(
+			allResults, b.allRules, target.Target, missingIncludes, effectiveTime)
+		missingIncludes = updatedMissingIncludes
+
+		warnings, failures, exceptions, skipped := unifiedFilter.CategorizeResults(
+			filteredResults, result, effectiveTime)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve caller-supplied post-evaluation filters.

postProcessResults now always creates NewUnifiedPostEvaluationFilter(...), so NewConftestEvaluatorWithPostEvaluationFilter no longer changes behavior after this refactor. That is a functional regression for any path relying on a custom filter.

Suggested fix

-func (b *basePolicyEvaluator) postProcessResults(ctx context.Context, runResults []Outcome, target EvaluationTarget) ([]Outcome, error) { +func (b *basePolicyEvaluator) postProcessResults(ctx context.Context, runResults []Outcome, target EvaluationTarget, filter PostEvaluationFilter) ([]Outcome, error) { ... for _, result := range runResults { - unifiedFilter := NewUnifiedPostEvaluationFilter(b.policyResolver) + unifiedFilter := filter + if unifiedFilter == nil { + unifiedFilter = NewUnifiedPostEvaluationFilter(b.policyResolver) + } ... }

-return c.postProcessResults(ctx, runResults, target) +return c.postProcessResults(ctx, runResults, target, c.postEvaluationFilter)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/evaluator/base_evaluator.go` around lines 307 - 325, The loop always constructs NewUnifiedPostEvaluationFilter, which overrides any custom filter set by NewConftestEvaluatorWithPostEvaluationFilter; change the code to use the evaluator's configured post-evaluation filter (e.g., b.postEvaluationFilter) when present, falling back to NewUnifiedPostEvaluationFilter(b.policyResolver) only if b.postEvaluationFilter is nil, and then call FilterResults and CategorizeResults on that chosen filter instance so custom filters are preserved (apply this to the variables used around FilterResults/CategorizeResults).

codecov · 2026-05-07T03:04:00Z

Codecov Report

❌ Patch coverage is 86.92153% with 65 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
internal/evaluator/base_evaluator.go	85.77%	33 Missing ⚠️
internal/evaluator/opa_evaluator.go	88.18%	26 Missing ⚠️
...ation_snapshot_image/application_snapshot_image.go	93.54%	2 Missing ⚠️
internal/image/validate.go	75.00%	2 Missing ⚠️
internal/validate/vsa/fallback.go	0.00%	2 Missing ⚠️

Flag	Coverage Δ
acceptance	`55.97% <72.83%> (+0.80%)`	⬆️
generative	`17.10% <0.00%> (-0.79%)`	⬇️
integration	`28.12% <61.97%> (+1.45%)`	⬆️
unit	`68.93% <66.39%> (-0.09%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
cmd/validate/image.go	`91.62% <100.00%> (+0.25%)`	⬆️
internal/evaluator/conftest_evaluator.go	`88.65% <100.00%> (ø)`
...ation_snapshot_image/application_snapshot_image.go	`83.96% <93.54%> (+1.28%)`	⬆️
internal/image/validate.go	`70.42% <75.00%> (-0.39%)`	⬇️
internal/validate/vsa/fallback.go	`36.84% <0.00%> (-0.50%)`	⬇️
internal/evaluator/opa_evaluator.go	`88.18% <88.18%> (-11.82%)`	⬇️
internal/evaluator/base_evaluator.go	`85.77% <85.77%> (ø)`

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Add a new opaEvaluator that evaluates policies directly via OPA's rego API instead of going through conftest's runner. This eliminates the conftest runner overhead while reusing conftest's Engine for policy compilation and data loading. Extract shared infrastructure into basePolicyEvaluator to eliminate ~400 lines of duplication between the two evaluators. Both now share policy download/inspection, data directory preparation, capabilities management, post-processing, and success computation. Key changes: - New opaEvaluator with direct rego.Eval() queries matching conftest's engine.Check() semantics (same regexes, exception handling, success counting) - basePolicyEvaluator embedded struct with 12 shared methods - sync.Once lazy initialization on both evaluators to prevent data races from concurrent worker goroutines - BuildInput() on ApplicationSnapshotImage for in-memory OPA input delivery without disk I/O - ParsedInput field on EvaluationTarget for passing pre-parsed input - EC_USE_OPA=1 environment variable gate for selecting the OPA evaluator Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

BuildInput was missing ComponentName and PolicySpec fields that WriteInputFile includes, causing acceptance test snapshot failures. Also fix gci import ordering in fallback.go and tidy acceptance/go.mod. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add unit tests covering evalOPAQuery, queryNamespace, evaluateWithEngine, helper functions (isOPAFailure, isOPAWarning, stripRulePrefix), input parsing, and base evaluator methods (prepareDataDirs, computeSuccesses, postProcessResults, createDataDirectory, createCapabilitiesFile, initWorkDir, initPolicyResolver, resolveFilteredNamespaces, isResultIncluded). Also fix gci import ordering in base_evaluator.go and opa_evaluator.go. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Full end-to-end integration tests for the OPA evaluator covering: - Basic creation and capabilities path - Evaluation with file-based and parsed input - Deny/warn semantics (both triggered, warn only, all pass) - Component-name-based VolatileConfig exclude filtering Mirrors the existing conftest evaluator integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Run both evaluators against the same policy and input, assert identical outcomes (failure/warning/success codes and messages). Covers deny, warn, conditional rules, multiple rules, parsed vs file input, and component-name-based VolatileConfig filtering. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Inject EC_USE_OPA from the test runner's process environment into every acceptance scenario's ec binary invocation, enabling `EC_USE_OPA=1 make acceptance` to run the full suite with the OPA evaluator. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add validate_image_opa.feature (7 scenarios) and validate_input_opa.feature (2 scenarios) that exercise the OPA evaluator end-to-end via EC_USE_OPA=1. Covers happy day, rejection, multiple sources, rule filtering, future deny conversion, volatile config, and input validation paths. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Generated by running the OPA acceptance test scenarios locally. These snapshots enable CI to validate OPA evaluator output matches expected results. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cmd/validate/image.go (1)

320-334: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Misleading debug log when OPA evaluator fails to initialize.

The error path logs "Failed to initialize the conftest evaluator!" regardless of which evaluator branch was taken. With the new OPA path this will be confusing during debugging — use a branch-aware message (or include the evaluator type).

Suggested fix

 			if err != nil {
-				log.Debug("Failed to initialize the conftest evaluator!")
+				log.Debugf("Failed to initialize evaluator (opa=%v): %v", utils.IsOpaEnabled(), err)
 				return err
 			}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/validate/image.go` around lines 320 - 334, The debug message is ambiguous
when evaluator initialization fails; update the error logging in the block that
calls newOPAEvaluator and evaluator.NewConftestEvaluatorWithFilterType so the
log includes which evaluator failed (e.g., "Failed to initialize OPA evaluator"
when utils.IsOpaEnabled() branch is taken, and "Failed to initialize Conftest
evaluator with filter" for the other branch) or include the evaluator type in
the single message; modify the log.Debug call that follows the calls to
newOPAEvaluator and evaluator.NewConftestEvaluatorWithFilterType to reflect the
active evaluator (reference newOPAEvaluator,
evaluator.NewConftestEvaluatorWithFilterType, and utils.IsOpaEnabled()) so
debugging shows the correct evaluator type.

🧹 Nitpick comments (3)

internal/evaluator/evaluator_comparison_test.go (2)

130-140: ⚡ Quick win

Local variable os shadows the os package import.

Inside assertSameOutcomes, os := summarizeOutcomes(opaResults) shadows the imported os package. It's harmless today (no os.* use in this function) but is confusing and a footgun for future edits. Rename the local to e.g. oo / opaSummary.

Suggested fix

-func assertSameOutcomes(t *testing.T, label string, conftestResults, opaResults []Outcome) {
-	t.Helper()
-	cs := summarizeOutcomes(conftestResults)
-	os := summarizeOutcomes(opaResults)
-
-	assert.Equal(t, cs.failureCodes, os.failureCodes, "%s: failure codes differ", label)
-	assert.Equal(t, cs.warningCodes, os.warningCodes, "%s: warning codes differ", label)
-	assert.Equal(t, cs.successCodes, os.successCodes, "%s: success codes differ", label)
-	assert.Equal(t, cs.failureMsgs, os.failureMsgs, "%s: failure messages differ", label)
-	assert.Equal(t, cs.warningMsgs, os.warningMsgs, "%s: warning messages differ", label)
-}
+func assertSameOutcomes(t *testing.T, label string, conftestResults, opaResults []Outcome) {
+	t.Helper()
+	cs := summarizeOutcomes(conftestResults)
+	oo := summarizeOutcomes(opaResults)
+
+	assert.Equal(t, cs.failureCodes, oo.failureCodes, "%s: failure codes differ", label)
+	assert.Equal(t, cs.warningCodes, oo.warningCodes, "%s: warning codes differ", label)
+	assert.Equal(t, cs.successCodes, oo.successCodes, "%s: success codes differ", label)
+	assert.Equal(t, cs.failureMsgs, oo.failureMsgs, "%s: failure messages differ", label)
+	assert.Equal(t, cs.warningMsgs, oo.warningMsgs, "%s: warning messages differ", label)
+}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/evaluator/evaluator_comparison_test.go` around lines 130 - 140, The
local variable os in assertSameOutcomes shadows the imported os package; rename
it (e.g., opaSummary or oo) and update all subsequent uses (the comparisons of
failureCodes, warningCodes, successCodes, failureMsgs, warningMsgs) to reference
the new variable; locate the variable created by calling
summarizeOutcomes(opaResults) inside assertSameOutcomes and change that
identifier and its uses so the os package import is no longer shadowed.

130-140: ⚡ Quick win

Exception messages are summarized but never compared.

summarizeOutcomes populates exceptionMsgs, but assertSameOutcomes does not assert on it, so any divergence in exception handling between conftest and OPA goes unnoticed. Either drop the field or include it in the equality checks to actually catch regressions.

Suggested fix

 	assert.Equal(t, cs.failureMsgs, os.failureMsgs, "%s: failure messages differ", label)
 	assert.Equal(t, cs.warningMsgs, os.warningMsgs, "%s: warning messages differ", label)
+	assert.Equal(t, cs.exceptionMsgs, os.exceptionMsgs, "%s: exception messages differ", label)
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/evaluator/evaluator_comparison_test.go` around lines 130 - 140, The
test helper assertSameOutcomes omits comparing the exceptionMsgs field produced
by summarizeOutcomes, so add an assertion comparing cs.exceptionMsgs and
os.exceptionMsgs (e.g., using assert.Equal with the same "%s: exception messages
differ" label) to catch divergences between conftest and OPA; alternatively, if
exceptionMsgs is intentionally irrelevant, remove/populate it consistently in
summarizeOutcomes so it no longer exists as a source of silent differences.
Ensure references to summarizeOutcomes, assertSameOutcomes, and the
exceptionMsgs field are updated accordingly.

features/validate_image_opa.feature (1)

110-119: 💤 Low value

Inline JSON --policy argument is fragile.

Unlike the other scenarios that reference a named ec-policy, this one inlines JSON directly on the command line. The acceptance harness splits args by literal space (strings.Split(args, " ")), so this only works because the JSON has no spaces — any future formatting (e.g. adding a space after : or ,) will silently break the scenario. Consider switching to a named policy configuration like the other scenarios for consistency and robustness.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@features/validate_image_opa.feature` around lines 110 - 119, The scenario in
features/validate_image_opa.feature inlines JSON into the --policy argument
which is fragile due to the test harness splitting args on spaces; replace the
inline JSON with a named ec-policy like the other scenarios: add a "Given an
ec-policy named \"future-deny\" with" block that contains the same git source
configuration (e.g. the git::https://... future-deny-policy repo), and update
the ec command to use the policy variable reference (e.g. --policy
${future-deny_POLICY}) instead of the inline JSON so the harness passes the
policy reliably.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@cmd/validate/image.go`:
- Around line 320-334: The debug message is ambiguous when evaluator
initialization fails; update the error logging in the block that calls
newOPAEvaluator and evaluator.NewConftestEvaluatorWithFilterType so the log
includes which evaluator failed (e.g., "Failed to initialize OPA evaluator" when
utils.IsOpaEnabled() branch is taken, and "Failed to initialize Conftest
evaluator with filter" for the other branch) or include the evaluator type in
the single message; modify the log.Debug call that follows the calls to
newOPAEvaluator and evaluator.NewConftestEvaluatorWithFilterType to reflect the
active evaluator (reference newOPAEvaluator,
evaluator.NewConftestEvaluatorWithFilterType, and utils.IsOpaEnabled()) so
debugging shows the correct evaluator type.

---

Nitpick comments:
In `@features/validate_image_opa.feature`:
- Around line 110-119: The scenario in features/validate_image_opa.feature
inlines JSON into the --policy argument which is fragile due to the test harness
splitting args on spaces; replace the inline JSON with a named ec-policy like
the other scenarios: add a "Given an ec-policy named \"future-deny\" with" block
that contains the same git source configuration (e.g. the git::https://...
future-deny-policy repo), and update the ec command to use the policy variable
reference (e.g. --policy ${future-deny_POLICY}) instead of the inline JSON so
the harness passes the policy reliably.

In `@internal/evaluator/evaluator_comparison_test.go`:
- Around line 130-140: The local variable os in assertSameOutcomes shadows the
imported os package; rename it (e.g., opaSummary or oo) and update all
subsequent uses (the comparisons of failureCodes, warningCodes, successCodes,
failureMsgs, warningMsgs) to reference the new variable; locate the variable
created by calling summarizeOutcomes(opaResults) inside assertSameOutcomes and
change that identifier and its uses so the os package import is no longer
shadowed.
- Around line 130-140: The test helper assertSameOutcomes omits comparing the
exceptionMsgs field produced by summarizeOutcomes, so add an assertion comparing
cs.exceptionMsgs and os.exceptionMsgs (e.g., using assert.Equal with the same
"%s: exception messages differ" label) to catch divergences between conftest and
OPA; alternatively, if exceptionMsgs is intentionally irrelevant,
remove/populate it consistently in summarizeOutcomes so it no longer exists as a
source of silent differences. Ensure references to summarizeOutcomes,
assertSameOutcomes, and the exceptionMsgs field are updated accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 51611af9-0a4c-434d-9ddf-f896de61fbc2

📥 Commits

Reviewing files that changed from the base of the PR and between 8989ef7 and 14ba351.

⛔ Files ignored due to path filters (2)

features/__snapshots__/validate_image_opa.snap is excluded by !**/*.snap
features/__snapshots__/validate_input_opa.snap is excluded by !**/*.snap

📒 Files selected for processing (15)

acceptance/cli/cli.go
acceptance/go.mod
cmd/validate/image.go
features/validate_image_opa.feature
features/validate_input_opa.feature
internal/evaluation_target/application_snapshot_image/application_snapshot_image.go
internal/evaluator/base_evaluator.go
internal/evaluator/conftest_evaluator_unit_data_test.go
internal/evaluator/evaluator.go
internal/evaluator/evaluator_comparison_test.go
internal/evaluator/opa_evaluator.go
internal/evaluator/opa_evaluator_integration_test.go
internal/evaluator/opa_evaluator_test.go
internal/image/validate.go
internal/validate/vsa/fallback.go

💤 Files with no reviewable changes (1)

internal/evaluator/conftest_evaluator_unit_data_test.go

✅ Files skipped from review due to trivial changes (2)

acceptance/go.mod
internal/evaluation_target/application_snapshot_image/application_snapshot_image.go

🚧 Files skipped from review as they are similar to previous changes (5)

internal/evaluator/evaluator.go
internal/image/validate.go
internal/evaluator/opa_evaluator_test.go
internal/evaluator/opa_evaluator.go
internal/evaluator/base_evaluator.go

Success filtering was using time.Now() instead of the evaluator's effective time, causing inconsistent behavior for backdated runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/evaluator/conftest_evaluator.go`:
- Around line 861-863: FilterResults is returning an updated missingIncludes map
but the call at filteredResults, _ :=
unifiedFilter.FilterResults([]Result{success}, rules, imageRef, componentName,
missingIncludes, effectiveTime) discards it; update the call to capture and
reassign the returned missingIncludes (e.g., filteredResults, missingIncludes =
unifiedFilter.FilterResults(...)) so later logic sees include criteria satisfied
by success results; ensure you propagate that reassigned missingIncludes
variable wherever it is subsequently used.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: a7e82c60-3bb0-4fc1-8970-15b8b361e652

📥 Commits

Reviewing files that changed from the base of the PR and between 14ba351 and 09561ba.

📒 Files selected for processing (4)

internal/evaluator/base_evaluator.go
internal/evaluator/conftest_evaluator.go
internal/evaluator/conftest_evaluator_unit_filtering_test.go
internal/evaluator/opa_evaluator_test.go

✅ Files skipped from review due to trivial changes (1)

internal/evaluator/base_evaluator.go

🚧 Files skipped from review as they are similar to previous changes (1)

internal/evaluator/opa_evaluator_test.go

joejstuart · 2026-05-08T14:36:23Z

/retest

simonbaird · 2026-05-08T16:00:34Z

Jira?

st3penta

I left a few comments, AI-found but human-reviewed

st3penta · 2026-05-20T15:25:08Z

+				continue
+			}
+			for _, v := range expressionValues {
+				switch val := v.(type) {


The type switch has no default case. Values that aren't string or map[string]any (e.g. bool from exception queries, int) are silently dropped — an empty Result{} is never appended for them. Since this is a policy enforcement tool, a silently dropped denial is a fail-open bug.

Add a default case that either returns an error or appends a result with a descriptive message so unexpected types surface instead of vanishing.

Good catch — added a default case that surfaces unexpected types as a result message with the type and value, so they're visible rather than silently dropped. Also added a test case for it.

st3penta · 2026-05-20T15:25:08Z

+			ruleCount++
+			found := false
+			for _, existing := range ruleNames {
+				if strings.EqualFold(existing, name) {


strings.EqualFold makes dedup case-insensitive, but OPA/Rego rule names are case-sensitive. Two distinct rules like deny_Check and deny_check would collapse into one, silently dropping whichever comes second.

Not a problem with current conforma policies (all bare deny/warn), but it's a one-liner fix:

Suggested change

if strings.EqualFold(existing, name) {

if existing == name {

Fixed — changed to == for case-sensitive comparison.

st3penta · 2026-05-20T15:25:08Z

+  The ec command line should produce correct results for input validation using the OPA evaluator
+
+  Background:
+    Given the environment variable is set "EC_USE_OPA=1"


validate input never checks IsOpaEnabled() — it always creates a conftest evaluator (see internal/evaluation_target/input/input.go). These acceptance tests set EC_USE_OPA=1 but silently run against conftest, giving false confidence in OPA coverage.

Either wire IsOpaEnabled() into the input validation path, or rename/tag these tests to make clear they're validating conftest behavior.

Good find — wired IsOpaEnabled() into internal/evaluation_target/input/input.go so NewInput() now uses NewOPAEvaluator when EC_USE_OPA=1 is set, matching the image validation path.

st3penta · 2026-05-20T15:25:08Z

+	if resultCount < ruleCount {
+		successes += ruleCount - resultCount


Success padding uses ruleCount (total rule occurrences across modules) instead of len(ruleNames) (unique rules actually evaluated). If the same rule name appears in multiple files within a namespace — normal for multi-file Rego packages — ruleCount exceeds len(ruleNames), and the padding inflates the success count.

Suggested change

if resultCount < ruleCount {

successes += ruleCount - resultCount

if resultCount < len(ruleNames) {

successes += len(ruleNames) - resultCount

Fixed — switched to len(ruleNames) and removed the now-unused ruleCount variable.

st3penta · 2026-05-20T15:25:08Z

+
+		for _, rr := range ruleResults {
+			if len(exceptions) > 0 {
+				continue


When exceptions match, continue drops the rule result entirely — no entry in Failures, Warnings, or Exceptions with the original message. Downstream consumers and audit logs can't distinguish "rule passed" from "rule failed but was excepted".

No current conforma policies use the conftest exception mechanism, so this is latent. But if exceptions are ever adopted, the result will be invisible suppression with no audit trail.

Agree this is worth tracking. The current behavior matches conftest's semantics — exceptions are recorded in outcome.Exceptions (line 304) but the original rule result message isn't preserved there. No current conforma policies use the conftest exception mechanism, so this is latent. I'll file a follow-up to add the original message to the exception entry when we revisit exception handling.

st3penta · 2026-05-20T15:25:08Z

+	return nil
+}
+
+func (b *basePolicyEvaluator) downloadAndInspectPolicies(ctx context.Context) error {


downloadAndInspectPolicies has ~80 lines of complex logic — policy download, annotation inspection, GitHub/GitLab error URL rewriting (4 nested conditions), and rule collection — with no direct unit tests. This is the shared infrastructure both evaluators depend on, so bugs here have wide blast radius.

Unit tests for annotation parsing (annotated vs bare rules) and URL rewriting (GitHub, GitLab, raw, non-matching) would be high-value.

This logic was moved from the conftest evaluator where it had the same coverage gap. The existing integration and acceptance tests exercise it end-to-end, but I agree direct unit tests for annotation parsing and URL rewriting would be high-value. I'll add those in a follow-up.

st3penta · 2026-05-20T15:25:08Z

 	}

-	inputPath, inputJSON, err := a.WriteInputFile(ctx)
+	inputMap, inputJSON, err := a.BuildInput(ctx)


BuildInput() and WriteInputFile() are both called unconditionally, each independently constructing and marshaling the same Input struct. On the OPA path the disk write is wasted; on conftest the in-memory build is wasted.

The two evaluators also consume different input sources from EvaluationTarget (ParsedInput vs Inputs file path). Conftest silently ignores ParsedInput, making the contract implicit. If the two representations ever diverge (marshal differences, TOCTOU on the file), results will differ silently.

Unifying the input path — both from memory or both from disk — would eliminate the duplicate work and the dual-source contract.

Intentional for now — both evaluator types need their respective input format, and the cost of the extra marshal is negligible compared to policy download and evaluation. Unifying to a single input path requires changing the conftest runner to accept parsed input, which is a larger change. Agreed it's worth doing eventually to eliminate the dual-source contract risk.

st3penta · 2026-05-20T15:25:08Z

+	return name
+}
+
+func (o *opaEvaluator) queryNamespace(ctx context.Context, fileName string, input any, namespace string) (Outcome, error) {


queryNamespace is 70+ lines mixing five concerns: rule iteration, dedup, exception querying, rule evaluation, result categorization, and success counting. Extracting per-rule evaluation (exception check + rule query + categorization) into a helper would make the logic easier to follow and test independently.

Done — extracted collectRuleNames (rule iteration + dedup) and evaluateRule (exception check + rule query + categorization) out of queryNamespace. The orchestrator now just loops over rules and accumulates results. Added TestCollectRuleNames and TestEvaluateRule to test the helpers independently.

st3penta · 2026-05-20T15:25:08Z

 	sc.Step(`^a track bundle file named "([^"]*)" containing$`, createTrackBundleFile)
+	sc.Before(func(ctx context.Context, _ *godog.Scenario) (context.Context, error) {
+		if v := os.Getenv("EC_USE_OPA"); v != "" {
+			ctx, _ = theEnvironmentVarilableIsSet(ctx, "EC_USE_OPA="+v)


The error from theEnvironmentVarilableIsSet is discarded. If this call fails, EC_USE_OPA is never set and the OPA acceptance tests silently run against conftest — passing without testing the OPA path at all.

Suggested change

ctx, _ = theEnvironmentVarilableIsSet(ctx, "EC_USE_OPA="+v)

ctx, err := theEnvironmentVarilableIsSet(ctx, "EC_USE_OPA="+v)

if err != nil {

return ctx, fmt.Errorf("setting EC_USE_OPA: %w", err)

}

Fixed — the error is now checked and returned, wrapped with context.

st3penta · 2026-05-20T15:25:08Z

+
+func TestEnsureInitialized(t *testing.T) {
+	t.Run("returns error from init", func(t *testing.T) {
+		expectedErr := fmt.Errorf("init failed")


The "returns error from init" subtest sets expectedErr and o.initErr, then immediately overrides both — o.initOnce gets a fresh sync.Once and o.initErr is reset to nil. expectedErr is never read again. The test ends up exercising a filesystem error (no policy sources), not the intended "pre-set error" scenario.

Either remove expectedErr and rename the test to match what it actually covers, or fix it to assert on the pre-set error.

Fixed — removed the dead expectedErr code and renamed the test to "returns error when no policy sources" to match what it actually exercises.

- Add default case to type switch in evalOPAQuery to surface unexpected result types instead of silently dropping them (fail-open fix) - Use case-sensitive comparison for rule name dedup (OPA rules are case-sensitive) - Fix success padding to use unique rule count instead of total rule occurrences across modules - Wire IsOpaEnabled() into input validation path so EC_USE_OPA=1 actually uses the OPA evaluator for validate input - Propagate error from theEnvironmentVarilableIsSet in acceptance test setup - Fix and rename broken "returns error from init" test - Extract collectRuleNames and evaluateRule helpers from queryNamespace for better separation of concerns and testability - Add TestBuildInput, TestCollectRuleNames, TestEvaluateRule, and unexpected result type test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

joejstuart marked this pull request as draft May 6, 2026 23:07

github-actions Bot added the size: XXL label May 6, 2026

coderabbitai Bot reviewed May 6, 2026

View reviewed changes

joejstuart force-pushed the opa-evaluator branch from 8989ef7 to 419435d Compare May 7, 2026 02:40

joejstuart force-pushed the opa-evaluator branch 2 times, most recently from 8dbf508 to adbdb0a Compare May 7, 2026 14:27

joejstuart and others added 6 commits May 7, 2026 10:44

joejstuart force-pushed the opa-evaluator branch from f83a8cb to 2e5d0f7 Compare May 7, 2026 15:58

joejstuart force-pushed the opa-evaluator branch from 2e5d0f7 to 6f4868b Compare May 7, 2026 16:24

test: add snapshot files for OPA evaluator acceptance tests

14ba351

Generated by running the OPA acceptance test scenarios locally. These snapshots enable CI to validate OPA evaluator output matches expected results. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

joejstuart marked this pull request as ready for review May 7, 2026 18:03

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

fix: use effectiveTime instead of time.Now() in computeSuccesses

09561ba

Success filtering was using time.Now() instead of the evaluator's effective time, causing inconsistent behavior for backdated runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

Comment thread internal/evaluator/conftest_evaluator.go

st3penta reviewed May 20, 2026

View reviewed changes

		if resultCount < ruleCount {
		successes += ruleCount - resultCount

Conversation

joejstuart commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codecov Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

joejstuart commented May 8, 2026

Uh oh!

simonbaird commented May 8, 2026

Uh oh!

st3penta left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

joejstuart commented May 6, 2026 •

edited

Loading

coderabbitai Bot commented May 6, 2026 •

edited

Loading

codecov Bot commented May 7, 2026 •

edited

Loading