Skip to content

Changed schema error vs warning strategy#1252

Merged
VisLab merged 5 commits intohed-standard:mainfrom
VisLab:fix_extras
Mar 4, 2026
Merged

Changed schema error vs warning strategy#1252
VisLab merged 5 commits intohed-standard:mainfrom
VisLab:fix_extras

Conversation

@VisLab
Copy link
Member

@VisLab VisLab commented Mar 3, 2026

No description provided.

@github-actions
Copy link

github-actions bot commented Mar 3, 2026

PR Review: Changed schema error vs warning strategy

Summary: This PR introduces a check_warnings parameter to schema validation functions and adds a separate_issues utility method. The CI cleanup (dedicated ruff.yaml already handles linting; the Windows cache-glob addition is correct) is fine. The core logic has two issues and one docstring inconsistency to address.


Important issues

1. separate_issues silently drops intermediate-severity issues

ErrorSeverity.ERROR = 1 and ErrorSeverity.WARNING = 10. The current partition:

errors   = [... if issue["severity"] <= 1]   # severity 1 only
warnings = [... if issue["severity"] >= 10]  # severity 10+ only

Any future severity value in the range 2–9 would be silently excluded from both lists. The method name implies a complete partition. Consider severity > ErrorSeverity.ERROR for the warnings side so no issue can fall through the gap.

2. validate_schema_object docstring title contradicts the new default

The first sentence still reads "checking non-warning compliance" but the new signature has check_warnings=True—meaning warnings are included in the default call. The summary line needs updating to reflect that warnings are now part of the default compliance check.

3. Inconsistent check_warnings defaults

function default
validate_schema_object check_warnings=True
validate_schema check_warnings=False

A caller who uses validate_schema_object directly gets a different default than the wrapper validate_schema. One of these should be changed, or both should share the same default with the difference clearly documented.


Suggestions

  • validate_schema docstring: Missing blank line between the last param description and the Returns: section.
  • Tests: contextlib.redirect_stdout(None) sets sys.stdout to None; if the function under test ever calls print() the test crashes with AttributeError. Prefer redirect_stdout(io.StringIO()) for robustness.
  • test_warning_schema_check_warnings_true_reports_issues: Consider asserting the specific warning code rather than just assertTrue(issues) so the test fails explicitly if the trigger mechanism changes.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adjusts how schema validation treats warnings vs errors by adding a check_warnings switch to schema validation utilities and introducing an issue-splitting helper in the error reporter. It also updates CI caching settings and modifies the coverage workflow steps.

Changes:

  • Add check_warnings parameter support to validate_schema_object / validate_schema and add tests covering warning suppression/reporting behavior.
  • Add ErrorHandler.separate_issues plus unit tests for separating errors vs warnings.
  • Update GitHub Actions workflows (uv cache dependency glob; remove Ruff lint step from coverage workflow).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
hed/scripts/schema_script_util.py Adds check_warnings control to schema validation flow and forwards it into object validation.
tests/scripts/test_script_util.py Adds tests to confirm warning inclusion/suppression behavior for schema validation utilities.
hed/errors/error_reporter.py Introduces ErrorHandler.separate_issues helper to split issue lists by severity.
tests/errors/test_error_reporter.py Adds unit tests for ErrorHandler.separate_issues.
.github/workflows/ci_windows.yaml Adds uv cache dependency glob configuration.
.github/workflows/ci_cov.yaml Removes Ruff linting step from the coverage workflow and simplifies dependency install.
Comments suppressed due to low confidence (2)

hed/scripts/schema_script_util.py:58

  • When check_warnings=True, any warnings from base_schema.check_compliance() will trigger the early return and skip the roundtrip format validation, even though warnings are typically non-fatal. If the intent is “fail only on errors but optionally report warnings”, consider separating errors vs warnings (or filtering to errors for the early-return decision) so roundtrip checks still run when there are warnings only. Also consider passing check_for_warnings=check_warnings into check_compliance() instead of post-filtering by severity to keep behavior aligned with the schema validator’s API.
        issues = base_schema.check_compliance()
        if not check_warnings:
            issues = [issue for issue in issues if issue.get("severity", ErrorSeverity.ERROR) == ErrorSeverity.ERROR]
        if issues:
            error_message = get_printable_issue_string(issues, title=schema_name)
            validation_issues.append(error_message)
            return validation_issues

.github/workflows/ci_cov.yaml:44

  • This workflow change removes the Ruff install+lint step from ci_cov.yaml. If linting is still intended to gate merges, confirm the dedicated Ruff workflow remains required/required-status in branch protection; otherwise, this change may unintentionally reduce CI coverage compared to before.

      # Install dependencies
      - name: Install dependencies
        run: uv pip install -e ".[docs,test]"

      # Run unittest with coverage
      - name: Test with unittest and coverage
        env:

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link

github-actions bot commented Mar 3, 2026

PR Review: Changed schema error vs warning strategy

Overall the change is well-structured and the test coverage is solid. One behavioral bug needs fixing before merge; the rest are minor or already handled correctly.


Important

validate_schema_object skips roundtrip checks when check_warnings=True and only warnings exist

The early-return guard fires for any non-empty issues list (schema_script_util.py lines 54-57):

issues = base_schema.check_compliance()
if not check_warnings:
    issues = [... only errors ...]
if issues:                                  # fires even for warnings-only
    validation_issues.append(error_message)
    return validation_issues                # roundtrip checks never run

When check_warnings=True and the schema contains only warnings (e.g. SCHEMA_PRERELEASE_VERSION_USED), the function reports the warning but immediately returns, silently skipping all four roundtrip-format checks. Warnings are non-fatal by design and should not gate the roundtrip validation.

Minimal fix — separate the early-exit decision from the warning-inclusion decision:

issues = base_schema.check_compliance()
errors_only = [i for i in issues if i.get("severity", ErrorSeverity.ERROR) == ErrorSeverity.ERROR]
report_issues = issues if check_warnings else errors_only
if report_issues:
    validation_issues.append(get_printable_issue_string(report_issues, title=schema_name))
if errors_only:          # only hard errors block roundtrip
    return validation_issues

The existing test test_warning_schema_check_warnings_true_reports_issues passes today because it only checks that issues are reported — it does not verify that the roundtrip checks also ran.


False positives in prior automated reviews (no action needed)

"Inconsistent defaults" — both validate_schema_object and validate_schema default to check_warnings=False. The table in the github-actions comment was incorrect.

"Silent gap in separate_issues" — the implementation uses > ErrorSeverity.ERROR (i.e. > 1) for the warnings bucket, so every issue lands in exactly one list. There is no gap. The docstring correctly describes this partition.


Suggestions

  • test_warning_schema_check_warnings_true_reports_issues: combined is built before assertTrue(issues). If the list is empty the assertIn error message is misleading. Reorder to assertTrue then build combined then assertIn.
  • validate_schema docstring is missing a blank line between the last Parameters entry and Returns: (minor style nit).

@github-actions
Copy link

github-actions bot commented Mar 3, 2026

Review Summary

Three critical bugs in hed/scripts/schema_script_util.py will cause runtime failures, plus an important CI regression.


Critical: ImportError on module load

separate_issues is a @staticmethod on ErrorHandler, not a module-level name. Line 6:

from hed.errors.error_reporter import separate_issues   # fails at import time

raises ImportError: cannot import name 'separate_issues'. Every caller of schema_script_util (CLI, CI validation scripts) will break immediately.

Fix: Import ErrorHandler and call ErrorHandler.separate_issues(issues), or expose a module-level alias in error_reporter.py.


Critical: Wrong keyword argument to check_compliance

Line 51 calls base_schema.check_compliance(check_warnings=check_warnings) but the actual parameter is check_for_warnings, not check_warnings. This raises TypeError: got an unexpected keyword argument 'check_warnings'.

Fix: base_schema.check_compliance(check_for_warnings=check_warnings)


Critical: Logic bug duplicates errors in issue list

Lines 52-54:

if issues and check_warnings:
    errors, warnings = separate_issues(issues)
    issues = errors + issues   # BUG: errors appear twice

separate_issues partitions so errors + warnings == issues, making errors + issues equal to errors + errors + warnings. Every error is double-counted in the formatted output.

Fix: Do not reassign issues; just extract errors for the early-return gate:

if check_warnings:
    errors, _ = separate_issues(issues)
else:
    errors = issues

Important: ruff linting removed from CI

ci_cov.yaml drops the ruff check step entirely. Linting is no longer enforced in any CI workflow, meaning style and correctness violations will go undetected in future PRs.

try:
issues = base_schema.check_compliance()
issues = [issue for issue in issues if issue.get("severity", ErrorSeverity.ERROR) == ErrorSeverity.ERROR]
issues = base_schema.check_compliance(check_warnings=check_warnings)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical: The keyword argument name is wrong. The actual check_compliance signature is:

def check_compliance(self, check_for_warnings=True, name=None, error_handler=None)

Passing check_warnings= raises TypeError: check_compliance() got an unexpected keyword argument 'check_warnings'.

Suggested change
issues = base_schema.check_compliance(check_warnings=check_warnings)
issues = base_schema.check_compliance(check_for_warnings=check_warnings)

@github-actions
Copy link

github-actions bot commented Mar 3, 2026

PR Review: Changed schema error vs warning strategy

Summary

The PR cleanly separates error/warning handling in schema validation. The new separate_issues function and the check_warnings parameter for validate_schema_object/validate_schema are well-designed. Tests are solid and the redirect_stdout(None) → io.StringIO() fix is correct.

Two issues need attention before merge:


Important — Breaking public API change

replace_tag_references was previously exported from hed.errors as a module-level function and documented in docs/api/errors.rst. It is now only accessible as ErrorHandler.replace_tag_references. Any downstream code using:

from hed.errors import replace_tag_references

will raise ImportError after this change. No deprecation shim was added.

Recommended fix: Either add a backward-compat re-export in hed/errors/__init__.py:

replace_tag_references = ErrorHandler.replace_tag_references  # deprecated; use ErrorHandler.replace_tag_references

or bump the major version and document the break in the changelog.


Suggestion — Unexplained reassignment in validate_schema_object

At line 53 in hed/scripts/schema_script_util.py:

if issues and check_warnings:
    errors, warnings = separate_issues(issues)
    issues = errors + warnings  # appears to be a no-op

errors + warnings contains the same items as the original issues (just reordered: errors first, then warnings). If the intent is to ensure errors appear before warnings in the printed report, a one-line comment would make that clear and prevent the line from being removed as dead code in a future cleanup.


Non-issues (confirmed)

  • Ruff removed from ci_cov.yaml: fine — covered by the dedicated ruff.yaml workflow.
  • check_compliance(check_for_warnings=...) parameter: already present in HedSchema/HedSchemaGroup, so the call-site change is correct.
  • separate_issues using <= ErrorSeverity.ERROR: logically equivalent to == with current severity values; more defensive for future additions.

@VisLab VisLab merged commit 65977fd into hed-standard:main Mar 4, 2026
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants