Refactored actions to use uvx, eliminated black, removed various shims by VisLab · Pull Request #1251 · hed-standard/hed-python

VisLab · 2026-03-02T17:04:20Z

No description provided.

Copilot

Pull request overview

This PR updates CI workflows to run Python tooling via uvx and significantly expands schema comparison test coverage, including new assertions around schema “extras” DataFrame diffing.

Changes:

Switch GitHub Actions steps for Ruff/Black/Codespell/Mdformat to use uvx instead of creating a venv and installing tools manually.
Add extensive unit tests for SchemaComparer (matching tags, schema diffs, extras DataFrame comparisons, pretty-print formatting).
Adjust SchemaComparer extras comparison to ignore the internal in_library column during key/column selection.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
`tests/schema/test_schema_compare.py`	Large expansion/refactor of tests for `SchemaComparer`, including extras DataFrame diffing and output formatting checks.
`hed/scripts/schema_script_util.py`	Docstring updates to reflect filtering out warnings (typo introduced).
`hed/schema/schema_compliance.py`	Removes backward-compatibility shim module.
`hed/schema/schema_attribute_validators.py`	Removes backward-compatibility shim module.
`hed/schema/schema_comparer.py`	Adds section-name constants for extras and updates extras diffing to exclude the `in_library` column.
`.github/workflows/ruff.yaml`	Runs Ruff linter + formatter check via `uvx`.
`.github/workflows/mdformat.yaml`	Migrates mdformat to `uvx` (currently broken YAML/script).
`.github/workflows/codespell.yaml`	Runs codespell via `uvx`.
`.github/workflows/black.yaml`	Runs Black via `uvx`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

.github/workflows/mdformat.yaml

hed/scripts/schema_script_util.py

.github/workflows/ruff.yaml

hed/schema/schema_comparer.py

github-actions · 2026-03-02T17:19:34Z

PR Review: Used uvx in some actions

Summary

The CI workflow simplification (replacing venv+pip with uvx) is a good idea. The schema comparer cleanups and expanded test suite are solid. However there are two bugs in the workflow files that will break CI.

Critical

mdformat.yaml — shell step is broken

The run: | (multiline literal block) was kept but its content was replaced with run: uvx .... Bash will attempt to execute the literal string run: as a command, which does not exist. The step will always fail with run:: command not found. Fix: replace run: | with a plain run: (single-line).

black.yaml — missing newline at end of file

The file ends without a trailing newline. POSIX requires text files to end with a newline; some tools will flag this.

Important

hed/scripts/schema_script_util.py line 38 — docstring typo

"non-warningcompliance issues" is missing a space; should be "non-warning compliance issues".

Suggestions

ruff.yaml now runs uvx ruff format --check . in addition to ruff check . — a new enforcement not previously in CI. Worth confirming the codebase is already clean under ruff format before merging so existing PRs do not suddenly fail.
The expanded SchemaComparer test suite (split into TestFindMatchingTags, TestCompareSchemas, TestGatherSchemaChanges, TestExtrasChanges, TestCompareDataFrames, TestPrettyPrintChangeDict) is a clear improvement in structure and coverage.
Removing schema_attribute_validators.py and schema_compliance.py is safe: no production code imports from those shim paths.

.github/workflows/mdformat.yaml

hed/scripts/schema_script_util.py

.github/workflows/black.yaml

github-actions · 2026-03-02T20:55:15Z

PR Review

This PR refactors CI workflows to use uvx for ephemeral tool execution, removes the Black formatter in favour of ruff's formatter, tidies up backward-compat shims, and expands SchemaComparer test coverage. The direction is correct, but there are two concrete defects to fix before merging.

Critical

mdformat.yaml: broken shell command

The run: | block scalar was kept in place while its content was changed to start with run: uvx .... Bash will try to execute run: as a command name, which does not exist, so the step will fail immediately. The fix is to drop the | and use a plain scalar:

      - name: Run mdformat (check only)
        run: uvx mdformat --check --wrap no --number docs/ *.md

(or keep | but remove the extra run: prefix inside the block).

Important

Docstring typo in hed/scripts/schema_script_util.py

Line in the new docstring reads "non-warningcompliance" — missing a space between non-warning and compliance.

Deletion of backward-compat shims

hed/schema/schema_attribute_validators.py and hed/schema/schema_compliance.py were public re-export shims. Removing them silently breaks any external code that imports from those module paths (e.g. from hed.schema.schema_compliance import check_compliance). The project policy in CLAUDE.md prefers not to keep shims, which is fine — but the deletion should be called out as a breaking change in the changelog / release notes so downstream users know to update their imports to the canonical paths (hed.schema.schema_validation.compliance, hed.schema.schema_validation.attribute_validators).

Suggestions

test_changes_sorted_by_severity pins that every section in gather_schema_changes output is ordered Major→Minor→Patch→Unknown. If that ordering is not guaranteed by the implementation (i.e. it is a side-effect of dict insertion order rather than an explicit sort), the test will be flaky. Worth confirming the sort is intentional and documented.
test_score_lib_version_counts hard-codes exact tag counts (21 / 10 / 80) against released SCORE schemas. This already existed in the old tests, but it is worth noting that these numbers will need updating if the compared schema versions ever change or if the comparison logic changes.
The new TestSchemaComparerConstants tests that SECTION_ENTRY_NAMES_PLURAL contains the extras keys. Since SECTION_ENTRY_NAMES_PLURAL is not visible in the schema_comparer.py diff, make sure it is actually updated to include SOURCES, PREFIXES, and ANNOTATION_PROPERTY_EXTERNAL; otherwise those three test methods will raise AttributeError or fail immediately.

.github/workflows/mdformat.yaml

hed/scripts/schema_script_util.py

github-actions · 2026-03-02T21:08:37Z

PR Review: Refactored actions to use uvx, eliminated black, removed various shims

Overall, this is a clean CI modernization — switching to uvx reduces setup boilerplate and the ruff format enforcement replaces black consistently. The schema_comparer.py refactor and expanded test suite are improvements. A few issues are worth addressing before merge.

Important

1. mdformat.yaml — mdformat-myst plugin missing

The previous workflow explicitly installed mdformat-myst>=0.1.5. The new uvx mdformat call runs without it. This project uses MyST-parser for Sphinx docs (docs/), and without the mdformat-myst plugin, mdformat will treat MyST-specific constructs (role/directive fences, colon fences, etc.) as plain text — either producing incorrect formatting or failing with a parse error.

Fix:
```yaml
run: uvx --with mdformat-myst mdformat --check --wrap no --number docs/ *.md
```

2. Shim removal (schema_compliance.py, schema_attribute_validators.py) — undocumented breaking change

Both files were explicitly marked as backward-compatible shims with docstrings pointing users to the canonical import paths. Their removal is a breaking change for any downstream code importing from hed.schema.schema_compliance or hed.schema.schema_attribute_validators. This should be called out in the changelog / release notes so users know which version dropped these aliases.

Suggestion

ruff.yaml — two separate uvx ruff invocations

Running uvx ruff check . && uvx ruff format --check . creates two separate ephemeral environments. In practice both will resolve to the same latest ruff on a fresh runner, but pinning the version (e.g. uvx ruff==0.9.x) would prevent surprise CI failures when a new ruff release changes formatting or lint rules. The existing pyproject.toml already has ruff>=0.8.0 as a dev dependency — using a concrete pin in CI would be more stable.

No issues found in

schema_comparer.py logic changes — the removed DF_EXTRAS branch and _add_extras_section_changes method were dead code (extras keys are DataFrames in schema.extras, not HedSection entries, so they never appear in unequal_entries); removal is correct
_add_extras_changes in_library column exclusion — correct fix; the in_library column is internal metadata and should not drive diff output
pyproject.toml [tool.ruff.format] section — correctly mirrors black defaults for a seamless transition
f-string implicit concatenation cleanup — all string values are equivalent after collapsing adjacent literals
New test_schema_compare.py test classes — good coverage of the extras diff logic; the test_changes_sorted_by_severity test correctly relies on _sort_changes_by_severity that is already called in gather_schema_changes

.github/workflows/mdformat.yaml

github-actions · 2026-03-02T21:35:02Z

PR Review: Refactored actions to use uvx, eliminated black, removed various shims

Summary

The CI/workflow changes are clean and well-structured. The source code cleanups (f-string merges, blank-line removal) are all correct and no-op. The new test suite for schema_comparer is a substantial improvement. Two issues need attention before merge.

Important

1. Duplicate string constants (schema_comparer.py lines 16–18)

SchemaComparer.SOURCES, PREFIXES, and ANNOTATION_PROPERTY_EXTERNAL are redefined as class constants with the same string values already in df_constants.SOURCES_KEY, PREFIXES_KEY, and EXTERNAL_ANNOTATION_KEY. The test file even imports both sets separately (from hed.schema.schema_io.df_constants import SOURCES_KEY, PREFIXES_KEY, EXTERNAL_ANNOTATION_KEY). If either definition drifts the comparison logic silently breaks. These should be class-level aliases:

from hed.schema.schema_io.df_constants import SOURCES_KEY, PREFIXES_KEY, EXTERNAL_ANNOTATION_KEY

class SchemaComparer:
    SOURCES = SOURCES_KEY
    PREFIXES = PREFIXES_KEY
    ANNOTATION_PROPERTY_EXTERNAL = EXTERNAL_ANNOTATION_KEY

2. Removal of schema_compliance.py shim is a breaking import path change

hed/schema/schema_compliance.py explicitly re-exported check_compliance, SchemaValidator, _IssuesListWithSummary, CONTENT_SECTIONS, DOMAIN_TO_SECTION, and SECTION_TO_DOMAIN. While these aren't in hed/__init__.py, any downstream code doing from hed.schema.schema_compliance import check_compliance will now get an ImportError. This is a breaking change. Since the project policy does allow shim removal when unused, at minimum this should be noted in the changelog so users can update their imports to hed.schema.schema_validation.compliance.

Suggestion

3. Extras-section deletions classified as Minor instead of Major

In _add_extras_changes, a row present in schema1 but absent from schema2 (a deletion) is tagged "Minor". Regular tag deletions in _add_removed_items are "Major". If the asymmetry is intentional (extras are metadata, not structural), a brief comment to that effect would clarify the design for future maintainers.

4. ruff.yaml runs two separate uvx processes

uvx ruff check . && uvx ruff format --check . spins up two ephemeral environments. Functionally fine due to caching, but a single uvx ruff invocation with ruff check . && ruff format --check . would be cleaner. Not blocking.

.github/workflows/ruff.yaml

hed/schema/schema_compliance.py

hed/schema/schema_comparer.py

github-actions · 2026-03-02T21:58:50Z

hed/errors/error_messages.py

-        f"Invalid tag {tag} found in definition for {def_name}. "
-        f"Def, Def-expand, and Definition tags cannot be in definitions."
-    )
+    return f"Invalid tag {tag} found in definition for {def_name}. Def, Def-expand, and Definition tags cannot be in definitions."


Important: This line is ~131 characters, exceeding both the old 127-char limit and the new 120-char ruff limit set in this PR. The previous multi-line format was correct:

Suggested change

return f"Invalid tag {tag} found in definition for {def_name}. Def, Def-expand, and Definition tags cannot be in definitions."

return (

f"Invalid tag {tag} found in definition for {def_name}. "

f"Def, Def-expand, and Definition tags cannot be in definitions."

)

github-actions · 2026-03-02T21:58:56Z

hed/errors/schema_error_messages.py

-        f"Tag '{tag}' has annotation '{annotation_value}' with prefix '{prefix}' "
-        f"that is not defined in the Prefixes section."
-    )
+    return f"Tag '{tag}' has annotation '{annotation_value}' with prefix '{prefix}' that is not defined in the Prefixes section."


Important: This line is ~130 characters, exceeding both the old 127-char limit and the new 120-char ruff line-length set in this PR. The previous multi-line format was correct:

Suggested change

return f"Tag '{tag}' has annotation '{annotation_value}' with prefix '{prefix}' that is not defined in the Prefixes section."

return (

f"Tag '{tag}' has annotation '{annotation_value}' with prefix '{prefix}' "

f"that is not defined in the Prefixes section."

)

github-actions · 2026-03-02T21:59:02Z

hed/schema/hed_schema.py

-        self.extras = (
-            {}
-        )  # Used to store any additional data that might be needed for serialization (like OWL or other formats)
+        self.extras = {}  # Used to store any additional data that might be needed for serialization (like OWL or other formats)


Important: This line is 128 characters (exceeds both the old 127-char and new 120-char limits). The original three-line form kept the comment readable:

Suggested change

self.extras = {} # Used to store any additional data that might be needed for serialization (like OWL or other formats)

self.extras = (

{}

) # Used to store any additional data that might be needed for serialization (like OWL or other formats)

Alternatively, shorten or move the comment to a preceding line.

github-actions · 2026-03-02T21:59:07Z

pyproject.toml

+line-ending = "auto"
+
 [tool.black]
 line-length = 127


Important: [tool.ruff] line-length is now 120 (line 122) but [tool.black] still says 127. Since black is removed from dev dependencies and CI in this PR, the entire [tool.black] section should be removed to avoid confusion about the enforced line length.

github-actions · 2026-03-02T21:59:13Z

tests/schema/test_schema_compare.py

+        _, not_in_1, not_in_2, unequal = comp.compare_schemas(attribute_filter=HedKey.InLibrary)
+        self.assertEqual(len(not_in_1[HedSectionKey.Tags]), 21)
+        self.assertEqual(len(not_in_2[HedSectionKey.Tags]), 10)
+        self.assertEqual(len(unequal[HedSectionKey.Tags]), 80)


Suggestion: The expected counts (21, 10, 80) are tightly coupled to the internal content of specific released schema versions. If those schemas are ever regenerated or the comparison logic changes, these assertions will fail with no obvious reason. At minimum, add a comment documenting where these numbers come from and that they are intentionally pinned.

github-actions · 2026-03-02T21:59:19Z

.github/workflows/ruff.yaml

@@ -24,13 +24,5 @@ jobs:
          enable-cache: true
          cache-dependency-glob: "**/pyproject.toml"



Suggestion: Two separate uvx ruff invocations fetch and install ruff twice. Prefer a single step that calls both sub-commands:

Suggested change

run: uvx ruff check . && uvx ruff format --check .

(No change needed if that is acceptable to you — it works correctly, just slightly redundant.)

github-actions · 2026-03-02T21:59:35Z

PR Review: Refactored actions to use uvx, eliminated black, removed various shims

Summary

The CI infrastructure changes (switching to uvx, dropping black, adding ruff format --check) are clean and well-structured. The schema_comparer.py refactor — moving constants from string literals to df_constants values, removing dead DF_EXTRAS set and _add_extras_section_changes, and fixing the _in_library column exclusion — is a clear improvement. Test coverage in test_schema_compare.py is substantially expanded.

Two issues need fixing before merge.

Important

1. Line-length regressions in error message files

Three error message return statements that were previously correctly split across multiple lines have been collapsed into single lines of 128–131 characters, exceeding both the old 127-char limit and the new 120-char ruff line-length introduced in this PR. E501 is ignored by ruff lint, so these won't break CI, but they are a style regression introduced by this PR.

hed/errors/error_messages.py:367 — ~131 chars
hed/errors/schema_error_messages.py:214 — ~130 chars
hed/schema/hed_schema.py:49 — ~128 chars (inline comment on self.extras = {})

Inline suggestions posted on each line.

2. [tool.black] section still has line-length = 127

[tool.ruff] line-length was lowered to 120 and black was removed from dev deps and CI, but the [tool.black] section in pyproject.toml still reads line-length = 127. Since black is no longer used, this section is dead but misleading. Remove it entirely (or at least update it to 120).

Suggestions

test_score_lib_version_counts: hard-coded counts (21, 10, 80) are fragile — a comment explaining the expected values would help maintainers.
ruff.yaml: two separate uvx ruff invocations fetch ruff twice; minor inefficiency.

Used uvx in some actions

1b946f4

VisLab requested a review from Copilot March 2, 2026 17:04

Copilot started reviewing on behalf of VisLab March 2, 2026 17:04 View session

Copilot AI reviewed Mar 2, 2026

View reviewed changes

.github/workflows/mdformat.yaml Outdated Show resolved Hide resolved

hed/scripts/schema_script_util.py Outdated Show resolved Hide resolved

.github/workflows/ruff.yaml Show resolved Hide resolved

hed/schema/schema_comparer.py Show resolved Hide resolved