Harden test suite against silent-skip patterns #38
Merged
In tests/test-proctitle-low-stack.sh: the -gt 8192 heuristic was a no-op on macOS Apple Silicon hosts whose default RLIMIT_STACK is 8176 KiB, so the regression test never actually capped the host stack. A wider audit surfaced the same shape across the suite: regression checks that quietly stopped checking on common configurations.

Proctitle coverage:
- tests/test-proctitle-low-stack.sh: drop broken -gt heuristic. Apply an unconditional 1024 KiB soft-stack cap, env-overridable via PROCTITLE_LOW_STACK_KIB. The new cap sits well below every observed macOS default and well above the ~560 KiB floor where elfuse cannot start. Distinct exit codes 98/99 separate ulimit-setup failure from elfuse failure.
- tests/test-proctitle-host.c (new, host-side native test): synthesizes a contiguous argv block with its NUL terminator at the last writable byte of a page, with the next page mapped PROT_NONE and a sentinel environ string mmapped above it. The fixed runtime_set_process_title walks only the contiguous argv block and stores byte-by-byte through a volatile pointer; any overshoot or a reverted argv+envp upper-bound walk SIGSEGVs against the guard. Verified by restoring the pre-fix proctitle.c and observing rc=139. Makefile gains a build rule that links against the project's proctitle.o so the exact in-tree code is exercised, no HVF entitlement required.

Driver correctness (tests/driver.sh):
- evaluate_result no longer reports OK for rc=0 when expected_rc=N is set. The previous rc==0 OR (expected AND rc==expected) accepted a buggy exit of 0 against an explicit non-zero expectation; test-complex declares expected_rc=42 and was the existing silent-pass victim.
- ALLOW_MISSING_BINARIES default flips from auto (skip-on-missing for any non-canonical TESTDIR) to 0 (strict). Callers that want permissive-skip-mode now set ALLOW_MISSING_BINARIES=1 explicitly.

Recipe-level exit propagation (mk/tests.mk):
- test-sysroot-rename and test-sysroot-create-paths gain set -e. The earlier semicolon-chained recipes ran post-conditions after the elfuse invocation, so a non-zero elfuse exit was swallowed whenever the residual filesystem state satisfied the checks.

Test-runner timeout discipline (tests/lib/test-runner.sh):
- run() and run_check() wrap every invocation in timeout $TEST_TIMEOUT. The asymmetry with run_pipe / run_timeout let a deadlocked elfuse hang make check forever.
- A new _test_runner_epoch_us helper (bash 5.0 EPOCHREALTIME) disambiguates a real harness timeout from the guest's own timeout(1) returning rc=124. The two share an exit code; comparing microsecond elapsed against TEST_TIMEOUT * 1_000_000 is the only reliable distinguisher, and the seconds-resolution SECONDS alternative could undercount by almost a full second at small timeouts.

Coreutils optional-binary accounting (tests/lib/coreutils-suite.sh):
- Raw if [ -e "$BIN/X" ]; then run_check ...; fi guards around base32, basenc, sha224sum, sha384sum, b2sum, sum, and numfmt are removed. Missing tools now route through test_skip_missing_tool inside run_check, which reports SKIP with accounting under TEST_SKIP_MISSING_TOOLS=1 (smoke profile) and a hard FAIL under the full profile, instead of silently erasing the assertion.

Matrix runner accounting (tests/test-matrix.sh):
- New require_binary() and skip_suite() helpers always increment the skip counter and emit a visible skip line. 14 raw existence guards plus 3 silent suite-level drops (static-bins, dynamic-coreutils musl/glibc) now go through them, so missing fixtures surface in the per-mode summary instead of looking like a full pass.
- Calls use the if require_binary X Y; then ... fi form, not require_binary X Y && test_check .... Under set -e the chain form propagates the helper's return-1 as the calling function's exit status, aborting the script when the last optional binary in a function happened to be missing.
- test_check and test_pipe now require rc==0 before trusting the regex. A crashing tool that printed the expected substring before dying used to be reported OK; the precondition matches the corrected driver.sh evaluate_result.

Perf benchmark integrity (tests/test-perf.sh):
- benchmark() captures each sample's exit status. PERF_FAILED accumulates and the script exits 1 if any sample failed. The previous ... || true swallowed every failure, so a missing native binary, an elfuse crash, or a host SIP block degraded into median 0 ms PASS.
- The cat|wc pipelines now run under bash -c "set -o pipefail; ..." so a failing producer surfaces through the rc capture. The outer script's pipefail does not propagate into sh -c children.

Additional /proc test hardening (in test-proc-fidelity.c):
- test_proc_oom_score_write_fails opens O_WRONLY, the path the test name claims to cover. Non-root EACCES becomes an explicit PROCFS_SKIP with accounting instead of a silent PASS that hid the actual write-rejection coverage.
- test_proc_oom_adj_reread_tracks_score_adj_updates, test_proc_oom_adj_scaling, and test_proc_oom_adj_same_fd_roundtrip fail hard when /proc/self/oom_adj is missing. The earlier silent PASS turned each regression into a no-op on any host that did not ship the legacy compat node.

Close #37
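The pipefail point under "Perf benchmark integrity" can be illustrated with a short sketch. It is not the code in tests/test-perf.sh; it uses `false | wc -c` as a stand-in for a failing producer feeding `wc`, and only shows which shell has to own the `pipefail` setting for the failure to be visible.

```shell
#!/usr/bin/env bash
# Stand-in pipelines (an assumption for illustration): `false` plays the
# failing producer, `wc -c` the consumer that always succeeds.

# Without pipefail the pipeline reports the status of its last command.
sh -c 'false | wc -c >/dev/null'
plain_rc=$?

# With pipefail set inside the shell that runs the pipeline, the
# producer's failure becomes the pipeline's status.
bash -c 'set -o pipefail; false | wc -c >/dev/null'
pipefail_rc=$?

# pipefail enabled in an outer bash is not inherited by an sh -c child,
# so the failure is hidden again.
bash -c 'set -o pipefail; sh -c "false | wc -c >/dev/null"'
child_rc=$?

echo "sh -c, no pipefail:          rc=$plain_rc"
echo "bash -c with pipefail:       rc=$pipefail_rc"
echo "outer pipefail, sh -c child: rc=$child_rc"
```

Only the middle invocation reports a non-zero status, which is why the option has to be set in the same `bash -c` string as the pipeline.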
Summary by cubic
Hardens the test suite to remove silent skips, enforce strict exit/timeout handling, and split coverage into deterministic suites for proctitle, syscalls, /proc, and fd-family. make check now fails on hangs and missing fixtures with clear PASS/FAIL/SKIP accounting.
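A rough sketch of how a missing fixture can stay visible in the accounting, and of why the call form matters under `set -e`. The helper below is a hypothetical stand-in for `require_binary` in tests/test-matrix.sh; its arguments, its output line, and the `/nonexistent-fixtures` path are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in; the real require_binary may take different
# arguments and print a different line.
skip_count=0
require_binary() {
    # $1 = directory to look in, $2 = tool name
    if [ -x "$1/$2" ]; then
        return 0
    fi
    skip_count=$((skip_count + 1))
    echo "SKIP: $2 not found in $1"
    return 1
}

# Chain form: when the tool is missing, the && list's status 1 becomes
# the function's return value.
chain_form() {
    require_binary /nonexistent-fixtures sometool && echo "checked sometool"
}

# if form: a false condition with no else branch leaves status 0.
if_form() {
    if require_binary /nonexistent-fixtures sometool; then
        echo "checked sometool"
    fi
}

# Run each form in a subshell with errexit on and see whether the line
# after the call is reached. The outer shell keeps errexit off so it can
# observe the subshell's exit status.
set +e
chain_out=$(set -e; chain_form; echo "summary reached")
chain_rc=$?
if_out=$(set -e; if_form; echo "summary reached")
if_rc=$?

# Called directly, the if form records the skip in the current shell.
if_form >/dev/null

echo "chain form: rc=$chain_rc"
echo "if form:    rc=$if_rc"
echo "skips recorded: $skip_count"
```

With the chain form the subshell exits before printing its summary line; with the if form the skip is reported and the run continues to the summary.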
New Features
- test-proctitle-host linked against in-tree proctitle.o; runs in make check.
- PROCTITLE_LOW_STACK_KIB, with distinct exit codes for ulimit setup vs guest failure.
- test-syscall-fidelity (fchmodat2, getcpu, openat2 RESOLVE_*, O_PATH, madvise, low-hint mmap) and test-fd-family (signalfd EFAULT-preserves-pending); split /proc into test-proc-fidelity with explicit SKIPs and fail-hard on missing oom_adj. Manifest reorganized into /proc, syscall fidelity, and fd-family sections.
- /mnt/fuse when SYSROOT_DIR is absent.

Refactors
- evaluate_result now requires exact expected_rc; default ALLOW_MISSING_BINARIES=0 (strict).
- run/run_check in timeout; use bash EPOCHREALTIME to detect harness timeouts; timeouts are FAILs.
- (test_skip_missing_tool, require_binary, skip_suite); require rc==0 before trusting regex.
- set -e for sysroot rename/create-paths.
- bash -c "set -o pipefail; ..." to surface producer failures.

Written for commit d3b4800.
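The evaluate_result change can be sketched as two acceptance rules side by side. This is a hypothetical reconstruction from the description in this pull request, not the code in tests/driver.sh; the function names `old_accepts`, `new_accepts`, and `verdict` are invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction; the real evaluate_result may differ in
# name, arguments, and reporting.

# Old rule as described: rc==0 OR (expected set AND rc==expected).
old_accepts() {
    local rc=$1 expected=${2:-}
    [ "$rc" -eq 0 ] || { [ -n "$expected" ] && [ "$rc" -eq "$expected" ]; }
}

# New rule: the exit code must equal the expectation, which defaults
# to 0 when none is declared.
new_accepts() {
    local rc=$1 expected=${2:-0}
    [ "$rc" -eq "$expected" ]
}

# Turn a rule's exit status into a verdict string.
verdict() { if "$@"; then echo OK; else echo FAIL; fi; }

old_buggy=$(verdict old_accepts 0 42)   # test exited 0, expected 42
new_buggy=$(verdict new_accepts 0 42)
new_good=$(verdict new_accepts 42 42)   # test exited 42, expected 42
new_plain=$(verdict new_accepts 0)      # no expectation declared

echo "old rule, rc=0 vs expected_rc=42:  $old_buggy"
echo "new rule, rc=0 vs expected_rc=42:  $new_buggy"
echo "new rule, rc=42 vs expected_rc=42: $new_good"
echo "new rule, rc=0, no expected_rc:    $new_plain"
```

The first line of output shows the silent pass the old rule allowed; under the new rule the same case is a FAIL while the two legitimate cases still pass.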