Prep extra for release: cleanup, security fixes, dep bumps #66
Merged
…er, and benchmark infrastructure
Core library changes:
- Replace dense float[len_a * len_b] consistency bonus matrix with sparse per-row structure (K slots per row). Fixes integer overflow crash for large DNA families (profiles > 46k columns) and reduces memory from 10 GB to 3.2 MB for 50k x 50k case. Bit-exact identical results for all existing cases.
- Add ensemble alignment with POAR consensus merging, configurable number of runs, and Hirschberg midpoint perturbation for diversity
- Add alignment-guided UPGMA tree rebuild (realign) within each ensemble run
- Add sequence weight rebalancing for profile merging
- Add variable scoring matrix (VSM) support
- Add anchor consistency bonus for progressive alignment guidance
- Add detailed alignment comparison (recall/precision/F1/TC) with BAliBASE XML core block mask support
- Expose new parameters through C API, CLI, and Python bindings
Benchmark infrastructure:
- Add NSGA-III multi-objective optimizer with pymoo mixed variables (Choice/Integer/Real) for proper categorical parameter exploration
- Add benchmark datasets: BAliBASE, BRAliBASE, MDSA DNA, BaliFam100
- Add Pareto front visualization with Dash app
- Validate downloaded tarballs to detect corrupt/HTML error pages
- Add fallback URLs for BAliBASE downloads
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
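The overflow and memory figures quoted for the sparse consistency structure can be checked with a little arithmetic. The sketch below is illustrative only: the slot count (K = 8) and the 8-byte slot layout (4-byte column index plus 4-byte float) are assumptions chosen because they reproduce the 3.2 MB figure, not values taken from the kalign source.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit index can hold


def dense_cells(len_a, len_b):
    """Number of cells in a dense len_a x len_b matrix."""
    return len_a * len_b


def fits_int32(n):
    """True if n can be stored in a signed 32-bit integer."""
    return n <= INT32_MAX


def dense_bytes(len_a, len_b, cell_bytes=4):
    """Memory for a dense float matrix (4-byte floats)."""
    return len_a * len_b * cell_bytes


def sparse_bytes(len_a, k_slots=8, slot_bytes=8):
    """Memory for a per-row sparse layout.

    ASSUMPTION: K = 8 slots per row, each slot a 4-byte column index
    plus a 4-byte float. These numbers are hypothetical.
    """
    return len_a * k_slots * slot_bytes


if __name__ == "__main__":
    print(fits_int32(dense_cells(46340, 46340)))  # last square size that fits
    print(fits_int32(dense_cells(46341, 46341)))  # first square size that does not
    print(dense_bytes(50000, 50000) / 1e9, "GB dense")
    print(sparse_bytes(50000) / 1e6, "MB sparse")
```

The square root of 2^31 is just under 46341, which is consistent with the "profiles > 46k columns" threshold in the commit message.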
Simplify Python API to mode-based presets (fast/default/accurate) as the only public interface. Remove all legacy backward-compatibility code and deprecated parameters (ensemble, refine, vsm_amax, etc. from align*()).
Add per-run support for vsm_amax and refine in the NSGA-III optimizer via optional array parameters in ensemble_custom_file_to_file(). The C binding also accepts per-run seq_weights, realign, consistency_anchors, and consistency_weight. This lets the optimizer discover diverse ensemble configurations where each run uses different scoring and refinement.
- kalign_run_config: 14-field per-run struct, single C entry point
- Public API: mode + optional gap penalty overrides only
- Optimizer: ensemble_custom_file_to_file() with per-run arrays
- Search space: 41 dimensions (7 per-run × 5 slots + 6 shared)
- Old v1 checkpoints can still be resumed (auto-expanded to per-run)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d presets for protein/RNA/DNA
Major features:
- POAR-based consistency merge for ensemble alignment: extracts pairwise residue consistency scores from the POAR table (already built during ensemble) and uses them as bonus weights in a final progressive alignment. Controlled by kalign_ensemble_config.consistency_merge (0=POAR consensus, 1=consistency re-alignment). Tested by NSGA-III optimizer but POAR consensus consistently outperformed it on BAliBASE.
- Kimura two-parameter nucleotide substitution matrices (1PAM, 20PAM, 200PAM) with kappa=2 transition/transversion ratio. Gives the optimizer matrix diversity for DNA/RNA, analogous to PFASUM43/60/CB66 for protein. K200 strongly preferred by optimizer for both RNA and DNA.
- Farthest-first anchor selection for guide tree distance computation. Replaces length-stratified sampling with BPM-based diversity selection. Performance equivalent on BAliBASE but more principled.
- 12 NSGA-III optimized presets: 4 protein (BAliBASE gen 41), 4 RNA (BRAliBASE gen 88), 4 DNA (MDSA gen 100). Each biotype has fast, default, recall, and accurate modes. Nucleotide optimization run (combined BRAliBASE + MDSA) in progress to produce unified presets.
Protein presets (BAliBASE, 218 cases):
  fast:     R=0.815 P=0.674 F1=0.732    7s (single P60)
  default:  R=0.786 P=0.722 F1=0.747   40s (single P60, inline refine)
  recall:   R=0.841 P=0.728 F1=0.776  370s (ens5 CB66/P60/P43, realign=1)
  accurate: R=0.787 P=0.845 F1=0.807  502s (ens5 P43/CB66, realign=1)
RNA presets (BRAliBASE, 599 cases; all beat MUSCLE F1=0.825):
  fast:     R=0.832 P=0.825 F1=0.828    5s (single K200)
  default:  R=0.804 P=0.869 F1=0.835    8s (ens3 K200/K20)
  recall:   R=0.833 P=0.826 F1=0.829    6s (single K200)
  accurate: R=0.811 P=0.863 F1=0.836   26s (ens3 K200/K20, realign=2)
DNA presets (MDSA, 325 cases):
  fast:     R=0.741 P=0.788 F1=0.764   18s (ens3 K200/K20, realign=1)
  default:  R=0.737 P=0.816 F1=0.775   35s (ens3 K200/K20, realign=1)
  recall:   R=0.760 P=0.770 F1=0.765   65s (ens5 K200, realign=1)
  accurate: R=0.737 P=0.816 F1=0.775   35s (=default, optimizer converged)
Optimizer changes:
- Objectives changed from (F-beta, TC, time) to (recall, precision, time) for richer Pareto fronts
- Combined "nucleotide" dataset (BRAliBASE + MDSA) for unified optimization
- Checkpoint backfill for consistency_merge and per-run vsm_amax/refine
- Dashboard shows recall/precision columns, consistency merge status
- Pareto front seed merging from separate RNA + DNA checkpoints
Other changes:
- kalign_ensemble_config gains consistency_merge and consistency_merge_weight
- msa_struct gains poar_consistency void* for non-owning POAR reference
- aln_run.c dispatches poar_consistency before anchor_consistency
- pick_anchor.h exposes pick_anchor_n() for configurable anchor count
- Python __init__.py adds "recall" to _PRESET_MODES
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
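For readers unfamiliar with farthest-first selection, here is a minimal generic sketch of the greedy traversal. It is not kalign's implementation: the commit uses a BPM-based distance, whereas this example substitutes a plain Hamming distance on equal-length strings, and the function names are hypothetical.

```python
def hamming(a, b):
    """Toy distance: number of mismatching positions (equal-length strings)."""
    return sum(1 for x, y in zip(a, b) if x != y)


def farthest_first(items, k, dist, start=0):
    """Greedy farthest-first traversal.

    Repeatedly picks the item whose distance to its nearest already
    chosen anchor is largest. Returns indices into `items`.
    """
    if not items or k <= 0:
        return []
    chosen = [start]
    # min_d[i] = distance from item i to its closest chosen anchor
    min_d = [dist(items[start], x) for x in items]
    min_d[start] = -1  # mark as taken so it is never picked again
    while len(chosen) < min(k, len(items)):
        nxt = max(range(len(items)), key=lambda i: min_d[i])
        chosen.append(nxt)
        min_d[nxt] = -1
        for i, x in enumerate(items):
            if min_d[i] >= 0:
                min_d[i] = min(min_d[i], dist(items[nxt], x))
    return chosen


if __name__ == "__main__":
    seqs = ["AAAA", "AAAT", "TTTT", "AATT"]
    print(farthest_first(seqs, 3, hamming))
```

Compared with sampling by length, this kind of selection spreads the anchors across the distance space, which is the "diversity" property the commit message refers to.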
…et_nucleotide()
Replaces preset_rna() and preset_dna() with a single preset_nucleotide() optimized on combined BRAliBASE (599 RNA) + MDSA (325 DNA) dataset.
Nucleotide presets (combined BRAliBASE + MDSA, gen 100):
  fast:     R=0.792 P=0.788 F1=0.790    5s (single K200, realign=1)
  default:  R=0.773 P=0.842 F1=0.806   26s (ens3 K20/K200, realign=1)
  recall:   R=0.800 P=0.796 F1=0.798   17s (single K200, realign=1, refine=C)
  accurate: R=0.760 P=0.867 F1=0.810  100s (ens5 K200/K1, realign=1, ms=3)
Dispatch is now: protein vs nucleotide (no RNA/DNA distinction). Kalign auto-detects sequence type; user only picks mode.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Self-contained threadpool library in lib/src/threadpool/ with:
- Lock-free Chase-Lev deques (per-worker LIFO) + global ext queue
- Three parallelism patterns: parallel_for, fork-join groups, recursive tasks
- Event-count sleeping, per-worker group recycling, work-stealing
- 17 unit tests, 10 stress tests (TSan-clean), OpenMP comparison benchmarks
- Standalone CMake build (also builds as part of kalign)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace OpenMP with a built-in Chase-Lev work-stealing thread pool as the default parallelization backend. Falls back to serial if pthreads is unavailable. OpenMP remains available via -DUSE_OPENMP=ON.
Threadpool is 1.5-2.5x faster than OpenMP across protein and nucleotide benchmarks on both ARM (M3) and x86-64 (Threadripper) hardware.
Key changes:
- Threadpool ON / OpenMP OFF by default in CMakeLists.txt
- All parallel regions (distance matrix, k-means, Hirschberg, anchor selection, pairwise distances) wired for both backends
- tp_parallel_for_chunked() with configurable minimum chunk size
- Compile-time parallelization thresholds in one place (CMakeLists.txt): ALN_SERIAL_THRESHOLD=500, KMEANS_UPGMA_THRESHOLD=50, DIST_MIN_SEQS=50, PFOR_MIN_CHUNK=10
- macOS Python wheels use threadpool (no libomp.dylib dependency)
CLI unified with Python API: both use NSGA-III optimized mode presets via kalign_get_mode_preset() + kalign_align_full():
  kalign --mode fast|default|recall|accurate
Gap penalty overrides (--gpo/--gpe/--tgpe) work with all modes. Removed dead CLI options (--ensemble, --refine, --consistency, etc.) that are now managed by mode presets.
All three entry points (C CLI, Python API, kalign-py CLI) expose identical options and produce identical results.
Bug fixes:
- bisectingKmeans: base-case guard for num_samples <= 1
- bisectingKmeans: out-of-bounds seed_idx in split2 k-means loop
- threadpool.c: missing #include <stdint.h> for uintptr_t
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
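The idea behind a minimum chunk size in a chunked parallel-for can be sketched as a range partition. The commit does not describe how tp_parallel_for_chunked() actually splits work, so the function below is a hypothetical illustration of one reasonable policy: never create more chunks than workers, and never create chunks smaller than the minimum (falling back to a single serial chunk for small ranges).

```python
def chunk_ranges(n, n_workers, min_chunk=10):
    """Split [0, n) into contiguous half-open ranges.

    HYPOTHETICAL policy, not taken from the kalign threadpool:
    at most n_workers chunks, each at least min_chunk items long,
    with a single chunk when n is smaller than min_chunk.
    """
    if n <= 0:
        return []
    n_chunks = max(1, min(n_workers, n // min_chunk))
    base, extra = divmod(n, n_chunks)
    ranges = []
    start = 0
    for c in range(n_chunks):
        size = base + (1 if c < extra else 0)  # spread the remainder evenly
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    print(chunk_ranges(100, 4))  # four equal chunks
    print(chunk_ranges(25, 8))   # only two chunks, each >= 10 items
    print(chunk_ranges(5, 8))    # too small to split
```

A floor on chunk size keeps scheduling overhead from dominating when the per-item work is cheap, which is presumably why a threshold such as PFOR_MIN_CHUNK exists.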
Major refactor of the C library internals for better threading:
API consolidation:
- Replace 7 redundant entry points (kalign_run, kalign_run_seeded, kalign_run_dist_scale, kalign_run_realign, kalign_post_realign, kalign_ensemble, kalign_ensemble_custom) with one internal kalign_single_run() + kalign_align_full() public entry point
- One threadpool created per kalign_align_full() call, shared across all work (ensemble runs, tree traversal, anchor consistency, etc.)
Parallelized components (all via threadpool fork-join or parallel-for):
- Inline refine tree traversal (create_msa_tree_inline_refine)
- Hirschberg fwd/bwd within inline refine edges
- Concurrent ensemble runs (5 runs share the global pool)
- Anchor consistency build (N×K pairwise DPs)
- POAR extraction (per-pair, disjoint writes)
- POAR scoring (per-row accumulation + sequential reduction)
- Consensus candidate enumeration (two-pass count+fill)
- Residue confidence computation (per-sequence)
Realign tree improvement:
- Replace O(N³) UPGMA on N×N distance matrix with O(N·K·log N) bisecting k-means on N×K anchor distances from aligned sequences
- Add pair_dist_fn callback to bisecting_kmeans for pluggable leaf cluster distance computation (BPM for initial tree, identity for realign)
- No N×N matrix allocated at any point in the realign path
Quality: fast/default byte-identical to previous version. Recall F1 -0.001, accurate F1 -0.004 on BAliBASE (218 cases, XML core block scoring). Threading is deterministic across thread counts.
Speedup at 8 threads (DSSim 1000 sequences): fast 2.1x, default 2.4x, recall 1.3x, accurate 1.4x
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Expose per-column ensemble confidence scores to users. Low-confidence columns can be masked (lowercase residues) or removed (replaced with gaps), useful for phylogenetics and structure prediction pipelines where uncertain alignment regions should be excluded.
C library:
- kalign_mask_by_confidence(msa, threshold, style) in msa_op.c
- kalign_write_confidence(msa, path) for raw score output
- Styles: KALIGN_MASK_LOWERCASE (default), KALIGN_MASK_REMOVE
CLI:
- --confidence-threshold FLOAT (0-1, requires ensemble mode)
- --confidence-style (lowercase/remove)
- --confidence-output FILE
Python API:
- kalign.mask_alignment(result, threshold, style) → AlignedSequences
- kalign.filter_alignment(result, threshold) → AlignedSequences
- kalign.write_confidence(path, result)
Gracefully warns and skips when confidence is unavailable (non-ensemble modes). 11 Python tests covering all paths.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
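The two masking styles can be illustrated with a small standalone function. This is a sketch of the behaviour described above, not the kalign API: the function name, the plain-string row representation, and the choice to mask columns strictly below the threshold are all assumptions.

```python
def mask_columns(rows, confidence, threshold, style="lowercase"):
    """Mask alignment columns whose confidence is below `threshold`.

    rows        list of equal-length aligned strings
    confidence  one score in [0, 1] per alignment column
    style       "lowercase" lowercases residues, "remove" writes gaps

    ASSUMPTION: columns with confidence < threshold are masked; whether
    the real implementation uses < or <= is not stated in the source.
    """
    if style not in ("lowercase", "remove"):
        raise ValueError(f"unknown style: {style!r}")
    if any(len(r) != len(confidence) for r in rows):
        raise ValueError("every row must have one character per column")
    masked = []
    for row in rows:
        out = []
        for ch, score in zip(row, confidence):
            if score >= threshold:
                out.append(ch)            # confident column: keep as is
            elif style == "lowercase":
                out.append(ch.lower())    # gap characters are unaffected
            else:
                out.append("-")
        masked.append("".join(out))
    return masked


if __name__ == "__main__":
    aln = ["ACG-T", "AC-GT"]
    conf = [0.9, 0.2, 0.8, 0.1, 1.0]
    print(mask_columns(aln, conf, 0.5, "lowercase"))
    print(mask_columns(aln, conf, 0.5, "remove"))
```

Both styles keep the alignment length unchanged, so downstream tools that index by column still see the same coordinates.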
New feature: align new sequences against an existing alignment without re-aligning the existing sequences. Builds a consensus profile from the existing alignment, aligns each new sequence via seq-to-profile Hirschberg DP, and inserts gaps to match the column structure.
Strict mode (default): no new columns are introduced. Insertions in new sequences relative to the existing alignment are dropped.
C library:
- kalign_add_sequences(existing, new_seqs, n_threads) in aln_add.c
- kalign_read_sequences() allows single-sequence input (for --add)
- Consensus profile built from column frequencies + substitution scores
CLI:
- kalign --add new.fa --existing aligned.fa -o combined.fa
- Both --add and --existing required together
- Existing sequences preserved byte-identical in output
Python API:
- kalign.add_to_alignment(existing, new_seqs, output, format, n_threads)
- _core.add_to_alignment_file() pybind11 binding
Tests: 6 Python tests (basic add, existing unchanged, residue preservation, alignment length, file-not-found, larger dataset with 44 sequences). 15/15 C tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
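Strict mode amounts to projecting a sequence-to-profile alignment back onto the existing columns. The sketch below assumes the pairwise result is available as two equal-length gapped strings, with "X" marking an existing column and "-" marking an insertion in the new sequence; that representation and the function name are hypothetical and only serve to show why the output length never changes.

```python
def project_strict(profile_row, new_row):
    """Project a new sequence onto the existing alignment columns.

    profile_row  'X' for an existing column, '-' where the new sequence
                 has an insertion relative to the profile
    new_row      the new sequence aligned against the profile, with '-'
                 where it has a deletion

    Returns (gapped_sequence, dropped_residues). Positions where the
    profile has a gap are dropped, so no new columns are introduced.
    """
    if len(profile_row) != len(new_row):
        raise ValueError("pairwise alignment rows must have equal length")
    kept = []
    dropped = 0
    for p, n in zip(profile_row, new_row):
        if p == "-":
            dropped += 1       # insertion relative to existing alignment
        else:
            kept.append(n)     # residue or gap lands in an existing column
    return "".join(kept), dropped


if __name__ == "__main__":
    gapped, dropped = project_strict("XX-XX", "ACGT-")
    print(gapped, dropped)
```

Because only profile columns survive, the existing rows can be written out unchanged, which matches the "preserved byte-identical" guarantee listed for the CLI.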
…mark runner
- Add missing #include <mm_malloc.h> with HAVE_AVX2 guards in aln_apair_dist.c and aln_wrap.c (fixes undefined symbol on GCC 10)
- Update build.zig to zig 0.15 API (addLibrary/createModule), add aln_add.c
- Rewrite benchmark runner: replace --refine with --mode (fast/default/recall/accurate), simplify work dispatch, add per-category SP/Prec/F1/TC summary tables
- Containerfile: clone from git instead of COPY, no external binary dependencies
- Containerfile.downstream: same git-based approach
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, defensive buffer init
align_from_file_mode in _core.cpp wrote alignments to a fixed path /tmp/kalign_output.fa and read them back. Parallel benchmark subprocesses contending for this file produced corrupted output (wrong sequence counts, wrong names, embedded null bytes). Replaced with direct reads from the msa struct.
finalise_alignment now asserts that every sequence's len + sum(gaps) matches the aln_len computed from sequence 0, converting silent gap-invariant violations into clear errors. Also memsets the linear buffer to '-' so any unwritten positions stay as gap characters rather than uninitialized memory on glibc.
Same defensive init applied to kalign_msa_to_arr and the strict-mode gapped buffer in aln_add, where partial population is plausible.
msa_seq_cpy fix: copy gaps[src->len] instead of gaps[src->alloc_len].
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
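The gap invariant and the defensive '-' prefill can be sketched in a few lines. This assumes a layout in which gaps[i] counts the gap characters before residue i and gaps[len] counts the trailing gaps; that reading is inferred from the msa_seq_cpy fix above and is an assumption, as is the function name.

```python
def linearise(seq, gaps, aln_len):
    """Expand (seq, gaps) into a gapped row of length aln_len.

    ASSUMED layout: gaps has len(seq) + 1 entries; gaps[i] is the number
    of gap characters before residue i, gaps[len(seq)] the trailing gaps.
    """
    if len(gaps) != len(seq) + 1:
        raise ValueError("gaps must have len(seq) + 1 entries")
    if len(seq) + sum(gaps) != aln_len:
        # The invariant check: fail loudly instead of writing a bad row.
        raise ValueError(
            f"len + sum(gaps) = {len(seq) + sum(gaps)}, expected {aln_len}"
        )
    buf = ["-"] * aln_len  # prefill so unwritten positions stay gaps
    pos = 0
    for i, ch in enumerate(seq):
        pos += gaps[i]
        buf[pos] = ch
        pos += 1
    return "".join(buf)


if __name__ == "__main__":
    print(linearise("ACG", [1, 0, 2, 1], 7))
```

Prefilling with gap characters means a partially populated buffer degrades to extra gaps rather than to whatever bytes the allocator happened to return.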
Refreshes uv.lock via `uv lock --upgrade`. All flagged transitive deps (cryptography, flask, werkzeug, pillow, pygments, pytest, urllib3, requests, mako, black) now resolve to versions past their CVE fixes. pip-audit reports no known vulnerabilities. Test suite unaffected.
The threadpool is now the default parallelism backend (USE_OPENMP=OFF, USE_THREADPOOL=ON), so the OpenMP env var has no effect. Removed from both CIBW_ENVIRONMENT in wheels.yml and the [tool.cibuildwheel.environment] block in pyproject.toml.
The C library, CLI, and Python all expose four NSGA-III-derived presets: fast, default, recall, accurate. The 'precise' name was a deprecated alias for 'accurate' in the Python layer only; drop it now that 'extra' is heading to main.
- python-kalign/__init__.py: remove MODE_PRECISE constant, the precise->accurate aliasing branch with DeprecationWarning, and the __all__ entry. Update ValueError text to list all four modes.
- lib/include/kalign/kalign.h: add 'recall' to the kalign_get_mode_preset docstring (was listing only fast/default/accurate).
- README-python.md: rewrite Modes table to show all four modes with the correct CLI syntax (--mode <name>, not --fast/--precise which never existed); update Quick-Start and Ensemble examples accordingly.
- tests: replace test_precise_mode cases with recall/accurate equivalents; drop MODE_PRECISE from constants assertion and expected exports.
Prepares the repo for public release by dropping material that belongs
to the manuscript pipeline rather than the kalign library itself.
Optimizers (moved to ~/Work/Documents/Manuscripts/2026_kalign_35/
scripts/optimizers before deletion; preserved there for reproducibility
of the NSGA-III preset numbers shipped in 3.5):
- benchmarks/optimize_params.py
- benchmarks/optimize_unified.py
- benchmarks/optimize_ensemble.py
- benchmarks/optimize_parallel.py
- benchmarks/PRD_unified_optimizer.md
- benchmarks/PRD_ensemble_optimizer.md
Analysis / visualisation scripts (paper repo has its own equivalents):
- benchmarks/analysis.py
- benchmarks/app.py
- benchmarks/view_pareto.py
- benchmarks/mumsa_plots.py, mumsa_precision.py
- benchmarks/combined_improvements.py
- benchmarks/full_comparison.py
- benchmarks/external_balibase.py
- benchmarks/make_summary_figure.py
- benchmarks/eval_checkpoint_configs.py
- benchmarks/vsm_ensemble_experiment.py
- benchmarks/bench_quality_timing.py
- benchmarks/run_balibase_comparison.py
Other paper-side material:
- benchmarks/PRD_kalign_align_full.md (superseded by docs/PRD-parameter-cleanup.md)
- docs/PRD-benchmark-repo-update.md (hard-coded paper-repo paths)
- Containerfile.downstream (explicitly labelled paper container)
Kept:
- benchmarks/runner.py, datasets.py, scoring.py (invoked by CI workflow benchmark.yml)
- benchmarks/downstream/* (covered by tests/python/test_downstream_integration.py)
- Containerfile, Containerfile.memcheck
- docs/PRD-{msa-consistency,confidence-masking-and-add-sequences}.md
- PRD_sparse_consistency.md
Verified: benchmark package still imports; pytest tests/python/ passes
(170 passed, 1 pre-existing test_module_exports failure unrelated to
this cleanup).
Six high-confidence dead items identified during pre-release audit.
No callers anywhere in the source tree; not compiled, not exposed
via the public API or bindings.
- lib/src/coretralign.{c,h}: pthread-based scheduler superseded by
lib/src/threadpool/; uncompiled (commented out in lib/CMakeLists.txt).
- lib/src/mod_tldevel.h: 10-line wrapper header with zero includes.
- lib/src/bpm.c bitShiftRight256ymm(): AVX2 helper, never called.
Carried a stale "FIXME: not sure if this is correct!!!" comment.
- lib/src/bisectingKmeans.c split(): replaced by split2() (parallel
variant); never called.
- python-kalign/io.py: unused 'import os'.
- python-kalign/utils.py: unused 'Tuple' import.
Verified: ctest 15/15 pass; pytest tests/python/ 170 pass (1 pre-existing
test_module_exports failure unchanged). ~547 lines removed in total.
- PRD_sparse_consistency.md, docs/PRD-msa-consistency.md: replace references to `optimize_unified.py` with a note pointing at the manuscript repository's scripts/optimizers/ copy. The optimizer itself was moved out of kalign in commit dd01498.
- docs/PRD-parameter-cleanup.md: add to the repo. Adds a brief status note acknowledging that a fourth mode (`recall`) was added during implementation; the rest of the architecture description is current.
Build now compiles warning-free.
Warning fixes:
- lib/src/euclidean_dist.c: drop unused local 'd2' in the UTEST_EDIST
main() block.
- lib/src/msa_io.c: drop unused local 'line_len' (set on every iteration
of the line-scan loop but never read).
- lib/src/msa_io.c: change 'size_t nread' to 'ssize_t nread' so the
getline() return-value check against -1 is signed-correct.
- tests/kalign_lib_testCXX.cpp: cast string-literal initialisers to
char* so the array-of-char* initialisation no longer warns under
-Wwritable-strings.
CI cleanup:
- .github/workflows/{cmake,python,benchmark}.yml: drop apt 'libomp-dev'
and brew 'libomp' installs. Threadpool is the default parallelism
backend (USE_OPENMP=OFF, USE_THREADPOOL=ON) so libomp is no longer
needed for these jobs. The explanatory comment in wheels.yml remains.
The four PRDs and the parameter-cleanup integration guide were internal planning documents describing work that's now complete. The current state of the API is documented in the public C header, the READMEs, the CLI --help, the Python docstrings, and the ChangeLog. The design rationale, where it's still relevant, lives in git history.
Removed:
- PRD_sparse_consistency.md
- docs/PRD-msa-consistency.md
- docs/PRD-confidence-masking-and-add-sequences.md
- docs/PRD-parameter-cleanup.md
- docs/parameter-cleanup-integration.md
The docs/ directory is now empty and dropped from the tree; verified no README, code, or CI workflow references any of these files.
Two related fixes in the binary POAR loader (lib/src/poar.c). Both guard against malformed POAR files supplied via --load-poar.
1. Pair-count overflow on numseq >= 65536. The expression `numseq * (numseq - 1) / 2` was evaluated in uint32_t and then cast to int, producing wrap to a negative value for numseq >= 65536 and outright uint32 overflow for numseq >= 65537. Now computed in uint64_t and rejected if it exceeds INT_MAX.
2. Per-pair n_entries unbounded by file. The 32-bit per-pair entry count was cast to int (could wrap negative) and multiplied by sizeof(struct poar_entry) without overflow check. Now capped at INT_MAX / sizeof(struct poar_entry).
Verified: ctest 15/15 pass; pytest tests/python/ 170 pass (one pre-existing test_module_exports failure unchanged).
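The shape of both guards can be shown by emulating 32-bit unsigned arithmetic with a mask. This is a sketch of the checking pattern only, not the code in poar.c: the function names are hypothetical, and the 8-byte entry size used in the demo is an assumption since sizeof(struct poar_entry) is not given.

```python
INT_MAX = 2**31 - 1
UINT32_MASK = 0xFFFFFFFF


def pair_count_uint32(numseq):
    """Emulate numseq * (numseq - 1) / 2 with the product held in uint32."""
    return ((numseq * (numseq - 1)) & UINT32_MASK) // 2


def pair_count_checked(numseq):
    """Compute the pair count in wide arithmetic and reject values > INT_MAX."""
    pairs = numseq * (numseq - 1) // 2  # Python ints do not overflow
    if pairs > INT_MAX:
        raise ValueError(f"pair count {pairs} exceeds INT_MAX")
    return pairs


def check_entries(n_entries, entry_size):
    """Reject a per-pair entry count whose byte size would exceed INT_MAX.

    entry_size stands in for sizeof(struct poar_entry); the real value
    is not stated in the source.
    """
    if n_entries > INT_MAX // entry_size:
        raise ValueError(f"n_entries {n_entries} too large")
    return n_entries * entry_size


if __name__ == "__main__":
    print(pair_count_uint32(65537))   # product wraps; result is far too small
    print(pair_count_uint32(100000))  # also wrapped, silently wrong
    print(pair_count_checked(1000))   # small inputs are unaffected
    print(check_entries(1000, 8))     # 8-byte entries are an assumption
```

The wrapped values look like plausible positive counts, which is why a range check on the wide result is needed rather than a sign check on the narrow one.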
The hard-coded expected_exports set was stale relative to __all__ in python-kalign/__init__.py: four symbols added in earlier commits weren't reflected here:
- add_to_alignment (from 53abded, --add mode)
- filter_alignment (from 1002f0a, confidence masking)
- mask_alignment (from 1002f0a)
- write_confidence (from 1002f0a)
Full pytest tests/python/ now passes clean: 171 pass, 0 fail.
The earlier warning-cleanup commit (fb0d5c4) removed `float d2;` from euclidean_dist.c's UTEST_EDIST main(), trusting an Apple Clang -Wunused-variable warning. The warning was correct *for the Apple Silicon build*, where edist_utest is compiled with -DNOHAVE_AVX2 and the `#ifdef HAVE_AVX2` block containing the only use of d2 is preprocessed out. On Linux GCC builds with -DHAVE_AVX2, the block IS compiled and `edist_256(..., &d2)` references the now-missing variable — breaking cmake.yml, benchmark.yml, codeql, and wheels builds. Restore the declaration but guard it with the same `#ifdef HAVE_AVX2` as its only consumer, so neither codepath warns.
The Benchmark workflow has been silently broken for months because the BAliBASE download endpoint (http://www.lbgi.fr/balibase/...) is gone, and the Wayback Machine archive returns HTML instead of the tarball. Both failure modes are caught and surfaced by datasets.py, but the result is that the workflow has been failing on every push and the github-action-benchmark gh-pages history is stale. Serious benchmark tracking lives in the manuscript repository (~/Work/Documents/Manuscripts/2026_kalign_35/) which runs the full BAliBASE / BaliFam100 / MDSA suite via Snakemake. Removing the broken CI job is cleaner than carrying it indefinitely. If CI-side perf-regression detection is wanted later, the small 3-case BAliBASE subset in tests/data/ (BB11001, BB12006, BB30014) is the right starting point for a smoke benchmark.
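The "HTML instead of the tarball" failure mode is cheap to detect before unpacking. The following is a generic sketch, not the check in datasets.py: it assumes the download is a gzip-compressed tar archive held in memory, and the function name is hypothetical.

```python
import io
import tarfile
import zlib


def looks_like_tarball(data: bytes) -> bool:
    """Return True if `data` is a readable gzip-compressed tar archive.

    ASSUMPTION: the expected download is a .tar.gz file. An HTML error
    page fails the magic-byte test; a truncated or corrupt archive fails
    when the member list is read.
    """
    if data[:2] != b"\x1f\x8b":  # gzip magic number
        return False
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
            tf.getmembers()  # forces a full scan of the archive headers
        return True
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        return False


if __name__ == "__main__":
    print(looks_like_tarball(b"<html><body>404 Not Found</body></html>"))
```

Checking the payload rather than the HTTP status matters here because an archive mirror can return a successful response whose body is an error page.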
The DSSIM stress test was failing under ASAN on Linux. Root cause: the test built a kalign_run_config via kalign_run_config_defaults() (which returns a protein-oriented config with matrix = PFASUM43) and fed DNA sequences to it. The library correctly refused with "Detected DNA sequences but a protein matrix was selected" and returned FAIL. The test ignored the FAIL return and proceeded into kalign_msa_compare with two un-finalized MSAs, where in turn sort_msa_for_comparison ran with alnlen falling back to seq[0]->len and read past the original (tight-packed) seq->seq buffer that dssim_get_fasta had allocated, triggering the heap-buffer-overflow.
Production (CLI + Python bindings) is unaffected because both go through kalign_get_mode_preset(...), which picks a biotype-appropriate matrix automatically. This is a test-side bug that exposed two latent issues in the library error-handling.
Fix #1 (root cause): tests/dssim_test.c sets cfg.matrix = KALIGN_MATRIX_AUTO so the library resolves the matrix per biotype, and wraps kalign_align_full / kalign_msa_compare in RUN() so any future alignment failure aborts the test instead of being silently ignored.
Fix #2 (defence-in-depth): lib/src/msa_cmp.c: wrap the finalise_alignment(r/t) calls inside kalign_msa_compare, kalign_msa_compare_detailed, and kalign_msa_compare_with_mask in RUN() so a finalise failure propagates as a clean error instead of silently passing a half-finalized MSA into the comparator.
Fix #3 (latent library bug): lib/src/msa_op.c: make finalise_alignment atomic w.r.t. msa->sequences. Previously the per-sequence loop replaced seq->seq pointers one at a time; if make_linear_sequence failed on seq[k] the loop bailed leaving seq[0..k-1] swapped to the new buffer and seq[k..numseq-1] still pointing at the original (smaller) buffer, a structurally inconsistent state that any subsequent code reading the MSA would trip over. Now two-pass: build and validate every linear_seq first, then swap pointers only after the whole batch is verified. On error the MSA's buffers are untouched.
Fix #4 (API ergonomics): lib/include/kalign/kalign.h: header comment on kalign_run_config_defaults explaining the protein-by-default behaviour and pointing callers at KALIGN_MATRIX_AUTO for biotype auto-selection.
Verified: 15/15 ctest pass on macOS clang and on Linux GCC under ASAN inside the memcheck container; 171/171 pytest tests/python/ pass.
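The two-pass pattern from Fix #3 is a general "stage, validate, then commit" idiom. The sketch below shows it on a plain Python list; the function names and the equal-length validation rule are illustrative assumptions and do not mirror the C structures in msa_op.c.

```python
def finalise_all(rows, build):
    """Replace every element of `rows` with build(row), all or nothing.

    Pass 1 builds and validates every new row without touching `rows`.
    Pass 2 swaps the results in only after the whole batch succeeded,
    so a failure part-way through leaves `rows` exactly as it was.
    """
    staged = [build(r) for r in rows]      # may raise; rows still intact
    if len({len(s) for s in staged}) > 1:  # illustrative validation step
        raise ValueError("staged rows have inconsistent lengths")
    rows[:] = staged                       # commit in one step


if __name__ == "__main__":
    def pad_to_four(row):
        if "?" in row:
            raise ValueError("cannot finalise row containing '?'")
        return row.ljust(4, "-")

    ok = ["AC", "ACG"]
    finalise_all(ok, pad_to_four)
    print(ok)

    bad = ["AC", "A?G", "ACGT"]
    try:
        finalise_all(bad, pad_to_four)
    except ValueError:
        print(bad)  # unchanged: no row was half-updated
```

The cost is holding both the old and new buffers for the duration of the call, which is usually an acceptable trade for never exposing a partially updated structure.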
Packages the local pre-push checks into one command so the
working tree can be self-verified before pushing.
Phases (~5 min total):
  1. zig build      Cross-compile sanity across aarch64-macos,
                    aarch64-linux, x86_64-linux-{gnu,musl}.
                    Catches GCC-vs-Clang divergence.
  2. cmake + ctest  Native macOS Release build + 15 ctests.
  3. podman ASAN    Ubuntu container with kalign built under ASAN
                    and the full ctest suite. Catches Linux glibc
                    behaviour that Apple's malloc hides.
  4. pytest         Python bindings + ecosystem integration (~171
                    tests).
Each phase is independent: there is no short-circuit, so a single
failing phase doesn't prevent the others from running and reporting.
Skips gracefully if a tool isn't installed (e.g. podman) or its
environment isn't ready (machine not running).
Usage:
  tests/check-local.sh          # run all four (~5 min)
  tests/check-local.sh --quick  # skip Linux ASAN (~30s)
  tests/check-local.sh --help
Exits 0 only if every non-skipped phase passes.
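The control flow described for the checker (independent phases, graceful skips, exit status derived from non-skipped phases) can be sketched as follows. The real script is a shell script whose internals are not shown here, so this Python version, including every name in it, is a hypothetical illustration of the policy only.

```python
def run_phases(phases):
    """Run every phase regardless of earlier failures.

    phases: list of (name, is_available, action) where is_available()
    returns False when the tool is missing and action() returns True
    on success. Returns (results, exit_code).
    """
    results = {}
    for name, is_available, action in phases:
        if not is_available():
            results[name] = "skip"  # missing tool is not a failure
            continue
        try:
            ok = action()
        except Exception:
            ok = False              # a crashing phase counts as failed
        results[name] = "pass" if ok else "fail"
    # Exit 0 only if no phase that actually ran has failed.
    exit_code = 1 if "fail" in results.values() else 0
    return results, exit_code


if __name__ == "__main__":
    demo = [
        ("build", lambda: True, lambda: True),
        ("asan", lambda: False, lambda: True),  # tool not installed
        ("pytest", lambda: True, lambda: False),
    ]
    print(run_phases(demo))
```

Running every phase even after a failure gives one complete report per invocation, at the price of a longer wait when an early phase has already failed.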
Brings the file into compliance with the version of black used by the python.yml CI lint job. Purely mechanical reformatting (line wrapping of multi-argument calls); no behaviour change. Verified: pytest tests/python/ still passes 171/171.
Updated in all four locations so the build artefacts agree:
- pyproject.toml (PyPI metadata)
- CMakeLists.txt (KALIGN_LIBRARY_VERSION_PATCH)
- build.zig (zig cross-compile package version)
- ChangeLog (release notes summary)
Release theme: this is the first version that ships the four-mode preset system as the stable public interface (fast / default / recall / accurate), with the Chase-Lev threadpool replacing OpenMP as the default parallelism backend and macOS wheels no longer linking libomp.dylib, closing the conda-forge / numpy OpenMP runtime conflict reported on the issue tracker.
Verified: kalign --version reports 3.5.2; kalign.__version__ reports 3.5.2; pre-push checklist (tests/check-local.sh --quick) green prior to this commit.
Just a test