Multi pass decoding #256

Open
oscarhiggott wants to merge 4 commits into main from multi-pass-decoding

@oscarhiggott
Collaborator

Multi-pass decoding for Tesseract

Summary

This PR introduces a multi-pass decoding framework for the Tesseract decoder. The key idea is to exploit correlations between different error classes (e.g., X-type and Z-type errors in a CSS code, see https://arxiv.org/abs/1401.6975 and https://arxiv.org/abs/1310.0863) by decomposing the decoding problem into independent components, decoding them in sequence across multiple passes, and using the results of earlier passes to reweight error probabilities in later passes.

This approach is inspired by correlated decoding strategies (e.g., two-pass PyMatching), but is implemented as a general-purpose $N$-pass framework that can be used by more general decoders and on circuits that are not necessarily matchable.

Motivation

In CSS codes, some physical errors (e.g., $Y$ errors) can affect detectors associated with different error bases. A single-pass decoder treats all detectors monolithically, which can scale poorly and fails to exploit the structure of correlated noise.

Multi-pass decoding addresses this by:

  1. Partitioning detectors into independent components using a user-supplied classifier (e.g., by Pauli basis).
  2. Decomposing the detector error model (DEM) so that each error mechanism affects detectors from only one component.
  3. Sequentially decoding components across passes, using correlation-derived reweighting rules to improve accuracy on later passes.

Mathematical framework

DEM decomposition

Given a DEM error instruction with probability $p$ that flips detectors from multiple classes (e.g., $D_X \wedge D_Z$), the decomposition replaces it with independent component errors. The observable assignment is resolved by enumerating all consistent assignments across components via symmetric difference constraints:

$$\bigoplus_k O_k = O_{\text{original}}$$

where $O_k$ is the observable assignment for component $k$ and $\oplus$ denotes symmetric difference.
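The constraint above can be illustrated with a brute-force enumeration. This is a minimal sketch, not the PR's implementation: the function names and the choice of candidate observable universe are assumptions made for illustration. It assigns the first $n-1$ components freely and lets the constraint determine the last one.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of `universe`, smallest first."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def consistent_assignments(original, num_components, universe=None):
    """Yield tuples (O_1, ..., O_n) whose symmetric difference is `original`.

    `universe` is the set of observables a component may carry; by default
    (an assumption of this sketch) it is just the original observable set.
    """
    original = frozenset(original)
    universe = original if universe is None else frozenset(universe)
    for head in product(subsets(universe), repeat=num_components - 1):
        acc = frozenset()
        for o_k in head:
            acc ^= o_k  # running symmetric difference
        # The last component is fixed by the constraint.
        yield head + (original ^ acc,)


for assignment in consistent_assignments({0}, num_components=2):
    print([sorted(o) for o in assignment])
```

For an error that flips observable 0 and is split across two components, this yields two candidates: the observable flip is attributed either to the second component or to the first.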

Correlation extraction

Before decomposition, the raw DEM is analyzed to extract cross-component correlations. For each pair of hyperedges $(h_A, h_B)$ that co-occur in a decomposed error instruction with probability $p$:

  • Marginal probabilities are accumulated via XOR composition: $P(h) \leftarrow P(h)(1-p) + p(1-P(h))$
  • Joint probabilities $P(h_A \wedge h_B)$ are tracked similarly for co-occurring hyperedges
  • Conditional probabilities are derived as: $P(h_B \mid h_A) = P(h_A \wedge h_B) / P(h_A)$

These conditional probabilities form the reweighting rules: if error $A$ in component $C_A$ is predicted in pass $k$, the likelihood cost of error $B$ in component $C_B$ is updated using $P(B \mid A)$ for pass $k+1$.
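The accumulation described above can be sketched as follows. This is an illustration under stated assumptions rather than the code in src/error_correlations: hyperedges are represented as plain strings, the function names are hypothetical, and joint probabilities are XOR-composed in the same way as marginals, as the text describes.

```python
from collections import defaultdict
from itertools import combinations


def xor_compose(q, p):
    """Probability that exactly one of two independent events (q, p) occurs."""
    return q * (1.0 - p) + p * (1.0 - q)


def extract_correlations(instructions):
    """instructions: iterable of (p, hyperedges), one per decomposed error."""
    marginal = defaultdict(float)  # P(h)
    joint = defaultdict(float)     # P(h_A and h_B), keyed by a sorted pair
    for p, hyperedges in instructions:
        parts = sorted(set(hyperedges))
        for h in parts:
            marginal[h] = xor_compose(marginal[h], p)
        for pair in combinations(parts, 2):
            joint[pair] = xor_compose(joint[pair], p)

    conditional = {}  # conditional[(a, b)] = P(b | a)
    for (a, b), p_ab in joint.items():
        if marginal[a] > 0.0:
            conditional[(a, b)] = p_ab / marginal[a]
        if marginal[b] > 0.0:
            conditional[(b, a)] = p_ab / marginal[b]
    return dict(marginal), dict(joint), conditional


marginal, joint, conditional = extract_correlations([
    (0.1, ["X:D0", "Z:D5"]),  # one error split across two components
    (0.2, ["X:D0"]),          # an error confined to the X component
])
print(conditional[("X:D0", "Z:D5")])  # P(Z:D5 | X:D0)
```

In this toy input the X hyperedge also occurs on its own, so observing it only partially implies the Z hyperedge, whereas the Z hyperedge never occurs without the X one.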

Pass scheduling

Two scheduling strategies are implemented:

Static scheduling: All components are decoded in every pass. Simple but potentially redundant.

Causal scheduling: Pass sets $S_1, S_2, \ldots, S_N$ are determined by back-propagation from the final pass:

  • $S_N = \{C : C \text{ affects at least one logical observable}\}$
  • $S_k = \{C : \exists \text{ error } A \in C \text{ that reweights an error } B \in C' \text{ for some } C' \in S_{k+1}\}$

This ensures that only components whose predictions can influence the final logical outcome are decoded, reducing unnecessary computation.
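The back-propagation can be sketched in a few lines. This is a simplified model, assuming the reweighting rules have already been summarised as a map from each source component to the set of components it reweights; the function and argument names are hypothetical.

```python
def causal_schedule(num_passes, observable_components, reweights):
    """Return [S_1, ..., S_N] as a list of sets of component IDs.

    observable_components: components that affect at least one observable.
    reweights: dict mapping a source component to the components
               containing errors that it reweights.
    """
    schedule = [set() for _ in range(num_passes)]
    schedule[-1] = set(observable_components)  # S_N
    # Walk backwards: S_k holds components that can influence S_{k+1}.
    for k in range(num_passes - 2, -1, -1):
        schedule[k] = {src for src, targets in reweights.items()
                       if set(targets) & schedule[k + 1]}
    return schedule


# Two components; only "Z" touches the logical observable,
# and each component reweights errors in the other.
print(causal_schedule(2, {"Z"}, {"X": {"Z"}, "Z": {"X"}}))
```

With two passes, only the X component is decoded first and only the Z component second, whereas static scheduling would decode both components in both passes.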

Reweighting and cost update

When an error $A$ (with internal index $i$) is predicted in component $C_A$ during a non-final pass, every reweighting rule associated with it is applied to its target error $B$:

$$\text{cost}(B) \leftarrow -\log\left(\frac{P(B \mid A)}{1 - P(B \mid A)}\right)$$

where the conditional probability $P(B \mid A)$ is capped at $0.499$ so that the updated cost stays strictly positive (the cost is zero at $P = 0.5$ and negative above it). This is implemented via Error::set_with_probability, which converts probability to the log-likelihood-ratio cost used internally by Tesseract:

$$c(e) = -\log\left(\frac{p}{1-p}\right)$$
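A small numerical sketch of this conversion, including the cap, is given below. It mirrors the formula only. It is not the body of Error::set_with_probability, and the function name and the rejection of non-positive probabilities are assumptions of the sketch.

```python
import math

P_CAP = 0.499  # cap on the conditional probability quoted in the text


def llr_cost(p, cap=P_CAP):
    """Cost c = -log(p / (1 - p)) after capping p at `cap`."""
    if p <= 0.0:
        raise ValueError("probability must be positive")
    p = min(p, cap)
    return -math.log(p / (1.0 - p))


print(llr_cost(0.1))  # rare error: large positive cost
print(llr_cost(0.9))  # capped at 0.499: small but still positive cost
```

Without the cap, a conditional probability above one half would produce a negative cost for the target error.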

In-place internal cost resynchronization

An important design goal is that reweighting does not require reloading or reconstructing the DEM. The Tesseract decoder pre-computes several internal data structures at construction time from the DEM, and the reweighting step modifies error costs in place on the already-constructed decoder, then incrementally resynchronizes only the affected structures. This is what makes multi-pass decoding efficient — each component decoder is constructed once and reused across all shots and passes.

Tesseract's internal data structures

The Tesseract decoder maintains three key data structures that depend on error costs:

  1. errors[i].likelihood_cost — The log-likelihood-ratio cost of each error $i$, used directly in the A* search as the edge weight when expanding a candidate error.

  2. error_costs[i] — A cached ErrorCost struct storing both likelihood_cost and min_cost = likelihood_cost / |detectors(i)|. The min_cost field is a normalized cost-per-detector used for early termination in get_detcost.

  3. d2e[d] — For each detector $d$, the list of error indices whose symptom includes $d$, sorted in ascending order of min_cost. This sorted order is critical to the decoder's performance: in get_detcost, the decoder iterates through d2e[d] to find the minimum-cost unblocked error touching detector $d$, and uses the sorted order to break early once the remaining errors cannot possibly improve the current best:

    for (int ei : d2e[d]) {
        ec = error_costs[ei];
        // Early termination: if this error's cost (normalized by its own
        // detector count) already exceeds the current best (normalized by
        // the best's detector count), all subsequent errors are worse.
        if (ec.likelihood_cost * min_det_cost_det_count >=
            min_cost * errors[ei].symptom.detectors.size())
            break;
        // ... check if error is unblocked and update min_cost
    }

    Other structures (eneighbors, edets, the DEM itself) depend only on the graph topology, not on costs, and are therefore unaffected by reweighting.

The update_internal_costs method

When a reweighting rule modifies errors[j].likelihood_cost on a target component's decoder, the cached error_costs[j] and the sort order of every d2e[d] list containing $j$ become stale. The new update_internal_costs method resynchronizes these incrementally:

void TesseractDecoder::update_internal_costs(
        const std::vector<size_t>& modified_error_indices) {
    std::unordered_set<int> affected_detectors;
    for (size_t ei : modified_error_indices) {
        // 1. Recompute the cached ErrorCost for this error
        error_costs[ei] = {
            errors[ei].likelihood_cost,
            errors[ei].likelihood_cost / errors[ei].symptom.detectors.size()
        };
        // 2. Collect all detectors touched by this error
        for (int d : edets[ei]) {
            affected_detectors.insert(d);
        }
    }
    // 3. Re-sort d2e only for affected detectors
    for (int d : affected_detectors) {
        std::sort(d2e[d].begin(), d2e[d].end(),
            [this](size_t a, size_t b) {
                return error_costs[a].min_cost < error_costs[b].min_cost;
            });
    }
}

The key properties of this approach:

  • No DEM reload: The DEM is parsed once at construction. Reweighting operates directly on the in-memory errors vector and the cached cost structures.
  • Incremental update: Only the error_costs entries and d2e lists for detectors that are actually touched by modified errors are recomputed. For a reweighting step that modifies $k$ errors touching $m$ detectors total, the cost is $O(m \cdot L \log L)$ where $L$ is the average length of a d2e list — far cheaper than reconstructing the decoder from scratch.
  • Surgical reset: After the final pass, the original costs are restored from a saved copy (original_costs), and update_internal_costs is called again with the same modified indices to restore the sort order. This ensures the decoder is in a clean state for the next shot without any reconstruction.
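The reweight-then-reset cycle can be modelled on toy data. The sketch below is a Python stand-in for the C++ method, using plain lists in place of the decoder's members and assuming every error touches at least one detector. It shows that re-sorting only the affected d2e lists is enough both to apply a reweighting and to undo it.

```python
def update_internal_costs(likelihood_cost, edets, min_cost, d2e, modified):
    """Refresh cached per-detector costs and re-sort only affected d2e lists."""
    affected = set()
    for ei in modified:
        min_cost[ei] = likelihood_cost[ei] / len(edets[ei])
        affected.update(edets[ei])
    for d in affected:
        d2e[d].sort(key=lambda e: min_cost[e])


# Three errors over three detectors (toy data).
likelihood_cost = [4.0, 2.0, 6.0]
edets = [[0, 1], [1], [1, 2]]  # detectors touched by each error
min_cost = [c / len(ds) for c, ds in zip(likelihood_cost, edets)]
d2e = [sorted((e for e, ds in enumerate(edets) if d in ds),
              key=lambda e: min_cost[e]) for d in range(3)]
original_costs = list(likelihood_cost)

# Reweight error 2, then resynchronise.
likelihood_cost[2] = 1.0
update_internal_costs(likelihood_cost, edets, min_cost, d2e, [2])
after_reweight = [list(v) for v in d2e]

# Reset: restore the saved cost and resynchronise the same indices.
likelihood_cost[2] = original_costs[2]
update_internal_costs(likelihood_cost, edets, min_cost, d2e, [2])
print(after_reweight, d2e)
```

After reweighting, error 2 moves to the front of the list for detector 1; after the reset, every list returns to its initial order. Python's list sort is stable, so tied entries keep their relative order in this model.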

Performance

The primary motivation for multi-pass Tesseract is speed: by decomposing the DEM into smaller independent components, each component decoder operates on a much smaller graph, resulting in a >100× wall-clock speedup compared to running full (single-pass) Tesseract on the monolithic DEM. The goal is to achieve this speedup with only a modest accuracy penalty.

Accuracy validation status

Accuracy validation is not yet complete. For $d = 3$ and $d = 5$ surface codes, two-pass Tesseract matches two-pass PyMatching (the expected baseline for codes where two-pass PyMatching is applicable). However, at $d = 7$ there is a small accuracy regression. This bug will be fixed in a PR stacked on top of this one.

Architecture

New files

| File | Description |
| --- | --- |
| src/multi_pass_tesseract_decoder.{h,cc} | Core MultiPassTesseractDecoder class with static/causal scheduling |
| src/error_correlations.{h,cc} | Correlation extraction: marginal, joint, and conditional probability computation |
| src/dem_decomposition.{h,cc} | DEM decomposition by detector class, error splitting, and DEM merging |
| src/tanner_graph.{h,cc} | Union-Find-based connected component analysis of the Tanner graph |
| src/bern_utils.{h,cc} | Bernoulli probability utilities |
| src/multi_pass_sinter_compat.pybind.h | pybind11 bindings for MultiPassSinterDecoder and MultiPassSinterCompiledDecoder |
| src/py/tesseract_decoder/sinter_decoders.py | Python sinter.Decoder wrapper for multi-pass decoding |

Modifications to existing files / Restructuring

| File | Change |
| --- | --- |
| setup.py | Added to support standard compilation of C++ extensions and pip-installable local builds via Bazel. |
| src/py/ | Restructured the Python package from _tesseract_py_util to tesseract_decoder.utils, introducing __init__.py for cleaner top-level re-exports and modular importing. |
| src/tesseract.{h,cc} | Added update_internal_costs() for incremental cost resynchronization after reweighting; added early return in decode_to_errors for empty syndromes; added TesseractDebugger friend class. |
| src/tesseract.pybind.cc | Renamed the pybind11 module from tesseract_decoder to _core and registered the multi-pass pybind11 bindings. |
| src/BUILD | New build targets for all multi-pass C++ components and tests; updated pybind extension name and dependencies. |
| CMakeLists.txt | CMake build support for new sources and Python packaging output directories. |

Python API

The multi-pass decoder is exposed as a sinter.Decoder via MultiPassSinterDecoder:

import tesseract_decoder

decoder = tesseract_decoder.MultiPassSinterDecoder(
    num_passes=2,
    detector_classifier=lambda idx, coords, tag: 0 if '"basis": "X"' in tag else 1,
    pqlimit=200000,
)

The detector_classifier callable receives (detector_index, coordinates, tag_string) and returns an integer class ID. Detectors with the same class ID are grouped into the same component.
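For anything beyond a one-line lambda, the classifier can be written as a named function. The sketch below assumes, based only on the substring test in the example above, that each detector tag is a JSON object with a "basis" field. The function name and the fallback to class 1 for unparseable tags are choices made for illustration.

```python
import json


def classify_by_basis(detector_index, coords, tag):
    """Return class 0 for X-basis detectors and class 1 for everything else.

    Assumes `tag` is a JSON object such as '{"basis": "X"}'; tags that do
    not parse or lack the field fall through to class 1.
    """
    try:
        basis = json.loads(tag).get("basis")
    except (ValueError, AttributeError, TypeError):
        basis = None
    return 0 if basis == "X" else 1


print(classify_by_basis(0, (1.0, 1.0, 0.0), '{"basis": "X"}'))
print(classify_by_basis(1, (2.0, 1.0, 0.0), '{"basis": "Z"}'))
```

A function with this signature could be passed as detector_classifier in place of the lambda. Parsing the tag rather than matching a substring avoids depending on exact whitespace in the tag string.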

Test coverage

  • src/multi_pass_tesseract_decoder.test.cc: Tests for two-pass correlation benefit, disjoint decoding, causal schedule construction, surface code partitioning, causal scheduling with surface codes, and perfect reset verification.
  • src/dem_decomposition.test.cc: Tests for DEM decomposition, symmetric difference, observable assignment, DEM splitting and merging.
  • src/tanner_graph.test.cc: Tests for Tanner graph connected component analysis.
  • src/error_correlations.test.cc: Tests for correlation extraction pipeline.
  • src/py/multi_pass_bindings_test.py: Python integration tests for MultiPassSinterDecoder.
  • src/py/stub_test.py: Updated stub tests for new public API surface.

Reorganise the Python package layout:
- Rename pybind module from tesseract_decoder to _core
- Move _tesseract_py_util to tesseract_decoder.utils with relative imports
- Add tesseract_decoder/__init__.py with top-level re-exports
- Add sinter_decoders.py with MultiPassSinterDecoder wrapper
- Add setup.py for pip-installable builds via Bazel
- Update stub_test.py for new API surface
- Update CMakeLists.txt and BUILD for new module name

Prepare TesseractDecoder for multi-pass decoding support:
- Add update_internal_costs() for incremental resynchronisation of
  internal cost structures (error_costs, d2e sort order) after
  external modification of error likelihoods
- Add early return in decode_to_errors for empty syndromes
- Add TesseractDebugger friend class for test access to internals
- Reserve error_costs capacity before initial fill
- Fix int/size_t mismatch in flip_detectors_and_block_errors
- Update and simplify tesseract tests

Add foundational libraries for multi-pass decoding:
- bern_utils: Bernoulli probability utilities (log-likelihood
  conversion, probability clamping)
- tanner_graph: Union-Find-based connected component analysis of
  the detector-error Tanner graph
- error_correlations: Correlation extraction pipeline computing
  marginal, joint, and conditional error probabilities from
  first-pass decoding results
- dem_decomposition: DEM decomposition by detector class, error
  splitting across components, observable assignment, and DEM
  merging for multi-component decoding

Add the multi-pass Tesseract decoder, which decomposes a detector
error model into independent components by detector class and
decodes each component separately across multiple passes. Between
passes, first-pass decoding correlations are used to reweight
error probabilities in subsequent components, improving accuracy.

Key components:
- MultiPassTesseractDecoder: core decoder with static and causal
  scheduling across detector classes
- FastTwoPassTesseractDecoder: optimised two-pass specialisation
- multi_pass_sinter_compat.pybind.h: pybind11 bindings exposing
  MultiPassSinterDecoder and MultiPassSinterCompiledDecoder
- Python integration tests for multi-pass bindings
- Theory and architecture documentation

Performance: 10-100x wall-clock speedup over single-pass Tesseract
by decomposing the DEM into smaller independent components.
@oscarhiggott oscarhiggott requested a review from a team as a code owner May 22, 2026 22:13
@oscarhiggott oscarhiggott requested review from noajshu and removed request for a team May 22, 2026 22:13