Add Flat encoding test coverage across all packages #7678
-- | Stable byte encoding tests for flat library container/composite types.
-- Wrapper types (Identity, All, Any, Dual, etc.) only have roundtrip tests
-- since their encoding stability is not critical (they are never on-chain).
testNewEncodings = testGroup "stable byte encodings"
We are not introducing any new encodings, so this should just be called "testEncodings".
Renamed to testEncodingStability. Let me know if you have a better name in mind.
-- | Reproducible encoding generator for Flat test coverage.
-- Run: cabal run flat-encoding-generator
-- Prints expected byte sequences for all Flat test values.
executable flat-encoding-generator
What's the purpose of this, and in what scenario would someone want to use this executable?
Replied by changing the code comment:
-- Golden file generator for Flat encoding stability tests.
--
-- The encoding stability tests compare encoded values against expected
-- byte sequences captured as test fixtures. This tool regenerates those
-- fixtures from source, so they don't remain opaque blobs of bits.
--
-- Example: if a bug in a Flat instance is found and the encoding must
-- change in a backward-incompatible way, the stability tests will fail.
-- Run this tool to see the new wire format, update the test fixtures,
-- and confirm the encoding change was intentional.
--
-- See: https://en.wikipedia.org/wiki/Characterization_test
-- Usage: cabal run flat-encoding-generator
Why is it not sufficient to use the golden test mechanism, where one can use --accept to update the golden files? In this executable you are including a specific set of test cases, which I suppose need to mirror the test cases in the stability tests; if we add a new test case, does it also need to be added to the executable?
We're dealing with binary files here, so the existing golden testing mechanism is IMO not the best fit, i.e. there is no point in diffing binary files when showing an error to a user. OTOH, I see your point that with the --accept functionality it's easier to regenerate individual failing tests.
Could you summarize how the current tests work? In particular, the comment above says:
The encoding stability tests compare encoded values against expected byte sequences captured as test fixtures
Where are the expected byte sequences stored, if not in a golden file?
Sure. The byte sequences were hardcoded inline in the test assertions:
encRaw (Nothing :: Maybe Bool) [0]
encRaw (Just True :: Maybe Bool) [192]
The generator was a separate executable that printed the same encodings to stdout for manual comparison. But you're right that --accept makes it redundant. I've reworked these into golden tests now. Each stability group produces a text file like:
Nothing :: Maybe Bool = [0]
Just True :: Maybe Bool = [192]
Right () :: Either Bool () = [128]
...
One source of truth, readable diffs, --accept for regeneration. Generator is removed.
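To illustrate the golden-file format described above, here is a minimal sketch of how such lines could be rendered. It uses only base. The `Case` type and the `renderCase`, `renderGolden`, and `examples` names are hypothetical and not taken from the PR, and the byte values are simply the three examples quoted above, not output of the real encoder. In the actual tests the resulting text would presumably be passed to tasty-golden's goldenVsStringDiff.

```haskell
import Data.Word (Word8)

-- | One golden case: a human-readable label and the bytes an encoder produced.
-- Hypothetical type, for illustration only.
data Case = Case
  { caseLabel :: String
  , caseBytes :: [Word8]
  }

-- | Render one case as a single line of the golden file,
-- in the "label = [bytes]" shape shown in the comment above.
renderCase :: Case -> String
renderCase (Case label bytes) = label ++ " = " ++ show bytes

-- | Render all cases, one per line, so a changed encoding is a one-line diff.
renderGolden :: [Case] -> String
renderGolden = unlines . map renderCase

-- | The three examples quoted in the conversation (values copied, not computed).
examples :: [Case]
examples =
  [ Case "Nothing :: Maybe Bool" [0]
  , Case "Just True :: Maybe Bool" [192]
  , Case "Right () :: Either Bool ()" [128]
  ]

main :: IO ()
main = putStr (renderGolden examples)
```

Because each case occupies one line of text, an encoding change shows up as a readable diff rather than a binary mismatch.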
b8964e3 to 6cdd889
0aad87c to d3cc240
Pay down test coverage debt for Flat serialization instances. This ensures encoding stability is verified by tests before we attempt to fix the Generic derivation bug in a follow-up PR.
Adds roundtrip and stable byte encoding tests for:
- Flat library types (Maybe, Either, NonEmpty, Complex, Ratio, Set, Tree, Map, Seq, DList, Filler, PreAligned, and monoid/semigroup wrappers)
- TPLC types (Version, Name, Kind, DeBruijn, NamedDeBruijn, SrcSpan, SrcSpans, DefaultFun, DefaultUni, and all newtype wrappers including TyName, Unique, TyDeBruijn, etc.)
- PIR types (Recursivity, Strictness)
- UPLC types (Binder variants, FakeNamedDeBruijn, minimal Program encoding)
- Value types (K encodes as ByteString, Quantity as Integer)
Also adds a standalone encoding generator executable (cabal run flat-encoding-generator) for reproducing expected byte constants.
Rename testNewEncodings to testEncodingStability for clarity. Fix plutus-core ^>=1.59 version bound to ^>=1.60. Remove unnecessary -Wno-* flags and unused text dependency from flat-encoding-generator. Improve executable comment with golden testing context and Wikipedia reference.
d3cc240 to 140f473
The encoding stability tests now use tasty-golden instead of hardcoded inline byte sequences. This addresses review feedback about the generator executable being redundant when --accept can regenerate golden files directly.
- Convert testEncodingStability (flat/test) to goldenVsStringDiff
- Convert test_flatStaticEncoding (TPLC) to goldenVsStringDiff
- Remove flat-encoding-generator executable
- Add tasty-golden dependency to flat-test
Both golden files now use encoding-stability.golden.
Context
Approach
Added two categories of tests for Flat instances:
- Stable byte encoding tests (a.k.a. golden/characterization tests) pin the exact byte representation of on-chain-critical types (Version, DeBruijn, DefaultFun, DefaultUni, Binder variants, UPLC Program, PIR Recursivity/Strictness). These detect any accidental encoding changes that would break on-chain script compatibility.
- Roundtrip property tests verify unflat (flat x) == Right x for all Flat instances including library-internal types (monoid/semigroup wrappers, containers, etc.) where encoding stability is less critical but correctness still matters.
A standalone encoding generator executable (cabal run flat-encoding-generator) is included so anyone can reproduce the expected byte constants without GHCi sessions. It regenerates test fixtures from source so they don't remain opaque blobs of bits.
BLS12_381 types are deliberately skipped (their encode/decode implementations are error/fail).
Changes
Flat library tests (flat/test/Spec.hs)
TPLC tests (plutus-core/test/Flat/Spec.hs, new module)
UPLC tests (untyped-plutus-core/testlib/Flat/Spec.hs)
(program 1.1.0 (con integer 0))
test_flat into the UPLC test entry point (was previously defined but not imported)
PIR tests (plutus-ir/test/PlutusIR/Core/Tests.hs)
Value tests (plutus-core/test/Value/Spec.hs)
Infrastructure
plutus-core/tools/GenerateEncodings.hs)
FOURMOLU_DISABLE to flat/test/Spec.hs (CPP-heavy file that fourmolu cannot parse)
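The two test categories named under Approach can be sketched with a toy codec. This is an assumption-laden illustration, not the flat library: `encodeMB` and `decodeMB` are hypothetical stand-ins for flat and unflat, restricted to Maybe Bool, and were written only to reproduce the raw bytes quoted in the conversation ([0] for Nothing, [192] for Just True). The real tests would use the library's own functions and a property-testing framework rather than an exhaustive list.

```haskell
import Data.Bits (shiftL, testBit, (.|.))
import Data.Word (Word8)

-- | Toy bit layout (assumption): one constructor bit, then the Bool payload.
toBits :: Maybe Bool -> [Bool]
toBits Nothing  = [False]
toBits (Just b) = [True, b]

-- | Pack up to 8 bits into one byte, most significant bit first, zero padded.
packByte :: [Bool] -> Word8
packByte bits = foldl step 0 (zip [7, 6 .. 0] bits)
  where
    step acc (i, b) = if b then acc .|. (1 `shiftL` i) else acc

-- | Hypothetical stand-in for an encoder.
encodeMB :: Maybe Bool -> [Word8]
encodeMB v = [packByte (toBits v)]

-- | Hypothetical stand-in for a decoder; fails on anything but a single byte.
decodeMB :: [Word8] -> Either String (Maybe Bool)
decodeMB [w]
  | not (testBit w 7) = Right Nothing
  | otherwise         = Right (Just (testBit w 6))
decodeMB _ = Left "expected exactly one byte"

-- | Roundtrip property: decoding an encoded value yields the original value.
roundtrips :: Maybe Bool -> Bool
roundtrips x = decodeMB (encodeMB x) == Right x

-- | Stability check: the encoding matches bytes pinned ahead of time.
stable :: Bool
stable = encodeMB Nothing == [0] && encodeMB (Just True) == [192]

main :: IO ()
main = do
  print (all roundtrips [Nothing, Just False, Just True])
  print stable
```

The roundtrip property alone would still pass if the bit layout changed consistently in both directions, which is why the pinned-bytes check is kept as a separate test for on-chain-critical types.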