Skip to content

Conversation

@Unisay
Copy link
Contributor

@Unisay Unisay commented Dec 12, 2025

Context

  • What: Add Array data constructor and associated builtins (ArrayData, UnArrayData) to Plutus Core
  • Why: Enable efficient representation of homogeneous sequential data in smart contracts, improving memory efficiency and performance for array-heavy operations compared to linked lists
  • Issue: Closes #1666

Approach

This PR implements a new Array constructor for the Data type using Vector for efficient array representation. The implementation follows the established pattern for data constructors and builtins in Plutus Core.

Key Technical Decisions:

  1. CBOR Encoding: Uses CBOR tag 128 for Array encoding, following CBOR specification for alternative array representations. This provides a compact encoding distinct from the existing List constructor.

  2. ChooseData Update: Extended ChooseData from 6 to 7 arguments to handle the new Array branch. This is a breaking change for existing code using ChooseData, but maintains backwards compatibility at the protocol level through proper versioning.

  3. Cost Models: Derived new costing functions from benchmark data for ArrayData, UnArrayData, and the updated ChooseData. All three cost model variants (A, B, C) are updated identically to maintain consistency across evaluation contexts.

  4. Formal Verification: Extended Agda metatheory to include Array constructor and builtins, ensuring the implementation maintains formal correspondence between specification and execution semantics.

The implementation touches all layers of the Plutus stack: core data representation, builtin machinery, compiler support, ledger API integration, and formal verification.

Changes

Specification

  • Updated doc/plutus-core-spec/cardano/builtins1.tex with Array type definition, injections/projections, concrete syntax, updated chooseData, and new builtins
  • Updated doc/plutus-core-spec/data-cbor.tex with CBOR encoding/decoding specification for Array (tag 128)

Core Data Type

  • Added Array (Vector Data) constructor to PlutusCore.Data
  • Implemented CBOR encoding/decoding with tag 128
  • Added Data.Vector.Orphans for required instances
  • Updated CBOR stability tests

Builtins

  • Added ArrayData (tag 101) builtin for constructing Array from Vector
  • Added UnArrayData (tag 102) builtin for deconstructing Array to Vector
  • Updated ChooseData to 7-argument form with Array branch
  • Added proper error handling for UnArrayData on non-Array data

Cost Models

  • Added benchmarking infrastructure for array operations
  • Generated cost models from benchmark data for new builtins
  • Updated BuiltinCostModel, CostingFun, and ExBudgetingDefaults
  • Updated all three cost model JSON files (A, B, C) with identical parameters
  • Modified R models for cost model generation

PlutusTx Support

  • Added compiler support for array builtins
  • Updated IsData instances to handle Array constructor
  • Modified AssocMap to work with 7-argument ChooseData
  • Added test coverage for array operations

Ledger API

  • Exposed array builtins in evaluation contexts for all Plutus versions (V1, V2, V3)
  • Added plutus-core dependency to analyse-script-events for array support
  • Updated all evaluation context test suites

Metatheory

  • Extended Agda formalization with Array constructor
  • Updated CEK machine semantics in both Algorithmic and Untyped modules
  • Added array builtins to builtin signatures
  • Regenerated all MAlonzo Haskell code from Agda

PlutusIR

  • Updated rewrite rules to handle 7-argument ChooseData
  • Ensures optimization passes correctly transform array-related patterns

Testing

  • Added conformance tests for ArrayData, UnArrayData, and ChooseData with Array
  • Updated all golden tests affected by the changes (budget files, UPLC/PIR output, type synthesis)
  • Regenerated golden files across plutus-core, plutus-tx-plugin, plutus-benchmark, and cardano-constitution
  • Updated test generators to include array constructor variants

Breaking Changes

ChooseData Signature Change: The ChooseData builtin now requires 7 arguments instead of 6 (added Array branch). Existing scripts using ChooseData will need to be recompiled, but this change is isolated to the builtin signature and does not affect on-chain compatibility due to proper versioning.

Test Plan

  • All conformance tests pass (including new array builtin tests)
  • Golden tests updated and verified
  • Cost model generation validated
  • Metatheory type-checks with Agda
  • Plutus Tx plugin tests pass
  • Ledger API evaluation context tests pass

Add Array as a new Data constructor using Vector for efficient
array representation. This enables compact encoding of homogeneous
data sequences and improves performance for array-heavy smart contracts.

The Array constructor uses CBOR tag 128 for encoding, following
the CBOR specification for alternative array representations.
This provides better memory efficiency compared to linked lists
for large sequential data structures.
Implement ArrayData (tag 101) and UnArrayData (tag 102) builtins
for constructing and deconstructing Array data. These builtins
enable efficient conversion between Vector and Data representation.

Update ChooseData to handle the new Array constructor by adding
a seventh branch parameter, maintaining backwards compatibility
with proper error handling for non-Array data in UnArrayData.
Add costing functions and models for ArrayData, UnArrayData,
and updated ChooseData (now 7-argument). Cost models derived
from benchmark data provide accurate execution cost estimates
for array operations.

Update all three cost model variants (A, B, C) with identical
parameters for consistency across evaluation contexts.
Add PlutusTx compiler support for array builtins and update
IsData instances to handle Array constructor. This enables
Haskell smart contract code to use array data representations
with automatic compilation to Plutus Core.

Update AssocMap to work with the new 7-argument ChooseData.
Update evaluation contexts across all Plutus versions (V1, V2, V3)
to include the new array builtins. This makes ArrayData and
UnArrayData available to on-chain validators and scripts.

Add plutus-core dependency to analyse-script-events for
array support in script analysis tooling.
Extend Agda metatheory with Array constructor and array builtins.
Update CEK machine semantics, builtin signatures, and all
generated Haskell code from Agda to maintain formal correspondence
between specification and implementation.

This ensures the array feature is fully formalized and verified
in the Plutus metatheory.
Add comprehensive conformance tests for ArrayData, UnArrayData,
and updated ChooseData builtin. Tests cover success cases,
error handling, and budget validation to ensure consistent
behavior across Plutus implementations.

Tests include array construction, deconstruction, and proper
error messages when UnArrayData receives non-Array data.
Regenerate all golden test files affected by the array feature:
- Budget changes from ChooseData parameter addition
- UPLC and PIR output with array constructors
- Type synthesis tests for new builtins
- Evaluation trace changes from updated cost models

Golden files updated include test suites, benchmarks, examples,
and stdlib tests across plutus-core, plutus-tx-plugin,
plutus-benchmark, and cardano-constitution packages.
Update PlutusIR rewrite rules to handle the new 7-argument
ChooseData builtin. Ensures optimization passes correctly
transform array-related code patterns.
@Unisay Unisay self-assigned this Dec 12, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Dec 12, 2025

Execution Budget Golden Diff

14c309a (master) vs addb6f8

output

plutus-benchmark/cardano-loans/test/9.6/main.golden.eval

Metric Old New Δ%
CPU 111_100_889 108_714_363 -2.15%
Memory 622_350 643_450 +3.39%
Flat Size 8_666 8_780 +1.32%

plutus-benchmark/linear-vesting/test/9.6/main.golden.eval

Metric Old New Δ%
CPU 30_837_131 30_794_293 -0.14%
Memory 131_619 131_819 +0.15%
Flat Size 2_860 2_862 +0.07%

plutus-benchmark/lists/test/Sum/9.6/left-fold-data.golden.eval

Metric Old New Δ%
CPU 165_524_496 182_615_110 +10.33%
Memory 778_865 799_065 +2.59%
Flat Size 701 707 +0.86%

plutus-benchmark/lists/test/Sum/9.6/right-fold-data.golden.eval

Metric Old New Δ%
CPU 170_324_496 187_415_110 +10.03%
Memory 808_865 829_065 +2.50%
Flat Size 704 710 +0.85%

plutus-benchmark/script-contexts/test/V1/9.6/checkScriptContextEqualityData-20.golden.eval

Metric Old New Δ%
CPU 34_586_709 33_661_145 -2.68%

plutus-benchmark/script-contexts/test/V1/Data/9.6/checkScriptContextEqualityData-20.golden.eval

Metric Old New Δ%
CPU 29_994_709 29_069_145 -3.09%

plutus-benchmark/script-contexts/test/V2/9.6/checkScriptContextEqualityData-20.golden.eval

Metric Old New Δ%
CPU 37_339_261 36_321_913 -2.72%

plutus-benchmark/script-contexts/test/V2/9.6/dataFwdStakeTrick.golden.eval

Metric Old New Δ%
CPU 4_914_248 5_242_990 +6.69%

plutus-benchmark/script-contexts/test/V2/9.6/dataFwdStakeTrickManual.golden.eval

Metric Old New Δ%
CPU 5_058_248 5_386_990 +6.50%

plutus-benchmark/script-contexts/test/V2/Data/9.6/checkScriptContextEqualityData-20.golden.eval

Metric Old New Δ%
CPU 32_395_261 31_377_913 -3.14%

plutus-benchmark/script-contexts/test/V3/9.6/checkScriptContextEqualityData-20.golden.eval

Metric Old New Δ%
CPU 37_926_283 36_890_161 -2.73%

plutus-benchmark/script-contexts/test/V3/Data/9.6/checkScriptContextEqualityData-20.golden.eval

Metric Old New Δ%
CPU 32_886_283 31_850_161 -3.15%

plutus-benchmark/script-contexts/test/V3/Data/9.6/purposeIsWellFormed-4.golden.eval

Metric Old New Δ%
CPU 19_611_399 20_306_431 +3.54%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/currencySymbolValueOf.golden.eval

Metric Old New Δ%
CPU 11_438_876 12_481_424 +9.11%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/geq1.golden.eval

Metric Old New Δ%
CPU 339_163_895 365_227_595 +7.68%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/geq2.golden.eval

Metric Old New Δ%
CPU 356_777_594 383_710_084 +7.55%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/geq3.golden.eval

Metric Old New Δ%
CPU 370_496_887 398_298_167 +7.50%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/geq4.golden.eval

Metric Old New Δ%
CPU 332_716_553 359_301_527 +7.99%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/geq5.golden.eval

Metric Old New Δ%
CPU 351_276_383 378_382_631 +7.72%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/gt1.golden.eval

Metric Old New Δ%
CPU 390_276_300 419_815_160 +7.57%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/gt2.golden.eval

Metric Old New Δ%
CPU 357_145_594 384_078_084 +7.54%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/gt3.golden.eval

Metric Old New Δ%
CPU 422_330_992 453_607_432 +7.41%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/gt4.golden.eval

Metric Old New Δ%
CPU 333_084_553 359_669_527 +7.98%

plutus-ledger-api/test-plugin/Spec/Data/Budget/9.6/gt5.golden.eval

Metric Old New Δ%
CPU 375_739_011 404_409_081 +7.63%

plutus-tx-plugin/test/Budget/9.6/map1.golden.eval

Metric Old New Δ%
CPU 155_773_381 171_932_875 +10.37%

plutus-tx-plugin/test/Budget/9.6/map2.golden.eval

Metric Old New Δ%
CPU 67_859_382 73_419_638 +8.19%

plutus-tx-plugin/test/Budget/9.6/map3.golden.eval

Metric Old New Δ%
CPU 112_163_732 120_851_632 +7.75%

plutus-tx-plugin/test/CallTrace/9.6/successfullEvaluationYieldsNoTraceLog.golden.eval

Metric Old New Δ%
Flat Size 509 343 -32.61%

plutus-tx-plugin/test/DataList/Budget/9.6/elem.golden.eval

Metric Old New Δ%
CPU 10_484_444 11_874_508 +13.26%

This comment will get updated when changes are made.

@Unisay Unisay changed the title feat(core): add Array data constructor and builtins Array data constructor and builtins Dec 12, 2025
Update LaTeX specification documents to document the new Array constructor:

builtins1.tex:
- Add Array (Vector Data) to Data type definition
- Update set-theoretic denotation with Array component
- Add injection (inj_A) and projection (proj_A) functions
- Add is_A to boolean test functions
- Add (Array ...) to concrete syntax
- Update chooseData from 6 to 7 arguments with t_A branch
- Add arrayData and unArrayData builtins to table

data-cbor.tex:
- Add Array to Data type definition
- Add encoder: Array uses CBOR tag 128 + indefinite list
- Add decoder case for tag 128
@Unisay Unisay force-pushed the yura/issue-1666-data-array-core branch from 80756e1 to 470618e Compare December 12, 2025 14:40
Update ChooseData benchmark to account for the new Array data constructor. The benchmark now uses mkApp7 with six result arguments instead of mkApp5 with five, and includes comments clarifying which argument corresponds to which Data constructor case.
Update CPU cost parameters for chooseData, equalsData, and serialiseData builtins to reflect accurate benchmarking with the new Array data constructor. Notable changes include chooseData cost reduction (now handles 7 cases efficiently) and adjusted slopes for data comparison operations.
Update all golden test files to reflect the new budget calculations from the updated cost model parameters. Changes span conformance tests, benchmark tests, and plugin tests across V1, V2, and V3 script contexts.
Reorder Array constructor from position 4 to position 6 in the Data type definition and all pattern matches. This makes Array the final constructor, matching its role as an extension to the original 5-constructor Data type (Constr, Map, List, I, B).
Create new Batch 7 section documenting the unreleased Array constructor for the data type and its associated arrayData/unArrayData functions. Move these functions from Batch 1 to Batch 7 to accurately reflect that they will be released in a future protocol version along with the Array constructor itself. Add explanatory note in Batch 1 clarifying Array's future availability.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants