Refined memory/cpu cost models for ValueData and UnValueData
#7500
Open: Unisay wants to merge 6 commits into master from yura/value-data-memory-models
+1,226 −252
Conversation
Force-pushed from 1bd6254 to 43bbe0a:
Add new memory-analysis executable with modules for analyzing memory behavior of Plutus builtins. Includes plotting utilities, regression analysis, and experiment framework for deriving accurate memory models from empirical measurements.
Introduce DataNodeCount newtype that measures Data memory via lazy node traversal rather than serialization size. This provides more accurate memory accounting for UnValueData builtin which operates on the Data structure directly without serializing. The wrapper separates concerns: node counting logic here, cost coefficients in JSON models.
Add KnownTypeAst and builtin marshalling instances for DataNodeCount. This enables using the new memory model in builtin definitions while maintaining type safety through the universe system. Also includes minor refactoring (void instead of (() <$)) for clarity.
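The commits above describe DataNodeCount as measuring Data memory via a lazy node traversal rather than serialization size. A minimal sketch of the idea, using a simplified stand-in for the Data type (the real newtype wraps PlutusCore's Data and feeds the count into the costing machinery; all names below are illustrative):

```haskell
module Main where

-- Simplified stand-in for PlutusCore's Data type.
data Data
  = I Integer
  | B String
  | List [Data]
  | Map [(Data, Data)]
  | Constr Integer [Data]
  deriving Show

-- Newtype marking that memory is measured by node count,
-- not by serialized size.
newtype DataNodeCount = DataNodeCount Data

-- Count every node in the tree: leaves count 1, containers count
-- 1 plus the counts of their children.
countNodes :: Data -> Integer
countNodes d = case d of
  I _         -> 1
  B _         -> 1
  List ds     -> 1 + sum (map countNodes ds)
  Map kvs     -> 1 + sum [countNodes k + countNodes v | (k, v) <- kvs]
  Constr _ ds -> 1 + sum (map countNodes ds)

main :: IO ()
main = print (countNodes (List [I 1, Map [(B "k", I 2)], Constr 0 []]))
-- 1 (List) + 1 (I 1) + 3 (Map, key, value) + 1 (Constr) = 6
```

Node count tracks the work UnValueData actually does (one visit per node), which serialized size does not.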
Force-pushed from 43bbe0a to 073fe97:
Apply ValueTotalSize to ValueData and DataNodeCount to UnValueData, replacing plain Value/Data types. This enables accurate memory accounting: ValueData uses total serialized size, UnValueData uses node count for measuring input Data complexity.
Update ValueData and UnValueData benchmarks to use createOneTermBuiltinBenchWithWrapper with appropriate memory measurement wrappers (ValueTotalSize and DataNodeCount). This ensures benchmarks measure the same memory behavior as production builtins.
Replace constant memory costs with linear models derived from empirical measurements:

- ValueData: memory = 38×size + 6 (was constant 1)
- UnValueData: memory = 8×nodes + 0 (was constant 1); CPU = 290658×nodes + 1000 (was 43200×arg + 1000)

The linear models better reflect actual memory behavior: ValueData scales with serialized size, UnValueData scales with node count. Benchmark data regenerated with the new memory measurement approach.
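As a sanity check, the linear models above can be written down directly. The function names are illustrative, not part of the plutus-core costing API; only the coefficients come from this PR:

```haskell
module Main where

-- Memory for ValueData scales with the total serialized size of the Value.
valueDataMemory :: Integer -> Integer
valueDataMemory size = 38 * size + 6

-- Memory for UnValueData scales with the node count of the input Data.
unValueDataMemory :: Integer -> Integer
unValueDataMemory nodes = 8 * nodes + 0

-- CPU for UnValueData also scales with the node count.
unValueDataCpu :: Integer -> Integer
unValueDataCpu nodes = 290658 * nodes + 1000

main :: IO ()
main = mapM_ print
  [ valueDataMemory 10   -- 38*10 + 6 = 386
  , unValueDataMemory 5  -- 8*5 = 40
  , unValueDataCpu 5     -- 290658*5 + 1000 = 1454290
  ]
```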
Force-pushed from 073fe97 to 026e835:
@@ -0,0 +1,121 @@
module PlutusBenchmark.RegressionInteger (integerBestFit) where
Contributor
I don't think we should implement our own linear regression algorithm when R is the domain standard for this.
Context
This PR refines the memory/CPU cost models for the ValueData and UnValueData builtins.

Problem Statement

The current cost models for ValueData and UnValueData use constant memory costs of 1 unit, regardless of input size. This is inaccurate because:

- ValueData converts a Value to Data, so memory should scale with the serialized size
- UnValueData converts Data to a Value, so memory should scale with the node count of the Data structure

Inaccurate memory models can lead to budget misestimation in smart contracts.
Solution Approach
This PR implements a comprehensive solution in 6 logical commits:

- Add a memory-analysis executable for empirical measurement
- Introduce the DataNodeCount newtype for node-based memory tracking
- Integrate DataNodeCount into the DefaultUni type system
- Apply the wrappers (ValueTotalSize, DataNodeCount) to the builtins
- Align the benchmarks with the new wrappers
- Update the cost model data with linear models

Memory Measurement Strategy

- ValueTotalSize wrapper (already exists): measures total serialized size
- DataNodeCount wrapper: performs a lazy node traversal of the Data structure

This approach separates concerns: measurement logic lives in ExMemoryUsage.hs, while cost coefficients live in the JSON models (slope applied per node/byte).

Design Decisions
Why node count for UnValueData?
UnValueData converts Data → Value by traversing the Data tree structure, so costing by node count matches the work actually done. A lazy traversal (via CostRose) ensures accurate accounting.

Why separate wrappers?
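The wrapper separation can be sketched with a simplified class: each newtype selects a different memory measure for the same kind of payload. This mirrors the ExMemoryUsage idea in reduced form; every name below is a stand-in, not the library's actual API.

```haskell
module Main where

-- Simplified stand-in for plutus-core's ExMemoryUsage class.
class MemoryUsage a where
  memoryUsage :: a -> Integer

-- Measure by total serialized size
-- (here: character count of a stand-in String payload).
newtype BySerializedSize = BySerializedSize String
instance MemoryUsage BySerializedSize where
  memoryUsage (BySerializedSize s) = fromIntegral (length s)

-- Measure by node count
-- (here: word count as a stand-in for tree nodes).
newtype ByNodeCount = ByNodeCount String
instance MemoryUsage ByNodeCount where
  memoryUsage (ByNodeCount s) = fromIntegral (length (words s))

main :: IO ()
main = do
  print (memoryUsage (BySerializedSize "a b c"))  -- 5 characters
  print (memoryUsage (ByNodeCount "a b c"))       -- 3 words
```

Because the measure is chosen by the argument's type, each builtin's signature picks the appropriate accounting without changing the runtime representation.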
Changes
Memory Analysis Tooling
New executable: plutus-benchmark:memory-analysis

- PlutusBenchmark.MemoryAnalysis: main analysis framework
- PlutusBenchmark.MemoryAnalysis.Experiments: memory behavior experiments
- PlutusBenchmark.MemoryAnalysis.Generators: test data generators
- PlutusBenchmark.Plotting: chart generation utilities
- PlutusBenchmark.RegressionInteger: regression with asymmetric loss

This tooling enabled empirical measurement of memory behavior to derive the coefficients used in the cost models.
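The RegressionInteger module itself is not shown in this excerpt. As a rough illustration of what a regression with asymmetric loss can look like (under-prediction penalized more heavily, since underestimating memory is the dangerous direction for a cost model), here is a brute-force sketch; the weighting factor and the grid search are assumptions, not the module's actual algorithm:

```haskell
module Main where

import Data.List (minimumBy)
import Data.Ord (comparing)

-- Squared-error loss where positive residuals (model predicts too little)
-- are weighted by the factor `under`.
asymmetricLoss :: Double -> [(Double, Double)] -> (Double, Double) -> Double
asymmetricLoss under pts (slope, intercept) =
  sum [ (if r > 0 then under else 1) * r * r
      | (x, y) <- pts
      , let r = y - (slope * x + intercept) ]

-- Brute-force grid search over candidate (slope, intercept) pairs,
-- penalizing under-prediction 10x.
bestFit :: [(Double, Double)] -> (Double, Double)
bestFit pts =
  minimumBy (comparing (asymmetricLoss 10 pts))
    [ (a, b) | a <- [0, 0.5 .. 50], b <- [0 .. 10] ]

main :: IO ()
main = print (bestFit [(1, 5), (2, 7), (3, 9)])  -- points on y = 2x + 3
```

A reviewer below argues against hand-rolling regression at all, so treat this purely as an illustration of the asymmetric-loss idea.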
Core Memory Tracking
plutus-core/src/PlutusCore/Evaluation/Machine/ExMemoryUsage.hs
- DataNodeCount newtype wrapping Data
- ExMemoryUsage instance using countNodesRoseScaled
- New language extensions: AllowAmbiguousTypes, BlockArguments, InstanceSigs, KindSignatures, ScopedTypeVariables

plutus-core/src/PlutusCore/Default/Universe.hs

- KnownTypeAst instance for DataNodeCount
- MakeKnownIn and ReadKnownIn instances for marshalling
- void instead of (() <$) for clarity

Builtin Updates
plutus-core/src/PlutusCore/Default/Builtins.hs
- ValueData: changed signature from Value -> Data to ValueTotalSize -> Data
- UnValueData: changed signature from Data -> BuiltinResult Value to DataNodeCount -> BuiltinResult Value

Benchmark Alignment
plutus-core/cost-model/budgeting-bench/Benchmarks/Values.hs
- valueDataBenchmark: now uses createOneTermBuiltinBenchWithWrapper with ValueTotalSize
- unValueDataBenchmark: now uses createOneTermBuiltinBenchWithWrapper with DataNodeCount

Cost Model Data
plutus-core/cost-model/data/builtinCostModel{A,B,C}.json
Updated memory models (all three variants updated identically):
- ValueData: Memory = 38 × size + 6 (was constant 1)
- UnValueData: Memory = 8 × nodes + 0 (was constant 1)
- UnValueData: CPU = 290658 × nodes + 1000 (updated from 43200 × arg + 1000)
plutus-core/cost-model/data/benching-conway.csv
Regenerated benchmark data (404 lines changed) with new memory measurement approach.
Impact
Budget Changes
Scripts using ValueData or UnValueData will see different memory budget consumption.

Conformance Tests
Expect budget differences in conformance tests that use these builtins. The new costs are more accurate than the previous constant models.
4. Memory Analysis
The memory-analysis executable can reproduce the experiments:
This generates plots and regression analysis in plutus-benchmark/memory-analysis/data/.

5. Conformance Tests

cabal test plutus-conformance

Expect budget differences but correct behavior.
Notes for Reviewers
Commit Structure
The PR is organized as 6 atomic commits following dependency order:
Each commit is buildable and represents a logical unit of change.
The updated CPU models for ValueData and UnValueData can be previewed here.