Skip to content

Fragmentation in SparseBytes caused by replaceAtZ rule #742

@bbyalcinkaya

Description

@bbyalcinkaya

Background

SparseBytes is the data structure used for memory representation. Instead of storing a full byte array, memory is kept as a list of chunks: empty regions (#empty(N)) and concrete byte blocks (#bytes(Bs)). The memory always starts as a single #empty(N) chunk, where N is the size of the initial memory. When writes occur, empty regions are split and #bytes chunks are inserted in their place. This structure keeps large mostly-empty memories efficient, but relies on chunks remaining as coarse-grained as possible.

SparseBytes was introduced as an optimization for the MultiverseX semantics and Kasmer, where contracts often had very large memories with huge untouched regions. This representation significantly reduced configuration size and made symbolic execution feasible on those workloads.

Description

The following replaceAtZ rule in the SparseBytes implementation introduces unnecessary fragmentation.

///////// len(Bs) = N
rule replaceAtZ(N, REST, S, Bs)
=> SBChunk(#bytes(Bs)) REST
requires 0 ==Int S
andBool lengthBytes(Bs) ==Int N

When writing to the end of an #empty chunk, the rule always creates a new #bytes chunk, even if the next chunk in REST is already a #bytes chunk. This leads to an explosion of adjacent small chunks under workloads with many small writes (e.g., copying an array from host to memory). This rule blindly inserts a new SBChunk(#bytes(Bs)) in front of REST. If REST begins with another #bytes chunk, the two byte blocks should logically be adjacent, but the rule keeps them separate. Over time this produces significant fragmentation in the SparseBytes list.

Proposed fix

Before creating a new chunk, check whether REST begins with a #bytes chunk. If it does, merge Bs with that chunk instead of prepending a new one. This keeps consecutive byte regions contiguous and prevents unnecessary list growth.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions