Skip to content

Disallow empty root struct#348

Merged
tigrannajaryan merged 1 commit intomainfrom
tigran/disallow-empty-structs
Feb 4, 2026
Merged

Disallow empty root struct#348
tigrannajaryan merged 1 commit intomainfrom
tigran/disallow-empty-structs

Conversation

@tigrannajaryan
Copy link
Collaborator

Resolves #341

Emptry root struct cannot work since decoding it is a noop and never reaches the end of column, thus potentially allowing a very large number of such empty records to be "decoded" by Reader without making any progress. This makes protecting from invalid inputs hard.

We now disallow this, since there is no plausible use case for a schema like this.

We still allow empty non-root structs, which is a valid way to have a schema that can evolve by adding fields to empty struct. It does not suffer from the same problem since Reader will have to make some progress for non-empty root struct and will eventually reach end of column for root struct.

We also disallow this in stefc schema parser.

@github-actions
Copy link

github-actions bot commented Jan 26, 2026

Benchmark Result

Benchmark diff with base branch
goos: linux
goarch: amd64
pkg: github.com/splunk/stef/benchmarks
cpu: AMD EPYC 7763 64-Core Processor                
                                                 │ bench-main.txt │           bench-new.txt            │
                                                 │     sec/op     │    sec/op     vs base              │
SerializeNative/STEF/serialize-4                     8.902m ± 18%   9.696m ±  7%       ~ (p=0.180 n=6)
SerializeNative/STEFU/serialize-4                    35.61m ±  1%   35.06m ±  3%  -1.53% (p=0.004 n=6)
DeserializeNative/STEF/deser-4                       2.483m ±  2%   2.434m ±  1%  -1.97% (p=0.009 n=6)
DeserializeNative/STEFU/deser-4                      7.277m ±  0%   7.201m ±  1%  -1.04% (p=0.002 n=6)
SerializeFromPdata/STEF/serialize-4                  137.2m ±  2%   138.4m ±  2%       ~ (p=0.240 n=6)
SerializeFromPdata/STEFU/serialize-4                 35.54m ±  2%   34.36m ±  1%  -3.31% (p=0.002 n=6)
DeserializeToPdata/STEF/deserialize-4                47.11m ±  3%   45.78m ±  0%  -2.80% (p=0.009 n=6)
DeserializeToPdata/STEFU/deserialize-4               63.41m ±  2%   63.71m ±  1%       ~ (p=0.818 n=6)
STEFReaderRead-4                                     2.523m ±  2%   2.518m ±  1%       ~ (p=0.310 n=6)
STEFSerializeMultipart/astronomy-otelmetrics-4        3.308 ± 23%    3.274 ± 26%       ~ (p=0.818 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4     71.16m ± 15%   72.99m ±  7%       ~ (p=0.699 n=6)
ReadSTEF-4                                           2.613m ±  2%   2.555m ±  5%       ~ (p=0.065 n=6)
ReadSTEFZ-4                                          3.330m ±  1%   3.175m ±  1%  -4.66% (p=0.002 n=6)
ReadSTEFZWriteSTEF-4                                 7.649m ±  2%   7.180m ±  1%  -6.13% (p=0.002 n=6)
geomean                                              21.02m         20.83m        -0.92%

                                                 │ bench-main.txt │           bench-new.txt            │
                                                 │   sec/point    │  sec/point    vs base              │
SerializeNative/STEF/serialize-4                     133.1n ± 18%   145.0n ±  7%       ~ (p=0.180 n=6)
SerializeNative/STEFU/serialize-4                    532.6n ±  1%   524.5n ±  3%  -1.52% (p=0.004 n=6)
DeserializeNative/STEF/deser-4                       37.13n ±  2%   36.41n ±  1%  -1.97% (p=0.009 n=6)
DeserializeNative/STEFU/deser-4                      108.9n ±  1%   107.7n ±  1%  -1.01% (p=0.002 n=6)
SerializeFromPdata/STEF/serialize-4                  2.052µ ±  2%   2.070µ ±  2%       ~ (p=0.240 n=6)
SerializeFromPdata/STEFU/serialize-4                 531.6n ±  2%   514.0n ±  1%  -3.31% (p=0.002 n=6)
DeserializeToPdata/STEF/deserialize-4                704.7n ±  3%   684.9n ±  0%  -2.81% (p=0.009 n=6)
DeserializeToPdata/STEFU/deserialize-4               948.6n ±  2%   953.0n ±  1%       ~ (p=0.818 n=6)
STEFReaderRead-4                                     37.73n ±  2%   37.67n ±  1%       ~ (p=0.327 n=6)
STEFSerializeMultipart/astronomy-otelmetrics-4       4.205µ ± 23%   4.161µ ± 26%       ~ (p=0.818 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4     90.44n ± 15%   92.78n ±  7%       ~ (p=0.699 n=6)
ReadSTEF-4                                           39.10n ±  2%   38.24n ±  5%       ~ (p=0.065 n=6)
ReadSTEFZ-4                                          49.84n ±  1%   47.52n ±  1%  -4.65% (p=0.002 n=6)
ReadSTEFZWriteSTEF-4                                 114.5n ±  2%   107.4n ±  1%  -6.12% (p=0.002 n=6)
geomean                                              221.1n         219.1n        -0.91%

                                                 │ bench-main.txt │            bench-new.txt             │
                                                 │      B/op      │     B/op      vs base                │
SerializeNative/STEF/serialize-4                     3.334Mi ± 0%   3.335Mi ± 0%       ~ (p=0.394 n=6)
SerializeNative/STEFU/serialize-4                    7.557Mi ± 0%   7.557Mi ± 0%       ~ (p=0.364 n=6)
DeserializeNative/STEF/deser-4                       951.4Ki ± 0%   951.4Ki ± 0%       ~ (p=1.000 n=6) ¹
DeserializeNative/STEFU/deser-4                      1.715Mi ± 0%   1.715Mi ± 0%       ~ (p=1.000 n=6)
SerializeFromPdata/STEF/serialize-4                  76.55Mi ± 0%   76.56Mi ± 0%       ~ (p=0.169 n=6)
SerializeFromPdata/STEFU/serialize-4                 7.557Mi ± 0%   7.557Mi ± 0%       ~ (p=0.723 n=6)
DeserializeToPdata/STEF/deserialize-4                31.99Mi ± 0%   31.99Mi ± 0%       ~ (p=0.983 n=6)
DeserializeToPdata/STEFU/deserialize-4               38.88Mi ± 0%   38.88Mi ± 0%  -0.00% (p=0.013 n=6)
STEFReaderRead-4                                     953.1Ki ± 0%   953.1Ki ± 0%       ~ (p=1.000 n=6)
STEFSerializeMultipart/astronomy-otelmetrics-4       3.384Gi ± 0%   3.384Gi ± 0%       ~ (p=0.937 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4     20.30Mi ± 0%   20.30Mi ± 0%       ~ (p=0.353 n=6)
ReadSTEF-4                                           953.2Ki ± 0%   953.2Ki ± 0%       ~ (p=0.370 n=6)
ReadSTEFZ-4                                          10.29Mi ± 0%   10.29Mi ± 0%  +0.00% (p=0.026 n=6)
ReadSTEFZWriteSTEF-4                                 13.44Mi ± 0%   13.44Mi ± 0%       ~ (p=1.000 n=6)
geomean                                              10.66Mi        10.66Mi       +0.00%
¹ all samples are equal

                                                 │ bench-main.txt │            bench-new.txt            │
                                                 │   allocs/op    │  allocs/op   vs base                │
SerializeNative/STEF/serialize-4                      2.646k ± 1%   2.647k ± 0%       ~ (p=0.448 n=6)
SerializeNative/STEFU/serialize-4                      884.5 ± 0%    885.0 ± 0%       ~ (p=1.000 n=6)
DeserializeNative/STEF/deser-4                         463.0 ± 0%    463.0 ± 0%       ~ (p=1.000 n=6) ¹
DeserializeNative/STEFU/deser-4                        496.0 ± 0%    496.0 ± 0%       ~ (p=1.000 n=6) ¹
SerializeFromPdata/STEF/serialize-4                   134.7k ± 0%   134.7k ± 0%       ~ (p=0.134 n=6)
SerializeFromPdata/STEFU/serialize-4                   886.0 ± 0%    886.0 ± 0%       ~ (p=1.000 n=6)
DeserializeToPdata/STEF/deserialize-4                 756.2k ± 0%   756.2k ± 0%       ~ (p=1.000 n=6) ¹
DeserializeToPdata/STEFU/deserialize-4                944.9k ± 0%   944.9k ± 0%       ~ (p=1.000 n=6) ¹
STEFReaderRead-4                                       463.0 ± 0%    463.0 ± 0%       ~ (p=1.000 n=6) ¹
STEFSerializeMultipart/astronomy-otelmetrics-4        13.15M ± 0%   13.15M ± 0%       ~ (p=0.589 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4      1.956k ± 0%   1.956k ± 0%       ~ (p=1.000 n=6)
ReadSTEF-4                                             464.0 ± 0%    464.0 ± 0%       ~ (p=1.000 n=6) ¹
ReadSTEFZ-4                                            502.0 ± 0%    501.5 ± 0%       ~ (p=0.182 n=6)
ReadSTEFZWriteSTEF-4                                  1.231k ± 0%   1.231k ± 0%       ~ (p=1.000 n=6)
geomean                                               6.304k        6.304k       +0.00%
¹ all samples are equal
Benchmark result
benchstat bench-new.txt
goos: linux
goarch: amd64
pkg: github.com/splunk/stef/benchmarks
cpu: AMD EPYC 7763 64-Core Processor                
                                                 │ bench-new.txt │
                                                 │    sec/op     │
SerializeNative/STEF/serialize-4                    9.696m ±  7%
SerializeNative/STEFU/serialize-4                   35.06m ±  3%
DeserializeNative/STEF/deser-4                      2.434m ±  1%
DeserializeNative/STEFU/deser-4                     7.201m ±  1%
SerializeFromPdata/STEF/serialize-4                 138.4m ±  2%
SerializeFromPdata/STEFU/serialize-4                34.36m ±  1%
DeserializeToPdata/STEF/deserialize-4               45.78m ±  0%
DeserializeToPdata/STEFU/deserialize-4              63.71m ±  1%
STEFReaderRead-4                                    2.518m ±  1%
STEFSerializeMultipart/astronomy-otelmetrics-4       3.274 ± 26%
STEFDeserializeMultipart/astronomy-otelmetrics-4    72.99m ±  7%
ReadSTEF-4                                          2.555m ±  5%
ReadSTEFZ-4                                         3.175m ±  1%
ReadSTEFZWriteSTEF-4                                7.180m ±  1%
geomean                                             20.83m

                                                 │ bench-new.txt │
                                                 │   sec/point   │
SerializeNative/STEF/serialize-4                    145.0n ±  7%
SerializeNative/STEFU/serialize-4                   524.5n ±  3%
DeserializeNative/STEF/deser-4                      36.41n ±  1%
DeserializeNative/STEFU/deser-4                     107.7n ±  1%
SerializeFromPdata/STEF/serialize-4                 2.070µ ±  2%
SerializeFromPdata/STEFU/serialize-4                514.0n ±  1%
DeserializeToPdata/STEF/deserialize-4               684.9n ±  0%
DeserializeToPdata/STEFU/deserialize-4              953.0n ±  1%
STEFReaderRead-4                                    37.67n ±  1%
STEFSerializeMultipart/astronomy-otelmetrics-4      4.161µ ± 26%
STEFDeserializeMultipart/astronomy-otelmetrics-4    92.78n ±  7%
ReadSTEF-4                                          38.24n ±  5%
ReadSTEFZ-4                                         47.52n ±  1%
ReadSTEFZWriteSTEF-4                                107.4n ±  1%
geomean                                             219.1n

                                                 │ bench-new.txt │
                                                 │     B/op      │
SerializeNative/STEF/serialize-4                    3.335Mi ± 0%
SerializeNative/STEFU/serialize-4                   7.557Mi ± 0%
DeserializeNative/STEF/deser-4                      951.4Ki ± 0%
DeserializeNative/STEFU/deser-4                     1.715Mi ± 0%
SerializeFromPdata/STEF/serialize-4                 76.56Mi ± 0%
SerializeFromPdata/STEFU/serialize-4                7.557Mi ± 0%
DeserializeToPdata/STEF/deserialize-4               31.99Mi ± 0%
DeserializeToPdata/STEFU/deserialize-4              38.88Mi ± 0%
STEFReaderRead-4                                    953.1Ki ± 0%
STEFSerializeMultipart/astronomy-otelmetrics-4      3.384Gi ± 0%
STEFDeserializeMultipart/astronomy-otelmetrics-4    20.30Mi ± 0%
ReadSTEF-4                                          953.2Ki ± 0%
ReadSTEFZ-4                                         10.29Mi ± 0%
ReadSTEFZWriteSTEF-4                                13.44Mi ± 0%
geomean                                             10.66Mi

                                                 │ bench-new.txt │
                                                 │   allocs/op   │
SerializeNative/STEF/serialize-4                     2.647k ± 0%
SerializeNative/STEFU/serialize-4                     885.0 ± 0%
DeserializeNative/STEF/deser-4                        463.0 ± 0%
DeserializeNative/STEFU/deser-4                       496.0 ± 0%
SerializeFromPdata/STEF/serialize-4                  134.7k ± 0%
SerializeFromPdata/STEFU/serialize-4                  886.0 ± 0%
DeserializeToPdata/STEF/deserialize-4                756.2k ± 0%
DeserializeToPdata/STEFU/deserialize-4               944.9k ± 0%
STEFReaderRead-4                                      463.0 ± 0%
STEFSerializeMultipart/astronomy-otelmetrics-4       13.15M ± 0%
STEFDeserializeMultipart/astronomy-otelmetrics-4     1.956k ± 0%
ReadSTEF-4                                            464.0 ± 0%
ReadSTEFZ-4                                           501.5 ± 0%
ReadSTEFZWriteSTEF-4                                 1.231k ± 0%
geomean                                              6.304k

@tigrannajaryan tigrannajaryan marked this pull request as ready for review January 26, 2026 20:11
@tigrannajaryan
Copy link
Collaborator Author

@abhisheksplunk this is stacked on top of #340
Please review that one, that needs to be merged first.

@tigrannajaryan tigrannajaryan force-pushed the tigran/limit-field-count branch from 0603547 to 4a0ed09 Compare February 4, 2026 20:32
Base automatically changed from tigran/limit-field-count to main February 4, 2026 20:40
@tigrannajaryan tigrannajaryan force-pushed the tigran/disallow-empty-structs branch from 842c4f1 to 74ddef6 Compare February 4, 2026 20:41
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a security and correctness issue by disallowing empty root structs in the STEF schema. Empty root structs cause decoders to enter a no-progress state where they can decode an unbounded number of empty records without consuming input data, making it difficult to protect against malformed inputs.

Changes:

  • Added compile-time validation in the schema parser to reject empty root structs with a clear error message
  • Added runtime validation in generated decoders to return an error when attempting to decode data with zero fields in the root struct
  • Updated schema shrinking utilities to preserve at least one field in root structs during testing

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
go/pkg/idl/parser.go Adds parser validation to reject empty root struct definitions
go/pkg/idl/parser_test.go Adds test case for empty root struct validation and updates existing tests to use non-empty root structs
stefc/templates/go/struct.go.tmpl Adds runtime check in generated decoder Init method to reject empty root structs
go/pkg/errors.go Defines new error constant ErrEmptyRootStructDisallowed
go/otel/otelstef/spans.go Generated code with runtime validation for Spans decoder
go/otel/otelstef/metrics.go Generated code with runtime validation for Metrics decoder
examples/profile/internal/profile/sample.go Generated code with runtime validation for Sample decoder
examples/jsonl/internal/jsonstef/record.go Generated code with runtime validation for Record decoder
examples/ints/internal/ints/record.go Generated code with runtime validation for Record decoder
go/pkg/schema/utils.go Updates schema shrinking logic to prevent removing the last field from root structs
go/otel/otelstef/testdata/fuzz/FuzzSpansReader/3377d71d4f33a55f Adds fuzz test corpus that triggers the empty root struct error

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Resolves #341

Emptry root struct cannot work since decoding it is a noop
and never reaches the end of column, thus potentially allowing
a very large number of such empty records to be "decoded" by
Reader without making any progress. This makes protecting from
invalid inputs hard.

We now disallow this, since there is no plausible use case
for a schema like this.

We still allow empty non-root structs, which is a valid way to
have a schema that can evolve by adding fields to empty struct.
It does not suffer from the same problem since Reader will have
to make some progress for non-empty root struct and will eventually
reach end of column for root struct.

We also disallow this in stefc schema parser.
@tigrannajaryan tigrannajaryan force-pushed the tigran/disallow-empty-structs branch from 74ddef6 to 556f6e3 Compare February 4, 2026 20:59
@tigrannajaryan tigrannajaryan merged commit 46ef217 into main Feb 4, 2026
9 checks passed
@tigrannajaryan tigrannajaryan deleted the tigran/disallow-empty-structs branch February 4, 2026 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Disallow empty root struct

2 participants