
Data: Add TCK tests for Metadata Columns in BaseFormatModelTests#15675

Open
Guosmilesmile wants to merge 11 commits into apache:main from Guosmilesmile:tck_metadata

Conversation

@Guosmilesmile
Contributor

This PR adds TCK tests for metadata column reading in BaseFormatModelTests.

Metadata Columns:

  • FILE_PATH
  • SPEC_ID
  • ROW_POSITION
  • IS_DELETED
  • Lineage
    • ROW_ID
    • LAST_UPDATED_SEQUENCE_NUMBER
  • PARTITION_COLUMN
    • Transformations
    • Partition evolution (adding and removing columns)

Part of #15415

@github-actions github-actions bot added the data label Mar 18, 2026
@Guosmilesmile Guosmilesmile force-pushed the tck_metadata branch 4 times, most recently from 12a56cf to eca8c4d Compare March 20, 2026 16:46
@Guosmilesmile Guosmilesmile marked this pull request as draft March 21, 2026 15:21
@Guosmilesmile Guosmilesmile marked this pull request as ready for review March 22, 2026 12:28
Comment thread data/src/test/java/org/apache/iceberg/data/BaseFormatModelTests.java Outdated
  new String[] {FEATURE_FILTER, FEATURE_CASE_SENSITIVE, FEATURE_SPLIT},
  FileFormat.ORC,
- new String[] {FEATURE_REUSE_CONTAINERS});
+ new String[] {FEATURE_REUSE_CONTAINERS, FEATURE_META_ROW_LINEAGE});
Contributor

How hard would it be to implement this?

Contributor Author

I think it should work. I'll give it a try in the next PR.

Contributor Author

Hi Peter, the corresponding PR has been submitted. #15776

Comment thread data/src/test/java/org/apache/iceberg/data/BaseFormatModelTests.java Outdated
Comment thread data/src/test/java/org/apache/iceberg/data/BaseFormatModelTests.java Outdated
Comment thread data/src/test/java/org/apache/iceberg/data/BaseFormatModelTests.java Outdated
Comment thread data/src/test/java/org/apache/iceberg/data/BaseFormatModelTests.java Outdated

@ParameterizedTest
@FieldSource("FILE_FORMATS")
void testReadMetadataColumnPartitionEvolutionAddColumn(FileFormat fileFormat) throws IOException {
Contributor

Could we have a test with addColumnWithDefaultReadValue?

Contributor Author

Added testReaderSchemaEvolutionNewColumnWithDefault, and found that ORC doesn't support it:

if (field.initialDefault() != null) {
  throw new UnsupportedOperationException(
      String.format(
          "ORC cannot read default value for field %s (%s): %s",
          root.findColumnName(fieldId), type, field.initialDefault()));
}
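The gating this exchange implies — ORC cannot read default values, so the default-value test must be skipped for that format — can be sketched as a format-to-feature lookup. The names below (`FEATURE_READER_DEFAULT`, `supports`, the feature sets per format) are illustrative stand-ins, not the actual BaseFormatModelTests API; which formats advertise which features is an assumption for the sketch.

```java
import java.util.Map;
import java.util.Set;

// Minimal sketch of feature gating for format-specific tests: each file
// format advertises a feature set, and a test for an optional capability
// (e.g. reader-side default values, which ORC rejects with
// UnsupportedOperationException) is skipped for formats that lack it.
public class FeatureGateSketch {

  static final String FEATURE_READER_DEFAULT = "reader-default";

  // Assumed feature sets for illustration only.
  static final Map<String, Set<String>> FORMAT_FEATURES =
      Map.of(
          "parquet", Set.of(FEATURE_READER_DEFAULT),
          "avro", Set.of(FEATURE_READER_DEFAULT),
          "orc", Set.of());

  static boolean supports(String format, String feature) {
    return FORMAT_FEATURES.getOrDefault(format, Set.of()).contains(feature);
  }

  public static void main(String[] args) {
    for (String format : FORMAT_FEATURES.keySet()) {
      if (!supports(format, FEATURE_READER_DEFAULT)) {
        System.out.println(format + ": skipped (no " + FEATURE_READER_DEFAULT + ")");
      } else {
        System.out.println(format + ": runs default-value test");
      }
    }
  }
}
```

A JUnit assumption (as in the `assumeSupports` call quoted later in this thread) would play the role of the skip branch, aborting the test rather than failing it.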

Contributor

Either create PR for it and we can start working on that, or create an issue and link it here.

@FieldSource("FILE_FORMATS")
void testReaderSchemaEvolutionNewColumnWithDefault(FileFormat fileFormat) throws IOException {

assumeSupports(fileFormat, FEATURE_READER_DEFAULT);
Contributor

Do we have a PR which adds this to the ORC reader?

Contributor Author

@Guosmilesmile Guosmilesmile Apr 16, 2026

Currently none. That PR should be about supporting bloodlines in Orcs.

Contributor

supporting bloodlines in Orcs

This is a funny translation 😅

Contributor Author

Lineage.. My bad.

Comment thread data/src/test/java/org/apache/iceberg/data/BaseFormatModelTests.java Outdated
Comment thread data/src/test/java/org/apache/iceberg/data/BaseFormatModelTests.java Outdated
@Guosmilesmile
Contributor Author

})
.toList();

readAndAssertGenericRecords(fileFormat, evolvedSchema, expectedGenericRecords);
Contributor

Maybe we could save on the conversion as well. What if we just do this:

readAndAssertGenericRecords(fileFormat, evolvedSchema, record -> {
  Record expected = GenericRecord.create(evolvedSchema);
  for (Types.NestedField col : writeSchema.columns()) {
    expected.setField(col.name(), record.getField(col.name()));
  }

  expected.setField("col_f", defaultStringValue);
  expected.setField("col_g", defaultIntValue);
  return expected;
});

Contributor Author

We also need to pass genericRecords into the method; otherwise there's nowhere to call this conversion function. Is the difference not that significant, or did I miss something?

Contributor

You are right, we need to pass the genericRecords too.
The diff is not that significant here, but it is more pronounced where the conversion is simpler.
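The per-record-converter pattern being discussed — passing the written records plus a conversion function instead of precomputing the full expected list — can be sketched in plain Java. `readAndAssert` and the `Map<String, Object>` stand-in for Iceberg's `Record` are hypothetical simplifications for illustration, not the actual helper's signature.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Sketch of the suggested overload: the helper derives each expected
// record from the corresponding written record via a caller-supplied
// converter, instead of receiving a precomputed expected list.
public class ConverterPatternSketch {

  static void readAndAssert(
      List<Map<String, Object>> written,
      List<Map<String, Object>> actual,
      Function<Map<String, Object>, Map<String, Object>> toExpected) {
    if (written.size() != actual.size()) {
      throw new AssertionError("row count mismatch");
    }
    for (int i = 0; i < written.size(); i++) {
      Map<String, Object> expected = toExpected.apply(written.get(i));
      if (!expected.equals(actual.get(i))) {
        throw new AssertionError("row " + i + ": " + expected + " != " + actual.get(i));
      }
    }
  }

  public static void main(String[] args) {
    List<Map<String, Object>> written = List.of(Map.of("col_a", 1), Map.of("col_a", 2));
    // Simulated read result after evolution added col_f with a default value.
    List<Map<String, Object>> read =
        List.of(Map.of("col_a", 1, "col_f", "dflt"), Map.of("col_a", 2, "col_f", "dflt"));

    readAndAssert(written, read, record -> {
      Map<String, Object> expected = new HashMap<>(record); // copy written columns
      expected.put("col_f", "dflt");                        // add the defaulted column
      return expected;
    });
    System.out.println("ok");
  }
}
```

The saving the reviewer describes is that each test only states the delta (here, the defaulted column) rather than rebuilding every expected record up front.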

Contributor Author

Added a method to do it now.

Comment on lines +708 to +716
List<Record> expected =
IntStream.range(0, genericRecords.size())
.mapToObj(
i ->
GenericRecord.create(projectionSchema)
.copy(MetadataColumns.FILE_PATH.name(), filePath))
.toList();

readAndAssertMetadataColumn(fileFormat, projectionSchema, idToConstant, expected);
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same pattern as readAndAssertGenericRecords

Contributor Author

Got it. Overloaded readAndAssertMetadataColumn to keep the same pattern.
