[SPARK-55622][SQL][TESTS] Add test for DSV2 Tables with multi-part names on SessionCatalog #54411
szehon-ho wants to merge 4 commits into apache:master
Conversation
cc @cloud-fan @manuzhang @pan3793 do you think this will help? Thanks
```scala
sql(s"INSERT INTO $t1 VALUES (3, 'third')")

// Query the metadata table using multi-part identifier
val snapshots = sql(s"SELECT * FROM default.$t1.snapshots")
```
do you also want to test

```sql
SELECT * FROM $t1.snapshots;
SELECT * FROM spark_catalog.default.$t1.snapshots;
```

?
```scala
  StructField("snapshot_id", LongType, nullable = false)
))

override def capabilities(): util.Set[TableCapability] = {
```
a couple of questions that might go beyond this topic:
- there seems to be no "metadata table" concept in Spark so far; do we need a new `TableCapability` to signal that a table supports it?
- how should the permission check be done for metadata tables? Supposing they're read-only, does that mean users have read permission on metadata tables as long as they are granted read access on the base table?
Yea, as I guess you know, Iceberg implements the metadata table behind loadTable, matching a tableIdentifier of the form $table.$metadata_table_name.

It'd be nice for Spark to support metadata tables officially in DSV2 one day; then it would be clearer in the Spark code base.

Until then, this test mocks the current Iceberg behavior, because Spark is often not aware of this unexpected behavior in Iceberg and breaks it. So it's a regression test for the #54247 case.

By the way, what do you mean by read-only? (I don't think there's such a DSV2 concept yet?)
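The Iceberg-style resolution described above can be sketched in plain Scala. Everything here (`Ident`, `loadBaseTable`, the `metadataNames` set) is a hypothetical stand-in, not the real Spark or Iceberg API; it only illustrates the "try the identifier as-is, fall back to metadata-table resolution" shape:

```scala
// Minimal stand-ins for catalog identifiers and tables (NOT Spark's classes).
case class Ident(namespace: Seq[String], name: String)

sealed trait Table
case class BaseTable(ident: Ident) extends Table
case class MetadataTable(base: Ident, metaName: String) extends Table

class NoSuchTableException(msg: String) extends Exception(msg)

object CatalogSketch {
  // Hypothetical registry of real tables and of known metadata-table names.
  private val existing = Set(Ident(Seq("db"), "tbl"))
  private val metadataNames = Set("snapshots", "history")

  private def loadBaseTable(ident: Ident): Table =
    if (existing.contains(ident)) BaseTable(ident)
    else throw new NoSuchTableException(ident.toString)

  // First try the identifier as-is; only on NoSuchTableException treat the
  // last namespace part as the base table and the name as a metadata table,
  // mimicking how an identifier like default.tbl.snapshots is resolved.
  def loadTable(ident: Ident): Table =
    try loadBaseTable(ident)
    catch {
      case _: NoSuchTableException
          if metadataNames.contains(ident.name) && ident.namespace.nonEmpty =>
        val base = Ident(ident.namespace.dropRight(1), ident.namespace.last)
        loadBaseTable(base) match {
          case BaseTable(b) => MetadataTable(b, ident.name)
          case other => other
        }
    }
}
```

With this shape, a real table named `snapshots` would still win over the metadata-table interpretation, because the plain lookup is attempted first.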
There was a problem hiding this comment.
btw, I also changed the test name / comments and JIRA title to reflect that it's more about supporting DSV2 loadTable with complex table names; the metadata table is just an example.
> By the way, what do you mean read only? (I dont think there's such DSV2 concept yet?)

Just ignore it; I might be getting bogged down in the implementation details of Iceberg. It's an invalid question for a general multi-part namespace table name from the Spark DSv2 perspective.
```scala
override def loadTable(ident: Identifier): Table = {
  // Check for metadata table pattern: namespace = [db, tableName], name = "snapshots"
  // This simulates how Iceberg handles metadata tables like db.table.snapshots
  if (ident.name() == "snapshots" && ident.namespace().length >= 1) {
```
`identToUse` should be used instead of `ident`; otherwise, `customIdentifierResolution` does not take effect.

I checked the Iceberg code base: it first tries to load the ident as a table, and falls back to resolving the metadata table only on NoSuchTableException, which is more reasonable.
done. In Iceberg, it catches the Iceberg NoSuchTableException.

One small difference: our test V2SessionCatalog is not as detailed, so I ended up catching AnalysisException; otherwise we would have to implement all the wrapper exceptions.
```scala
    verify(sql(s"SELECT * FROM $t1.snapshots"), "table.snapshots")
    verify(sql(s"SELECT * FROM default.$t1.snapshots"), "default.table.snapshots")
    verify(
      sql(s"SELECT * FROM spark_catalog.default.$t1.snapshots"),
      "spark_catalog.default.table.snapshots")
  }
```
```scala
Seq("$t1.snapshots", "default.$t1.snapshots", "spark_catalog.default.$t1.snapshots")
  .foreach { tblName =>
    verify ...
  }
```
```scala
try {
  super.loadTable(identToUse)
} catch {
  case _: AnalysisException if identToUse.name().toLowerCase(Locale.ROOT) == "snapshots" =>
```
the API says it should throw NoSuchTableException, but it actually throws AnalysisException?
So in this code path, we get an AnalysisException like:

```
[REQUIRES_SINGLE_PART_NAMESPACE] spark_catalog requires a single-part namespace, but got identifier `default`.`metadata_test_tbl`.`snapshots`. SQLSTATE: 42K05
org.apache.spark.sql.AnalysisException: [REQUIRES_SINGLE_PART_NAMESPACE] spark_catalog requires a single-part namespace, but got identifier `default`.`metadata_test_tbl`.`snapshots`. SQLSTATE: 42K05
  at org.apache.spark.sql.errors.QueryCompilationErrors$.requiresSinglePartNamespaceError(QueryCompilationErrors.scala:1550)
  at org.apache.spark.sql.connector.catalog.CatalogV2Implicits$IdentifierHelper.asTableIdentifier(CatalogV2Implicits.scala:171)
  at org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog.loadTable(V2SessionCatalog.scala:91)
  at org.apache.spark.sql.connector.catalog.DelegatingCatalogExtension.loadTable(DelegatingCatalogExtension.java:73)
  at org.apache.spark.sql.connector.InMemoryTableSessionCatalog.org$apache$spark$sql$connector$TestV2SessionCatalogBase$$super$loadTable(DataSourceV2DataFrameSessionCatalogSuite.scala:103)
  at org.apache.spark.sql.connector.TestV2SessionCatalogBase.loadTable(TestV2SessionCatalogBase.scala:69)
  at org.apache.spark.sql.connector.TestV2SessionCatalogBase.loadTable$(TestV2SessionCatalogBase.scala:62)
  at org.apache.spark.sql.connector.InMemoryTableSessionCatalog.loadTable(DataSourceV2DataFrameSessionCatalogSuite.scala:120)
  at org.apache.spark.sql.connector.catalog.CatalogV2Util$.getTable(CatalogV2Util.scala:483)
  at org.apache.spark.sql.connector.catalog.CatalogV2Util$.loadTable(CatalogV2Util.scala:458)
  at org.apache.spark.sql.catalyst.analysis.RelationResolution.$anonfun$resolveRelation$4(RelationResolution.scala:131)
  at scala.Option.orElse(Option.scala:477)
```

Probably we could change the TestV2SessionCatalog (or V2SessionCatalog itself) to catch it and return the right error, but it may be too much.
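One hedged sketch of what "catch it and return the right error" could look like, in plain Scala. The exception classes and `delegateLoad` below are stand-ins, not the real Spark types (in Spark, `NoSuchTableException` actually extends `AnalysisException` via its error hierarchy); the point is only the wrapping pattern:

```scala
// Stand-in exception types; NOT the real Spark classes.
class AnalysisException(msg: String) extends Exception(msg)
class NoSuchTableException(msg: String) extends AnalysisException(msg)

// Stand-in delegate: rejects multi-part namespaces with a generic error,
// mimicking the REQUIRES_SINGLE_PART_NAMESPACE failure shown above.
def delegateLoad(parts: Seq[String]): String =
  if (parts.length == 2) parts.mkString(".")
  else throw new AnalysisException(
    s"REQUIRES_SINGLE_PART_NAMESPACE: ${parts.mkString(".")}")

// Wrapper that converts the generic AnalysisException into the
// NoSuchTableException that the loadTable API contract asks for.
def loadTable(parts: Seq[String]): String =
  try delegateLoad(parts)
  catch {
    case e: AnalysisException if !e.isInstanceOf[NoSuchTableException] =>
      throw new NoSuchTableException(e.getMessage)
  }
```

The guard keeps an already-correct `NoSuchTableException` from being double-wrapped; whether doing this in V2SessionCatalog itself is worth the churn is exactly the open question in the comment above.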
### What changes were proposed in this pull request?

Add a unit test for Iceberg's case of supporting multi-part identifiers in SessionCatalog (for metadata tables). Add a fake metadata table to InMemoryDataSource.

### Why are the changes needed?

It increases Spark test coverage to catch issues like #54247.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Ran the added test

### Was this patch authored or co-authored using generative AI tooling?

Yes, Cursor (Claude 4.5 Opus)