
Conversation

@sreekanth-db (Collaborator)

Description

Fixed an IndexOutOfBoundsException that occurred when executing DDL statements (e.g., CREATE DATABASE) over the Thrift protocol. The bug manifests when the number of Thrift column descriptors does not match the number of Arrow schema fields.

Root Cause

When executing DDL statements, the Databricks server behavior is:

  • Thrift Protocol: Returns column descriptors including a "Result" status column (1 column)
  • Arrow Schema: Returns an empty schema with 0 fields (no actual data)
  • The Bug: The code attempted to access arrowMetadata.get(0) without checking whether the list was empty

This mismatch caused an IndexOutOfBoundsException when the driver tried to read Arrow metadata at index 0 of an empty list.
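
For illustration, a minimal, self-contained sketch of the failure mode (the class name and hard-coded values are hypothetical, not the driver's actual code): a Thrift column count of 1 paired with an empty Arrow metadata list, read with only a null check.

import java.util.Collections;
import java.util.List;

public class MetadataMismatchDemo {
  public static void main(String[] args) {
    // Thrift reports one column ("Result"), but the Arrow schema has no fields,
    // so the per-column Arrow metadata list is empty.
    int thriftColumnCount = 1;
    List<String> arrowMetadata = Collections.emptyList();

    for (int columnIndex = 0; columnIndex < thriftColumnCount; columnIndex++) {
      // Pre-fix behavior: null check only, no bounds check.
      String columnArrowMetadata =
          arrowMetadata != null ? arrowMetadata.get(columnIndex) : null; // throws IndexOutOfBoundsException
      System.out.println(columnArrowMetadata);
    }
  }
}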

Debug Evidence

TColumnDesc (Thrift):

Column[0]:
  name: Result
  type: STRING_TYPE
  position: 1
  Full TColumnDesc: TColumnDesc(columnName:Result, typeDesc:TTypeDesc(...), position:1, comment:)

Arrow Schema:

Arrow schema bytes length: 72
Deserialized Arrow schema, field count: 0  ← Empty!
Arrow metadata list: size=0

Changes Made

Added bounds checking in the two locations where Arrow metadata is accessed:

  1. ArrowUtil.java:247 - Used by StreamingInlineArrowResult
  2. DatabricksResultSetMetaData.java:195 - Used for result set metadata construction

Before:

String columnArrowMetadata =
    arrowMetadata != null ? arrowMetadata.get(columnIndex) : null;

After:

String columnArrowMetadata =
    arrowMetadata != null && columnIndex < arrowMetadata.size()
        ? arrowMetadata.get(columnIndex)
        : null;

Testing

Manual Testing

Test Case: Execute CREATE DATABASE statement

String sqlQuery = "CREATE DATABASE IF NOT EXISTS hive_metastore.test_db";
boolean hasResultSet = stmt.execute(sqlQuery);

Before Fix: IndexOutOfBoundsException: Index 0 out of bounds for length 0
After Fix: Executes successfully, returns hasResultSet=false
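
For reference, a fuller, hedged version of this manual check; the JDBC URL, http path, and token below are placeholders, not values used in this PR:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class DdlManualCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder connection details; substitute a real Databricks JDBC URL and credentials.
    String url =
        "jdbc:databricks://<host>:443/default;transportMode=http;ssl=1;httpPath=<http-path>";
    try (Connection conn = DriverManager.getConnection(url, "token", "<personal-access-token>");
        Statement stmt = conn.createStatement()) {
      String sqlQuery = "CREATE DATABASE IF NOT EXISTS hive_metastore.test_db";
      boolean hasResultSet = stmt.execute(sqlQuery);
      // With the fix, execution completes and no result set is reported.
      System.out.println("hasResultSet=" + hasResultSet); // expected: false
    }
  }
}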

Additional Notes to the Reviewer

NO_CHANGELOG=true

…ents

When executing DDL statements (like CREATE DATABASE), the Thrift protocol
returns column descriptors but the Arrow schema is empty. This caused
IndexOutOfBoundsException when accessing arrowMetadata list without bounds
checking.

Changes:
- Add bounds check in DatabricksResultSetMetaData.java (line 195)
- Add bounds check in ArrowUtil.java (line 247)

Both locations now verify columnIndex < arrowMetadata.size() before accessing
the list to handle cases where Thrift column count != Arrow schema field count.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@vikrantpuppala (Collaborator) left a comment

thanks for the fix, can we add ddl commands into any of the repo tests so that such failures are caught earlier?
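
(For illustration, a hedged sketch of what such a regression test could look like; the test class, JUnit 5 usage, and the getTestConnection() helper are hypothetical, not existing repo fixtures.)

import static org.junit.jupiter.api.Assertions.assertFalse;

import java.sql.Connection;
import java.sql.Statement;
import org.junit.jupiter.api.Test;

public class DdlStatementTest {

  @Test
  void createDatabaseReturnsNoResultSet() throws Exception {
    try (Connection conn = getTestConnection();
        Statement stmt = conn.createStatement()) {
      boolean hasResultSet =
          stmt.execute("CREATE DATABASE IF NOT EXISTS hive_metastore.test_db");
      assertFalse(hasResultSet, "DDL statements should not produce a result set");
      stmt.execute("DROP DATABASE IF EXISTS hive_metastore.test_db");
    }
  }

  // Placeholder: in practice this would reuse the repo's existing test connection setup.
  private Connection getTestConnection() {
    throw new UnsupportedOperationException("wire up to the test connection helper");
  }
}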

@sreekanth-db sreekanth-db merged commit 1893a40 into databricks:main Jan 23, 2026
12 of 13 checks passed