fix(connectors): replace overloaded InvalidRecord with distinct error variants #3194

Open
atharvalade wants to merge 5 commits into apache:master from atharvalade:fix/distinct-iceberg-sink-error-variants
@atharvalade (Contributor) commented Apr 29, 2026

Which issue does this PR close?

Closes #3176

Rationale

Error::InvalidRecord was used for five unrelated failure modes in the Iceberg sink's write_data function, making it impossible for callers to distinguish schema mismatches from I/O failures from catalog outages.

What changed?

The Iceberg sink mapped Arrow schema conversion errors, Parquet write failures, and Iceberg catalog transaction failures all to Error::InvalidRecord. Callers could not programmatically decide whether to fix a table definition, skip a corrupt message, or retry a catalog outage.

Three new SDK error variants — SchemaMismatch(String), WriteFailure(String), CatalogError(String) — replace the overloaded InvalidRecord at the appropriate call sites. InvalidRecord is preserved only for the genuine record-batch deserialization error.
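
To illustrate what the distinct variants make possible for callers, here is a minimal, std-only sketch. Only the variant names, their `String` payloads, and the `"Write failure: {0}"` message come from this PR. The unit shape of `InvalidRecord`, the other display strings, the `Action` enum, and `action_for` are assumptions made for illustration, and the real SDK enum has more variants and derives its `Display` impl through `#[error(...)]` attributes rather than by hand.

```rust
use std::fmt;

/// Trimmed-down stand-in for the connector SDK error type.
/// Variant names and payloads follow the PR description; everything else
/// in this block is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Error {
    /// Kept only for genuine record-batch deserialization errors.
    InvalidRecord,
    SchemaMismatch(String),
    WriteFailure(String),
    CatalogError(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Message text assumed, not taken from the SDK.
            Error::InvalidRecord => write!(f, "Invalid record"),
            // Message text assumed, not taken from the SDK.
            Error::SchemaMismatch(msg) => write!(f, "Schema mismatch: {msg}"),
            // Matches the #[error("Write failure: {0}")] attribute in the diff.
            Error::WriteFailure(msg) => write!(f, "Write failure: {msg}"),
            // Message text assumed, not taken from the SDK.
            Error::CatalogError(msg) => write!(f, "Catalog error: {msg}"),
        }
    }
}

impl std::error::Error for Error {}

/// Hypothetical caller-side decision; not part of the SDK.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    FixTableDefinition,
    SkipMessage,
    Retry,
    Investigate,
}

/// Maps each failure mode to the reaction named in the PR description.
/// The `WriteFailure` mapping is a guess; the PR does not prescribe one.
pub fn action_for(err: &Error) -> Action {
    match err {
        Error::SchemaMismatch(_) => Action::FixTableDefinition,
        Error::InvalidRecord => Action::SkipMessage,
        Error::CatalogError(_) => Action::Retry,
        Error::WriteFailure(_) => Action::Investigate,
    }
}
```

Because the `match` has no wildcard arm, adding another variant later forces every such caller to decide how to handle it.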

Local Execution

  • Passed
  • Pre-commit hooks ran

AI Usage

  • Opus 4.6
  • Writing comments, writing PR Description
  • Yes

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.49%. Comparing base (93227c1) to head (e3def27).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...re/connectors/sinks/iceberg_sink/src/router/mod.rs | 0.00% | 5 Missing ⚠️ |

Additional details and impacted files
@@              Coverage Diff              @@
##             master    #3194       +/-   ##
=============================================
- Coverage     74.10%   58.49%   -15.62%     
  Complexity      943      943               
=============================================
  Files          1164     1163        -1     
  Lines        103048    92923    -10125     
  Branches      80081    69957    -10124     
=============================================
- Hits          76368    54351    -22017     
- Misses        23995    35971    +11976     
+ Partials       2685     2601       -84     

| Components | Coverage Δ |
|---|---|
| Rust Core | 54.49% <0.00%> (-20.83%) ⬇️ |
| Java SDK | 60.14% <ø> (ø) |
| C# SDK | 69.38% <ø> (-0.06%) ⬇️ |
| Python SDK | 81.43% <ø> (ø) |
| Node SDK | 91.40% <ø> (-0.13%) ⬇️ |
| Go SDK | 39.60% <ø> (ø) |

| Files with missing lines | Coverage Δ |
|---|---|
| core/connectors/sdk/src/lib.rs | 56.17% <ø> (ø) |
| ...re/connectors/sinks/iceberg_sink/src/router/mod.rs | 39.23% <0.00%> (ø) |

... and 248 files with indirect coverage changes


Comment thread core/connectors/sdk/src/lib.rs Outdated
#[error("Write failure: {0}")]
WriteFailure(String),
/// A catalog or transaction-level failure (e.g. applying or committing an
/// Iceberg transaction). Callers may retry on transient catalog outages.

The doc comment's claim that "callers may retry on transient catalog outages" is misleading. action.apply() at router/mod.rs:213 is in-memory transaction prep, so its failures are deterministic (invalid partition spec, schema validation) and retrying cannot fix them. Only tx.commit(catalog) at router/mod.rs:222 hits the network. Suggest dropping the retry claim, or splitting into ApplyError (deterministic) vs CommitError (transient-eligible).
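
A rough sketch of the split suggested here, using only the standard library. The variant names `ApplyError` and `CommitError` come from this comment. The `String` payloads, the `is_retryable` method, and the `with_retry` helper are hypothetical and only show how a caller could tell the two cases apart.

```rust
/// Hypothetical split of the catalog error, as proposed in review.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Error {
    /// In-memory transaction prep failed; retrying yields the same result.
    ApplyError(String),
    /// Committing against the catalog failed; may be a transient outage.
    CommitError(String),
}

impl Error {
    /// Only commit failures are candidates for a retry.
    pub fn is_retryable(&self) -> bool {
        matches!(self, Error::CommitError(_))
    }
}

/// Runs `op` up to `max_attempts` times, retrying only retryable errors.
/// Backoff between attempts is left out to keep the sketch short.
pub fn with_retry<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, Error>,
) -> Result<T, Error> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retry only while attempts remain and the error is transient-eligible.
            Err(e) if e.is_retryable() && attempt < max_attempts => attempt += 1,
            // Deterministic error or attempts exhausted: give up.
            Err(e) => return Err(e),
        }
    }
}
```

With this shape an `ApplyError` surfaces on the first attempt, while a `CommitError` is retried until the attempt budget runs out.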

Development

Successfully merging this pull request may close these issues.

Error::InvalidRecord is overloaded across unrelated failure modes

3 participants