feat: implement native S3 write support by kazantsev-maksim · Pull Request #4547 · apache/datafusion-comet

kazantsev-maksim · 2026-05-31T15:33:06Z

Which issue does this PR close?

Part of: #1625

Rationale for this change

Currently, when Comet executes ETL queries that read from Parquet, perform a transformation, and then write back to Parquet (or S3-backed Parquet), a columnar-to-row conversion is required before the write step, because the write path falls back to the JVM Spark writer. This conversion adds unnecessary overhead and negates the performance benefits of native execution.

What changes are included in this PR?

native/core/src/execution/operators/parquet_writer.rs: Extends the native Parquet writer to support writing to S3-compatible object storage. Adds S3 object store registration and wires it into the DataFusion execution context, so that output paths with the s3:// or s3a:// scheme are handled natively via the object_store crate.
spark/src/main/scala/org/apache/comet/serde/operator/CometDataWritingCommand.scala: Updates the Scala-side CometDataWritingCommand to detect S3/S3A output paths and route them through the new native write path instead of delegating to the JVM Spark writer. It also passes the necessary S3 credentials and configuration from the Hadoop/Spark config to the native layer.
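
For readers who want a concrete picture of the two steps described above (scheme detection and forwarding of credentials), here is a minimal, standard-library-only sketch. It is not the code from this PR: `WriteTarget`, `classify_output_path`, and `s3_options_from_hadoop` are hypothetical names, and the native-side option keys (`access_key_id` and so on) are illustrative assumptions. The `fs.s3a.*` keys are the standard Hadoop S3A property names. Scheme matching is case-sensitive here, which is a simplification.

```rust
use std::collections::HashMap;

/// Where a write should be routed, judged only by the output path's scheme.
/// Hypothetical type for illustration; not taken from the PR.
#[derive(Debug, PartialEq)]
enum WriteTarget {
    /// An s3:// or s3a:// path, split into bucket and object key prefix.
    S3 { bucket: String, prefix: String },
    /// Anything else (local paths, hdfs://, ...), left to the existing write path.
    Other,
}

/// Classify an output path by its scheme (assumed helper, case-sensitive match).
fn classify_output_path(path: &str) -> WriteTarget {
    let rest = path
        .strip_prefix("s3://")
        .or_else(|| path.strip_prefix("s3a://"));
    match rest {
        Some(rest) => {
            // Everything up to the first '/' is the bucket; the remainder is the key prefix.
            let (bucket, prefix) = rest.split_once('/').unwrap_or((rest, ""));
            if bucket.is_empty() {
                WriteTarget::Other
            } else {
                WriteTarget::S3 {
                    bucket: bucket.to_string(),
                    prefix: prefix.to_string(),
                }
            }
        }
        None => WriteTarget::Other,
    }
}

/// Copy the S3A settings that are present in a Hadoop-style config map into a
/// flat option map for the native layer. The right-hand key names are assumptions.
fn s3_options_from_hadoop(conf: &HashMap<String, String>) -> HashMap<String, String> {
    const MAPPING: [(&str, &str); 4] = [
        ("fs.s3a.access.key", "access_key_id"),
        ("fs.s3a.secret.key", "secret_access_key"),
        ("fs.s3a.session.token", "session_token"),
        ("fs.s3a.endpoint", "endpoint"),
    ];
    MAPPING
        .iter()
        .filter_map(|(hadoop_key, native_key)| {
            conf.get(*hadoop_key)
                .map(|value| (native_key.to_string(), value.clone()))
        })
        .collect()
}
```

In this sketch, a caller would route the write natively only when `classify_output_path` returns `WriteTarget::S3`, and would pass the map from `s3_options_from_hadoop` along with the plan. Unrelated config entries are dropped rather than forwarded, which keeps secrets other than the S3 ones out of the native layer.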

How are these changes tested?

Local testing.

kazantsev-maksim · 2026-06-07T14:04:12Z

@andygrove @mbutrovich @comphead Could you please give feedback on these changes – do they make sense to you?

mbutrovich · 2026-06-08T13:07:10Z

Are there any limitations from doing it with object_store instead of opendal? I'd like to stop using both in Comet, eventually.

comphead

Thanks @kazantsev-maksim. I think we need to proceed with #3209 first to make sure the writer works properly with the Spark tests. I'll prioritize it this week!

Kazantsev Maksim and others added 30 commits December 14, 2025 16:24

impl map_from_entries

768b3e9

Revert "impl map_from_entries"

c68c342

This reverts commit 768b3e9.

Merge branch 'apache:main' into main

d887555

Merge branch 'apache:main' into main

231aa90

Merge branch 'apache:main' into main

9500bbb

Merge branch 'apache:main' into main

9577481

Merge branch 'apache:main' into main

3791557

Merge branch 'apache:main' into main

7c2f082

Merge branch 'apache:main' into main

609a605

Merge branch 'apache:main' into main

a151b2c

Merge branch 'apache:main' into main

ad3e7f5

Merge branch 'apache:main' into main

ea92e4b

Merge branch 'apache:main' into main

8dfeca3

Merge branch 'apache:main' into main

559741e

Merge branch 'apache:main' into main

ebda14e

Merge branch 'apache:main' into main

408152e

Merge branch 'apache:main' into main

d7857b2

Merge branch 'apache:main' into main

aef41be

Merge branch 'apache:main' into main

5ac1c58

Merge branch 'apache:main' into main

9ae8e23

Merge branch 'apache:main' into main

5ca3888

Merge branch 'apache:main' into main

160a817

Merge branch 'apache:main' into main

88fc313

Merge branch 'apache:main' into main

e14c180

Merge branch 'apache:main' into main

610a885

Merge branch 'apache:main' into main

f8acb2c

Merge branch 'apache:main' into main

ec94897

Merge branch 'apache:main' into main

43405e4

Merge branch 'apache:main' into main

47b4915

Merge branch 'apache:main' into main

26e2682

kazantsev-maksim and others added 20 commits April 8, 2026 19:59

Merge branch 'apache:main' into main

561a664

Merge branch 'apache:main' into main

d926ef4

Merge branch 'apache:main' into main

671412c

Merge branch 'apache:main' into main

c9f52d1

Merge branch 'apache:main' into main

67f72d9

Merge branch 'apache:main' into main

314e594

Merge branch 'apache:main' into main

ac8292f

WIP

9da0edb

work

f2bce23

Merge branch 'apache:main' into main

c9c140e

Merge branch 'apache:main' into main

decca58

Merge branch 'apache:main' into main

0919b33

Merge branch 'apache:main' into main

7495e21

Merge branch 'apache:main' into main

0a37a60

Merge branch 'apache:main' into main

abbba84

Merge branch 'apache:main' into main

6020560

Merge remote-tracking branch 'origin/main' into support_native_s3_write

838dcb9

# Conflicts: # native/Cargo.lock # native/core/Cargo.toml

Merge branch 'apache:main' into main

e2bdfb1

Complete native s3 write draft feature

2b2acb1

Merge remote-tracking branch 'origin/main' into support_native_s3_write

f47426b

kazantsev-maksim marked this pull request as draft May 31, 2026 15:33

kazantsev-maksim changed the title ~~feat: Native s3 write support~~ feat: implement native S3 write support May 31, 2026

kazantsev-maksim and others added 4 commits June 2, 2026 15:00

Merge branch 'main' into support_native_s3_write

abe1a8e

refactoring

981a1f5

fmt

b487f91

refactoring

0d074e3

kazantsev-maksim marked this pull request as ready for review June 7, 2026 14:03

comphead reviewed Jun 8, 2026

kazantsev-maksim wants to merge 58 commits into apache:main from kazantsev-maksim:support_native_s3_write

3 participants
