[spark] Add basic streaming read support for sparksql with latest mode#2548

Merged
wuchong merged 18 commits into apache:main from Yohahaha:spark-streaming-read
Feb 8, 2026

Conversation

Contributor

@Yohahaha Yohahaha commented Feb 2, 2026

Purpose

Linked issue: close #2557

Brief change log

Add FlussMicroBatchStream to support micro-batch streaming execution. This PR only includes support for the latest startup mode and leaves several TODOs for the other startup modes.

Tests

org.apache.fluss.spark.SparkStreamingTest

API and Format

Documentation

Yohahaha force-pushed the spark-streaming-read branch from e4583e0 to 8bd5e16 on February 3, 2026 07:14
Yohahaha changed the title from "[WIP][spark] Add streaming read for sparksql" to "[spark] Add streaming read for sparksql" on Feb 3, 2026
Contributor Author

Yohahaha commented Feb 3, 2026

This PR is based on #2532 for the startup mode.

Yohahaha changed the title from "[spark] Add streaming read for sparksql" to "[spark] Add basic streaming read support for sparksql with latest mode" on Feb 3, 2026
Yohahaha force-pushed the spark-streaming-read branch 2 times, most recently from 9ec4e13 to a0f9f1b, on February 4, 2026 04:09
Member

wuchong commented Feb 5, 2026

@Yohahaha #2532 is merged. Could you rebase and resolve the failed CI?

Yohahaha force-pushed the spark-streaming-read branch from a0f9f1b to 2020a98 on February 5, 2026 06:14
}

private def getOrCreateInitialPartitionOffsets(): TableBucketOffsets = {
// TODO load from checkpoint dir
Contributor

Fluss doesn't have to load offsets from the checkpoint path; Spark does that instead.
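
The division of labor described here can be sketched with a toy model. This is a hedged illustration, not the Fluss or Spark source: `BucketOffsets`, `ToySource`, `ToyEngine`, and `LatestModeSource` are hypothetical names, and the engine class only stands in for the role Spark plays. The point it shows is that the engine owns the checkpoint, so the source's `initialOffset()` is only consulted for a brand-new query, and in latest mode it simply returns the current log end.

```scala
// Toy model only: every name here is hypothetical and exists for illustration.
final case class BucketOffsets(offsets: Map[Int, Long])

trait ToySource {
  // Consulted only when the engine has no checkpointed offsets.
  def initialOffset(): BucketOffsets
  // Current end of the log for every bucket.
  def latestOffset(): BucketOffsets
}

// Stand-in for the streaming engine: it owns the checkpoint, the source does not.
final class ToyEngine(source: ToySource, private var checkpoint: Option[BucketOffsets]) {
  // Runs one micro-batch and returns its (start, end) offset range.
  def runBatch(): (BucketOffsets, BucketOffsets) = {
    // getOrElse takes its argument by name, so initialOffset() is skipped on restart.
    val start = checkpoint.getOrElse(source.initialOffset())
    val end = source.latestOffset()
    checkpoint = Some(end) // the engine records progress, not the source
    (start, end)
  }
}

// "latest" startup mode: a new query begins at the current log end.
final class LatestModeSource(logEnd: () => Map[Int, Long]) extends ToySource {
  var initialOffsetCalls = 0
  def initialOffset(): BucketOffsets = {
    initialOffsetCalls += 1
    BucketOffsets(logEnd())
  }
  def latestOffset(): BucketOffsets = BucketOffsets(logEnd())
}

object CheckpointDemo {
  def main(args: Array[String]): Unit = {
    var end = Map(0 -> 10L, 1 -> 7L)
    val source = new LatestModeSource(() => end)

    // New query: no checkpoint, so the first batch starts at the log end.
    val fresh = new ToyEngine(source, None)
    println(fresh.runBatch())

    // New data arrives; the next batch starts where the previous one ended.
    end = Map(0 -> 15L, 1 -> 7L)
    println(fresh.runBatch())
  }
}
```

With this split, the source needs no checkpoint-loading code of its own: a restarted query is handed its start offsets by the engine.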

Contributor Author

done, thank you!

@YannByron
Contributor

+1. Let's adjust the streaming test cases in a follow-up PR, after we have discussed them further offline.

wuchong merged commit 5d7630a into apache:main on Feb 8, 2026
6 checks passed
Yohahaha deleted the spark-streaming-read branch on February 10, 2026 05:54
Development

Successfully merging this pull request may close these issues.

[spark] Support basic streaming read with latest startup mode

3 participants