Commit d6bb913

sumedhsakdeo and claude committed
fix: validate ArrivalOrder params and clarify ordering docs
- Add __post_init__ to ArrivalOrder raising ValueError if concurrent_streams < 1 or max_buffered_batches < 1. Previously max_buffered_batches=0 would silently create an unbounded queue.
- Split the ArrivalOrder row in the ordering semantics table to clarify that interleaving only occurs with concurrent_streams > 1; concurrent_streams=1 reads files sequentially.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

1 parent a882dd2 commit d6bb913
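
The unbounded-queue behavior described in the commit message matches how Python's standard `queue.Queue` treats `maxsize=0`. The sketch below assumes the batch buffer is a `queue.Queue` sized by `max_buffered_batches`. The diff does not show the buffer itself, so that is an assumption, and the variable names are illustrative only.

```python
import queue

# maxsize <= 0 means "no limit" for queue.Queue, so a buffer built with
# max_buffered_batches=0 would never apply backpressure to producers.
unbounded = queue.Queue(maxsize=0)
for i in range(1000):
    unbounded.put_nowait(i)  # never raises queue.Full

# A positive maxsize is enforced: the second put is rejected.
bounded = queue.Queue(maxsize=1)
bounded.put_nowait("batch-0")
try:
    bounded.put_nowait("batch-1")
    overflowed = False
except queue.Full:
    overflowed = True
```

Rejecting values below 1 up front turns this silent fallback into an explicit error at construction time.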

2 files changed: +8 -1 lines changed

mkdocs/docs/api.md

Lines changed: 2 additions & 1 deletion
@@ -378,7 +378,8 @@ for buf in tbl.scan().to_arrow_batch_reader(order=ArrivalOrder(concurrent_stream
 | Configuration | File ordering | Within-file ordering |
 |---|---|---|
 | `TaskOrder()` (default) | Batches grouped by file, in task submission order | Row order |
-| `ArrivalOrder()` | Interleaved across files (no grouping guarantee) | Row order within each file |
+| `ArrivalOrder(concurrent_streams=1)` | Sequential, one file at a time | Row order |
+| `ArrivalOrder(concurrent_streams>1)` | Interleaved across files (no grouping guarantee) | Row order within each file |

 The `limit` parameter is enforced correctly regardless of configuration.
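
The two `ArrivalOrder` rows can be illustrated with a small simulation. This is a sketch under stated assumptions, not pyiceberg's scan implementation: it assumes a thread pool with `concurrent_streams` workers feeding a bounded queue, and `simulate_arrival_order` and the file names are hypothetical.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

_DONE = object()  # sentinel marking the end of the stream


def simulate_arrival_order(files, concurrent_streams, max_buffered_batches=16):
    """Yield (file_name, batch) pairs in the order workers produce them."""
    buf = queue.Queue(maxsize=max_buffered_batches)

    def read_file(name, batches):
        # Each file is read by a single worker, so its batches stay in order.
        for batch in batches:
            buf.put((name, batch))

    def produce():
        # Leaving the `with` block waits for every submitted file to finish.
        with ThreadPoolExecutor(max_workers=concurrent_streams) as pool:
            for name, batches in files.items():
                pool.submit(read_file, name, batches)
        buf.put(_DONE)

    threading.Thread(target=produce, daemon=True).start()
    while (item := buf.get()) is not _DONE:
        yield item


files = {"a.parquet": [0, 1, 2], "b.parquet": [0, 1, 2], "c.parquet": [0, 1, 2]}
sequential = list(simulate_arrival_order(files, concurrent_streams=1))
interleaved = list(simulate_arrival_order(files, concurrent_streams=3))
```

With one worker the files are drained one at a time in submission order. With three workers the position of each file's batches in the output can vary between runs, but the batches of any single file still appear in row order.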

pyiceberg/table/__init__.py

Lines changed: 6 additions & 0 deletions
@@ -193,6 +193,12 @@ class ArrivalOrder(ScanOrder):
     batch_size: int | None = None
     max_buffered_batches: int = 16

+    def __post_init__(self) -> None:
+        if self.concurrent_streams < 1:
+            raise ValueError(f"concurrent_streams must be >= 1, got {self.concurrent_streams}")
+        if self.max_buffered_batches < 1:
+            raise ValueError(f"max_buffered_batches must be >= 1, got {self.max_buffered_batches}")
+

 @dataclass()
 class UpsertResult:
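
The added validation can be tried outside pyiceberg with a stand-in dataclass. In this sketch, `ArrivalOrderSketch` and `try_build` are hypothetical names, and the `concurrent_streams` default is chosen for illustration because the diff does not show it. Only the `__post_init__` body mirrors the change above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArrivalOrderSketch:
    """Hypothetical stand-in for ArrivalOrder; not the pyiceberg class."""

    concurrent_streams: int = 1  # default assumed for illustration only
    batch_size: Optional[int] = None
    max_buffered_batches: int = 16

    def __post_init__(self) -> None:
        # Runs right after the generated __init__, so invalid values are
        # rejected before any scan work starts.
        if self.concurrent_streams < 1:
            raise ValueError(f"concurrent_streams must be >= 1, got {self.concurrent_streams}")
        if self.max_buffered_batches < 1:
            raise ValueError(f"max_buffered_batches must be >= 1, got {self.max_buffered_batches}")


def try_build(**kwargs) -> str:
    """Return 'ok' or the validation error message for the given arguments."""
    try:
        ArrivalOrderSketch(**kwargs)
    except ValueError as exc:
        return str(exc)
    return "ok"


print(try_build(concurrent_streams=4))
print(try_build(max_buffered_batches=0))
```

Because the checks run in order, passing two invalid values reports only the `concurrent_streams` error.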
