Skip to content

Worker responsibility ranges#25

Merged
dwerner merged 7 commits intomainfrom
segment-worker-debug
Nov 6, 2025
Merged

Worker responsibility ranges#25
dwerner merged 7 commits intomainfrom
segment-worker-debug

Conversation

@dwerner
Copy link
Collaborator

@dwerner dwerner commented Nov 6, 2025

No description provided.

- Add grpc_request_duration_blocks histogram metric
- Add grpc_request_duration_transactions histogram metric
- Add grpc_request_duration_logs histogram metric
- Track per-segment request latency for blocks, transactions, logs
- Histogram buckets: 1ms to 1 hour for comprehensive latency tracking

Enables detailed performance analysis of bridge request patterns and helps
identify slow segments or data types during sync operations.
- Track start/end block in SegmentWorkerState
- Encode responsibility range as (u64, u64) in FlightData.app_metadata
- Include range even when batch has 0 rows (empty blocks)
- Wire up metrics for block/transaction/log request durations

Bridge now tells phaser-query which blocks were checked, not just which
blocks had data. This enables gap detection to distinguish between
'unchecked blocks' and 'checked but empty blocks'.
- Replace StreamReader with proper FlightData decoding
- Use arrow_ipc::root_as_message to extract schema from first message
- Use arrow_flight::utils::flight_data_to_arrow_batch per message
- Preserve app_metadata access for responsibility range tracking
- Remove unused subscribe() method (replaced by subscribe_with_metadata)
- Increase max message size to 256MB for large batches

Bug: Code incorrectly used StreamReader::try_new() which expected complete
IPC streams, but FlightData contains single RecordBatch messages. This
caused RangeOutOfBounds errors that blocked all syncing.

Fix: Decode each FlightData as a single message using arrow_flight::utils,
matching the FlightRecordBatchStream implementation pattern.
- Add update_responsibility_end() to accumulate max responsibility
- Simplify finalize_current_file() to use full responsibility range
- Remove per-file responsibility_start (tracked at writer level)
- Always write responsibility_end from bridge, not data_end

Parquet files now claim responsibility for checked-but-empty blocks,
enabling gap scanner to correctly report completion status.
- Use subscribe_with_metadata() for blocks, transactions, logs
- Extract responsibility range from each batch
- Call writer.update_responsibility_end() as batches arrive
- Add StreamingNotInitialized error for timeout diagnostics
- Add is_streaming_enabled check in sync service

Workers now accumulate responsibility ranges from bridge metadata and
update parquet writers accordingly. Gap scanner can now distinguish
between missing data and legitimately empty blocks.
- Use subscribe_with_metadata() in spawn_stream_processor
- Extract responsibility ranges from batches (currently unused)
- Add TODO for future responsibility tracking in live mode

Maintains compatibility with new subscription API. Responsibility tracking
in live mode will be implemented in future work.
- Add BatchMetadata struct with ResponsibilityRange field
- Document extensibility pattern for future metadata fields
- Add encode/decode methods with proper error handling
- Update subscribe_with_metadata() to return BatchMetadata (not Optional)
- Update all workers to use BatchMetadata instead of Option<(u64, u64)>
- Update bridge to use encode_metadata()
- Deprecate old tuple-based encode/decode methods

Benefits:
- Type safety: ResponsibilityRange is now required, not optional
- Extensibility: Can add new fields (compression_ratio, split_index, etc.)
- Clear contract: subscribe_with_metadata() guarantees metadata or fails
- Better documentation: Comments explain usage and evolution

The BatchMetadata struct is designed to grow over time. See lib.rs for
guidelines on adding new fields while maintaining compatibility.
@dwerner dwerner merged commit a1b0524 into main Nov 6, 2025
5 checks passed
@dwerner dwerner deleted the segment-worker-debug branch November 6, 2025 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant