Skip to content

feat: implement unlimited pagination for message browsing#2128

Draft
c-julin wants to merge 11 commits intomasterfrom
jc/unlimited-pagination
Draft

feat: implement unlimited pagination for message browsing#2128
c-julin wants to merge 11 commits intomasterfrom
jc/unlimited-pagination

Conversation

@c-julin
Copy link
Contributor

@c-julin c-julin commented Jan 12, 2026

Summary

Implements unlimited pagination for Kafka message browsing, allowing users to navigate through arbitrarily large topics without loading all messages into memory.

Changes

Backend

  • Add stateless page token system using base64-encoded cursors
  • Implement round-robin message distribution across partitions
  • Add pagination fields to ListMessages proto (next_page_token)
  • Integration tests for multi-page scenarios

Frontend

  • Add "Load More" button for paginated message loading
  • Auto-load next page when scrolling to bottom (if no filters/limits)
  • Display loading state and message counts
  • Reset pagination when changing filters or page size

Key Features

  • Stateless pagination (no server-side state)
  • Consistent round-robin distribution ensures fair partition representation
  • Handles dynamic watermark changes between pages
  • Backward compatible with existing single-page mode

@github-actions
Copy link
Contributor

github-actions bot commented Jan 12, 2026

The latest Buf updates on your PR. Results from workflow Buf CI / validate (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedJan 23, 2026, 9:29 AM

@c-julin c-julin force-pushed the jc/unlimited-pagination branch from f0131ec to ef4921d Compare January 22, 2026 10:51
Add PageToken struct with encode/decode, validation, and cursor management for multi-partition offset tracking. Supports both ascending and descending order pagination.
Add page_token, page_size, next_page_token, and has_more fields to support cursor-based pagination in ListMessages API.
- Add PageToken and PageSize to ListMessagesRequest
- Pass next_page_token and has_more in completion message
- Validate pagination mode (maxResults=-1) and reject filters
- Add calculateConsumeRequestsWithPageToken with round-robin message distribution
- Support both ascending and descending order pagination
- Reverse messages per-partition for descending order
- Fix offset boundary check (inclusive EndOffset)
- Track finished partitions to prevent over-consumption
- Test first page and multi-page pagination
- Verify descending order within partitions
- Test filter rejection with pagination mode
- Verify legacy mode compatibility
- Add pageToken/nextPageToken to message search state
- Auto-load next page when user reaches last loaded page
- Reset pagination when navigating back to page 1
- Disable sorting in unlimited mode (maxResults=-1)
- Hide unlimited option in Latest/Live mode
- Auto-reset to 50 when switching to Latest/Live from unlimited
…eter

Remove hasMore as an explicit parameter from IListMessagesProgress.OnComplete().
The hasMore field is now automatically derived from whether nextPageToken is empty.

This simplifies the interface by removing redundant state - the information about
whether more pages exist is already encoded in the presence/absence of nextPageToken.

Changes:
- Remove hasMore parameter from OnComplete() interface (4 params -> 3 params)
- Derive HasMore automatically: HasMore = nextPageToken != ""
- Update all call sites to pass 3 parameters instead of 4
- Update integration tests to check nextPageToken instead of hasMore
- Update mocks to match new 3-parameter signature

The protobuf has_more field is still populated for wire compatibility, but is no
longer passed as a separate parameter through the application layers.
- Fix mock expectations in integration tests to use 3 parameters
- Remove HasMore field from stream completion message struct
- Fix page token test to handle unordered partition iteration
Frontend fixes:
- Fix stale closure race condition in auto-load effect by capturing
  messageSearch reference and validating before state update
- Add proper cleanup for loadMore operations on component unmount
- Fix circular effect dependency in page reset by using ref instead
  of state in dependency array
- Add retry limit (max 3 attempts) for loadMore failures to prevent
  infinite retry loops on network errors
- Track mounted state to prevent state updates on unmounted component

Backend fix:
- Change pagination logging from INFO to DEBUG level to reduce
  log verbosity in production
@c-julin c-julin force-pushed the jc/unlimited-pagination branch from ef4921d to 2f955cf Compare January 22, 2026 18:13
The append parameter was referenced but never declared, causing loadMore
to always clear messages instead of appending them.
@malinskibeniamin
Copy link
Contributor

@claude review

@claude
Copy link

claude bot commented Feb 4, 2026

Claude finished @malinskibeniamin's task in 2m 1s —— View job


Code Review Complete ✅

  • Read repository guidelines (CLAUDE.md)
  • Review backend pagination implementation
  • Review integration tests and test coverage
  • Review frontend changes
  • Review proto changes
  • Analyze for bugs, edge cases, and security issues
  • Post comprehensive review feedback

Summary

This is a well-architected implementation of stateless pagination for Kafka message browsing. The approach using base64-encoded page tokens is solid, and the round-robin distribution ensures fair partition representation. The code is well-tested with comprehensive integration tests. However, I've identified several issues and improvements across the implementation.

Critical Issues

1. Race Condition in Offset Calculation (backend/pkg/console/list_messages.go:1081)

Issue: The condition record.Offset >= partitionReq.EndOffset stops reading too early because EndOffset is inclusive in the consume request but this comparison treats it as exclusive.

// Current code (line 1081)
if record.Offset >= partitionReq.EndOffset {

Problem: If EndOffset is 100, a message at offset 100 should be consumed, but this check would mark the partition as finished before consuming it. This causes off-by-one errors in message counts.

Fix: Change to > since EndOffset is inclusive:

if record.Offset > partitionReq.EndOffset {

2. Potential Integer Overflow in Token Validation (backend/pkg/console/page_token.go:104)

Issue: The page size validation uses hardcoded 500 but doesn't account for potential multiplication with partition count in memory calculations.

if pt.PageSize <= 0 || pt.PageSize > 500 {

Concern: With many partitions (e.g., 1000 partitions × 500 messages), the system could attempt to fetch 500,000 messages in memory. This could cause OOM issues.

Recommendation: Add a warning or consider validating PageSize * PartitionCount doesn't exceed a safe threshold (e.g., 10,000 total messages).

3. Unchecked Error in Frontend LoadMore (frontend/src/components/pages/topics/Tab.Messages/index.tsx:759)

Issue: The loadMore() call doesn't validate that the search is still valid before appending messages.

capturedMessageSearch
  .loadMore()
  .then(() => {
    if (isMountedRef.current && currentMessageSearchRef.current === capturedMessageSearch) {
      setMessages([...capturedMessageSearch.messages]);

Problem: If the search parameters changed between pages, stale data could be displayed. The check for currentMessageSearchRef.current === capturedMessageSearch helps, but messages could already be appended to capturedMessageSearch.messages before this check.

Recommendation: Add validation that search parameters haven't changed, or clear messages on parameter change.

High Priority Issues

4. Missing Watermark Update Validation (backend/pkg/console/list_messages.go:598)

Issue: When watermarks change between pages, the code only checks if partition is exhausted but doesn't validate that NextOffset is still within the new watermark range.

if token.Direction == DirectionDescending {
    isExhausted = cursor.NextOffset < updatedLowWaterMark
} else {
    isExhausted = cursor.NextOffset >= updatedHighWaterMark
}

Problem: If low watermark moved forward due to compaction, NextOffset could be valid but pointing to already-deleted data, causing read errors.

Fix: Add validation and adjustment:

if token.Direction == DirectionDescending {
    if cursor.NextOffset < updatedLowWaterMark {
        isExhausted = true
    }
} else {
    if cursor.NextOffset < updatedLowWaterMark {
        // Adjust NextOffset to new low watermark
        cursor.NextOffset = updatedLowWaterMark
    }
    if cursor.NextOffset >= updatedHighWaterMark {
        isExhausted = true
    }
}

5. Inconsistent Direction Validation (backend/pkg/console/list_messages.go:250-254)

Issue: The direction determination logic only considers StartOffsetOldest for ascending, defaulting everything else to descending.

direction := DirectionDescending
if listReq.StartOffset == StartOffsetOldest {
    direction = DirectionAscending
}

Problem: This is inconsistent with the PR description stating "Newest messages first" for descending. If StartOffset is a specific offset (not -2 or -3), the user's intended direction is unclear.

Recommendation:

  • Document this behavior clearly in the proto file
  • Consider adding an explicit direction field to ListMessagesRequest for clarity

6. Memory Leak in Frontend Auto-Load (frontend/src/components/pages/topics/Tab.Messages/index.tsx:732-801)

Issue: The auto-load effect doesn't clean up the abort controller if the effect re-runs before the previous load completes.

useEffect(() => {
    // ... auto-load logic
    const abortController = new AbortController();
    loadMoreAbortRef.current = abortController;

Problem: If the effect re-runs (e.g., pageIndex changes), a new abort controller is created but the old one isn't aborted, potentially leaving orphaned requests.

Fix: Add cleanup:

useEffect(() => {
    return () => {
        if (loadMoreAbortRef.current) {
            loadMoreAbortRef.current.abort();
            loadMoreAbortRef.current = null;
        }
    };
}, [pageIndex, maxResults, /* other deps */]);

Medium Priority Issues

7. Token Encoding Size Not Validated (backend/pkg/console/page_token.go:41)

Issue: No validation that the encoded token size is reasonable for URL transmission.

Concern: With 1000 partitions, the token could be several KB, potentially exceeding URL length limits or gRPC metadata limits.

Recommendation: Add size check after encoding and return error if token exceeds ~8KB.

8. Missing Test Coverage for Edge Cases (backend/pkg/console/page_token_test.go)

Observed: Tests cover basic scenarios well, but missing tests for:

  • Watermark changes between pages (compaction scenarios)
  • Token size limits with many partitions
  • Direction switching mid-pagination
  • Partition count changes mid-pagination (currently returns error, but no test validates the error message)

Recommendation: Add integration tests for these scenarios.

9. Frontend Retry Logic Could Loop Indefinitely (frontend/src/components/pages/topics/Tab.Messages/index.tsx:741)

Issue: The retry counter loadMoreFailures is never reset to 0 except on success.

if (loadMoreFailures >= MAX_LOAD_MORE_RETRIES) {
    return;
}

Problem: If load fails 3 times, no more auto-loading will occur even if the user navigates away and back. The counter is only reset on success or when resetting pagination (line 813).

Recommendation: Reset loadMoreFailures when search parameters change or user manually triggers search.

10. Descending Order Reversal Could Be Memory Intensive (backend/pkg/console/list_messages.go:877-893)

Issue: Messages are collected in memory per-partition and then reversed.

messagesPerPartition := make(map[int32][]*TopicMessage)
// ... collect messages ...
// Reverse messages in each partition
for partitionID, messages := range messagesPerPartition {

Concern: With 500 messages × 100 partitions, this could temporarily double memory usage during reversal.

Recommendation: Consider reversing as messages are sent rather than collecting all first, or use a more memory-efficient reversal method.

Low Priority / Style Issues

11. Magic Numbers in Tests (backend/pkg/api/handle_topic_messages_integration_test.go:1071)

const messageCount = 150

Recommendation: Extract to a constant with clear name explaining why 150 (e.g., testMessageCountForPagination).

12. Inconsistent Error Messages (backend/pkg/console/page_token.go)

Observation: Some validation errors include context (partition ID, values) while others don't.

Example: Line 117 includes partition info, but line 84 doesn't include the actual partition count.

Recommendation: Standardize error messages to always include relevant context values.

13. Frontend: Unused pageToken in Legacy Mode (frontend/src/state/backend-api.ts:3160)

Issue: When calling loadMore(), the code creates a new request with pageToken: this.nextPageToken, but this path is only reachable in pagination mode (maxResults === -1).

Observation: Not a bug per se, but the code structure could be clearer about mode separation.

Positive Observations

  1. Excellent Test Coverage: Integration tests thoroughly validate multi-page scenarios and edge cases.

  2. Stateless Design: The page token approach is well-implemented and allows for horizontal scaling.

  3. Round-Robin Distribution: Fair partition representation is correctly implemented.

  4. Error Handling: Comprehensive error handling throughout, with proper validation.

  5. Backward Compatibility: Legacy mode preserved, ensuring existing functionality continues to work.

  6. Frontend UX: Auto-loading and manual "Load More" provide good user experience.

Recommendations for Follow-up

  1. Add monitoring/metrics for:

    • Token decode failures
    • Watermark change frequency
    • Average pages per search session
    • Token size distribution
  2. Consider adding a max_pages parameter to prevent excessive pagination.

  3. Document pagination behavior in user-facing documentation.

  4. Add E2E tests for frontend pagination flow.


Overall Assessment: This is a solid implementation with good test coverage. The critical issues (#1, #2) should be addressed before merging. The high-priority issues (#4-6) are important for production reliability. Medium and low priority items can be addressed in follow-up PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants