Optimize CloseInactiveSessionsJob: batch Redis calls#2119
Conversation
There was a problem hiding this comment.
Pull request overview
This PR optimizes the CloseInactiveSessionsJob by batching Redis cache operations to reduce network roundtrips and trace noise. Previously, the job made 100–300 sequential GET requests per page of sessions (one per session for heartbeat data and close flags). The new implementation consolidates these into just 2 MGET calls per page using the cache client's GetAllAsync method.
Changes:
- Replaced per-session
GetHeartbeatAsyncmethod with batch-orientedGetHeartbeatsBatchAsyncthat processes an entire page at once - Changed from Dictionary keyed by domain objects to index-aligned arrays for tracking heartbeat results
- Maintained all existing business logic including session ID priority over user identity fallback, duplicate detection, and close flag handling
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ach loop Address PR feedback: - Remove weird IReadOnlyList cast at call site - GetHeartbeatsBatchAsync now returns (PersistentEvent, HeartbeatResult?) tuples - Caller uses simple foreach instead of index-based for loop - Accept IReadOnlyCollection to avoid cast at call site
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| public MessageService( | ||
| IOrganizationRepository organizationRepository, | ||
| IProjectRepository projectRepository, | ||
| IUserRepository userRepository, | ||
| IStackRepository stackRepository, | ||
| IEventRepository eventRepository, | ||
| ITokenRepository tokenRepository, | ||
| IWebHookRepository webHookRepository, | ||
| IConnectionMapping connectionMapping, | ||
| AppOptions options, | ||
| ILoggerFactory loggerFactory) |
There was a problem hiding this comment.
The MessageService.cs refactoring appears to be unrelated to the PR's stated purpose of optimizing CloseInactiveSessionsJob. While the refactoring looks reasonable (consolidating duplicate handler registration code into a generic method), it should ideally be in a separate PR for better change isolation and easier review/rollback. This makes the PR scope broader than described in the title and description.
Summary
The
CloseInactiveSessionsJobwas generating excessive Redis traffic — primarily individualGETcommands — causing noisy traces and unnecessary latency. This PR batches all cache reads into just 2 RedisMGETcalls per page instead of 100–300 sequentialGETcalls.Problem
For each open session (up to 100 per page), the job previously called
GetHeartbeatAsyncwhich issued 1–3 individual Redis GETs:GETfor the session-ID-based heartbeat timestampGETfor the user-identity-based heartbeat timestamp (fallback if Add Requirements Section #1 misses)GETfor the close flag (if a heartbeat was found)With 100 sessions per page, this produced 100–300 sequential Redis roundtrips per page, each generating its own trace span.
Solution
Replaced the per-session
GetHeartbeatAsync/GetLastHeartbeatActivityUtcAsyncmethods with a singleGetHeartbeatsBatchAsyncmethod that processes an entire page in 5 phases:GetAllAsync<DateTime>for all heartbeat timestampsGetAllAsync<bool>for all close flagsAdditional improvement
Dictionary<PersistentEvent, HeartbeatResult>keyed by domain objects with index-aligned arrays (HeartbeatResult?[]), avoiding reliance onGetHashCode/Equalsand reducing allocations.Impact
All existing session-closing semantics (duplicate detection, inactive period, close flags) are preserved.