Skip to content

Enhance crontab job execution with re-entry protection#1317

Open
adenchen123 wants to merge 3 commits intoSciSharp:masterfrom
adenchen123:GTP-12762
Open

Enhance crontab job execution with re-entry protection#1317
adenchen123 wants to merge 3 commits intoSciSharp:masterfrom
adenchen123:GTP-12762

Conversation

@adenchen123
Copy link
Copy Markdown
Contributor

No description provided.

@qodo-code-review
Copy link
Copy Markdown
Contributor

Review Summary by Qodo

Add distributed re-entry protection for crontab job execution

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Add re-entry protection mechanism to prevent concurrent crontab job execution
• Introduce distributed locking using IDistributedLocker for task synchronization
• Add ReentryProtection boolean flag to CrontabItem model (enabled by default)
• Refactor execution logic with graceful fallback when lock acquisition fails
• Improve logging with detailed lock acquisition and failure messages
Diagram
flowchart LR
  A["CrontabItem<br/>+ ReentryProtection flag"] --> B["ExecuteTimeArrivedItemWithReentryProtection"]
  B --> C{"ReentryProtection<br/>enabled?"}
  C -->|Yes| D["Acquire Distributed Lock"]
  C -->|No| E["Execute directly"]
  D --> F{"Lock<br/>acquired?"}
  F -->|Yes| G["Execute Task"]
  F -->|No| H["Skip execution<br/>log warning"]
  D --> I{"Exception<br/>occurred?"}
  I -->|Before lock| J["Execute without lock"]
  I -->|After lock| K["Log warning"]
  G --> L["Release Lock"]
  E --> M["Complete"]
  J --> M
  K --> M
Loading

Grey Divider

File Changes

1. src/Infrastructure/BotSharp.Abstraction/Crontab/ICrontabService.cs ✨ Enhancement +1/-0

Add re-entry protected execution interface method

• Add new interface method ExecuteTimeArrivedItemWithReentryProtection for protected task
 execution

src/Infrastructure/BotSharp.Abstraction/Crontab/ICrontabService.cs


2. src/Infrastructure/BotSharp.Abstraction/Crontab/Models/CrontabItem.cs ✨ Enhancement +3/-0

Add re-entry protection flag to model

• Add ReentryProtection boolean property with default value of true
• Property controls whether distributed locking is applied during execution

src/Infrastructure/BotSharp.Abstraction/Crontab/Models/CrontabItem.cs


3. src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs ✨ Enhancement +59/-0

Implement distributed locking for re-entry protection

• Add IDistributedLocker dependency import for distributed locking
• Implement ExecuteTimeArrivedItemWithReentryProtection method with lock-based synchronization
• Extract execution logic into private ExecuteTimeArrivedItem method with error handling
• Add comprehensive logging for lock acquisition, failures, and exceptions
• Implement graceful fallback to execute without lock if Redis exceptions occur

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs


View more (1)
4. src/Infrastructure/BotSharp.OpenAPI/Controllers/Crontab/CrontabController.cs ✨ Enhancement +5/-16

Refactor controller to use protected execution service

• Update log message format for consistency and clarity
• Refactor ExecuteTimeArrivedItem to delegate to service's protected execution method
• Remove duplicate error handling logic (moved to service layer)
• Simplify controller method by removing try-catch block

src/Infrastructure/BotSharp.OpenAPI/Controllers/Crontab/CrontabController.cs


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown
Contributor

qodo-code-review bot commented Apr 1, 2026

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (1) 📎 Requirement gaps (0)

Grey Divider


Action required

1. Lock key too coarse 🐞 Bug ≡ Correctness
Description
ExecuteTimeArrivedItemWithReentryProtection builds the distributed lock key from only
CrontabItem.Title, so different crontab items (e.g., different ConversationId/AgentId) sharing the
same Title will block each other and cause skipped executions.
Code

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs[R140-158]

+        var lockKey = $"crontab:execution:{item.Title}";
+        using var scope = _services.CreateScope();
+        var locker = scope.ServiceProvider.GetRequiredService<IDistributedLocker>();
+        var acquired = false;
+        var lockAcquired = false;
+
+        try
+        {
+            acquired = await locker.LockAsync(lockKey, async () =>
+            {
+                lockAcquired = true;
+                _logger.LogInformation("Crontab: {0}, Distributed lock acquired, beginning execution...", item.Title);
+                await ExecuteTimeArrivedItem(item);
+            }, timeout: 600);
+
+            if (!acquired)
+            {
+                _logger.LogWarning("Crontab: {0}, Failed to acquire distributed lock, task is still executing, skipping this occurrence to prevent re-entry.", item.Title);
+            }
Evidence
The lock key is based solely on Title, but persisted crontab items are uniquely upserted by
ConversationId (not Title), so multiple items can legally share the same Title across different
conversations/agents; those distinct items would contend on the same lock and be incorrectly
skipped.

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs[132-158]
src/Infrastructure/BotSharp.Abstraction/Crontab/Models/CrontabItem.cs[1-42]
src/Infrastructure/BotSharp.Core.Crontab/Functions/ScheduleTaskFn.cs[38-47]
src/Plugins/BotSharp.Plugin.MongoStorage/Repository/MongoRepository.Crontab.cs[8-25]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`ExecuteTimeArrivedItemWithReentryProtection` uses `crontab:execution:{item.Title}` as the distributed lock key. Since crontab items are persisted/upserted by `ConversationId` (and items also vary by `AgentId`/`UserId`), different jobs can share a title and incorrectly block each other, leading to skipped executions.
### Issue Context
- Crontab items in Mongo are upserted by `ConversationId`, not by `Title`, so `Title` is not a safe global unique identifier.
- The lock key should incorporate the job's real identity (at least `ConversationId`, and likely `AgentId` too).
### Fix Focus Areas
- src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs[132-158]
### Implementation notes
- Build the lock key from stable identifiers, e.g.:
- `crontab:execution:{item.TriggerType}:{item.AgentId}:{item.ConversationId}:{item.Title}`
- If some sources don’t have `ConversationId`, fall back to a safe placeholder like `global`.
- Keep the log message consistent with the new key (include key parts for debugging).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Reentry flag not persisted🐞 Bug ≡ Correctness
Description
CrontabItem.ReentryProtection is added to the domain model, but Mongo persistence mapping omits it,
so the value cannot round-trip and will silently revert to the default on load.
Code

src/Infrastructure/BotSharp.Abstraction/Crontab/Models/CrontabItem.cs[R35-37]

+    [JsonPropertyName("reentry_protection")]
+    public bool ReentryProtection { get; set; } = true;
+
Evidence
CrontabItem now includes ReentryProtection, but the Mongo document type and mapping methods do not
have or map this field, so persisted records cannot store the setting and loaded items will always
use the model default.

src/Infrastructure/BotSharp.Abstraction/Crontab/Models/CrontabItem.cs[32-42]
src/Plugins/BotSharp.Plugin.MongoStorage/Collections/CrontabItemDocument.cs[5-62]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`CrontabItem.ReentryProtection` was added to the domain model, but Mongo persistence (`CrontabItemDocument` + mapping methods) does not include this field, so disabling protection (or setting it explicitly) will not persist.
### Issue Context
`CrontabItemDocument.ToDomainModel` / `ToMongoModel` are manually mapping fields; any new domain property must be mapped explicitly.
### Fix Focus Areas
- src/Infrastructure/BotSharp.Abstraction/Crontab/Models/CrontabItem.cs[32-42]
- src/Plugins/BotSharp.Plugin.MongoStorage/Collections/CrontabItemDocument.cs[5-62]
### Implementation notes
- Add `public bool ReentryProtection { get; set; }` to `CrontabItemDocument`.
- Map it in both directions:
- `ToDomainModel`: `ReentryProtection = item.ReentryProtection`
- `ToMongoModel`: `ReentryProtection = item.ReentryProtection`
- If there are existing stored docs, ensure the default behavior is acceptable when the field is missing (Mongo default false unless handled); consider defaulting to `true` when absent if that matches intended behavior.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

3. ex.Message logged in crontab warnings 📘 Rule violation ⛨ Security
Description
The new distributed-lock error handling logs raw ex.Message, which can contain sensitive internal
details (e.g., Redis hostnames, stack fragments, configuration hints). This increases the risk of
leaking internal configuration through logs and should be replaced with sanitized/structured
exception logging.
Code

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs[R160-170]

+        catch (Exception ex)
+        {
+            if (!lockAcquired)
+            {
+                _logger.LogWarning("Crontab: {0}, Redis exception occurred before acquiring lock: {1}, executing without lock protection (re-entry protection disabled).", item.Title, ex.Message);
+                await ExecuteTimeArrivedItem(item);
+            }
+            else
+            {
+                _logger.LogWarning("Crontab: {0}, Redis exception occurred after lock acquired: {1}, task execution completed but lock release failed.", item.Title, ex.Message);
+            }
Evidence
PR Compliance ID 3 requires avoiding sensitive/internal configuration leakage in logs; the new code
logs ex.Message directly in warning logs during distributed-lock failures, potentially exposing
internal details.

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs[160-170]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`CrontabService.ExecuteTimeArrivedItemWithReentryProtection` logs raw `ex.Message` in warning logs. Exception messages from infrastructure providers (e.g., Redis) may contain sensitive internal configuration details and should not be logged verbatim.
## Issue Context
Compliance requires minimizing sensitive/internal configuration exposure in logs and using safer structured exception logging.
## Fix Focus Areas
- src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs[160-170]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Fire-and-forget exceptions unlogged 🐞 Bug ☼ Reliability
Description
SchedulingCrontab uses Task.Run without awaiting and the new ExecuteTimeArrivedItem wrapper no
longer catches/logs exceptions, so failures (e.g., DI resolution) can become unobserved task
exceptions with no application logging.
Code

src/Infrastructure/BotSharp.OpenAPI/Controllers/Crontab/CrontabController.cs[R63-66]

+                _logger.LogInformation($"Crontab: {item.Title}, One occurrence was matched, attempting to execute...");
               Task.Run(() => ExecuteTimeArrivedItem(item, _services));
               result.OccurrenceMatchedItems.Add(item.Title);
           }
Evidence
The scheduler starts background work via Task.Run and drops the returned Task. The wrapper method
now just resolves services and awaits execution without any try/catch, so exceptions before
CrontabService’s internal handling (e.g., CreateScope/GetRequiredService) will be unobserved and not
logged by this controller.

src/Infrastructure/BotSharp.OpenAPI/Controllers/Crontab/CrontabController.cs[59-66]
src/Infrastructure/BotSharp.OpenAPI/Controllers/Crontab/CrontabController.cs[87-92]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`SchedulingCrontab` fires `Task.Run(...)` and does not observe the returned `Task`. The refactored `ExecuteTimeArrivedItem` no longer wraps execution in `try/catch`, so exceptions can be lost (or only surface as unobserved task exceptions).
### Issue Context
Even if `ExecuteTimeArrivedItemWithReentryProtection` catches Redis/lock errors, exceptions can still occur before entering it (scope creation / DI resolution).
### Fix Focus Areas
- src/Infrastructure/BotSharp.OpenAPI/Controllers/Crontab/CrontabController.cs[59-66]
- src/Infrastructure/BotSharp.OpenAPI/Controllers/Crontab/CrontabController.cs[87-92]
### Implementation notes
- Prefer `_ = Task.Run(async () => { try { await ExecuteTimeArrivedItem(item, _services); } catch (Exception ex) { _logger.LogError(ex, "Crontab: {Title} background execution failed", item.Title); } });`
- Alternatively, avoid `Task.Run` and use a background queue/hosted service (if available in this codebase) so execution is tracked and failures are logged.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Lock TTL not configurable 🐞 Bug ☼ Reliability
Description
The distributed lock expiry is hard-coded to 600 seconds, so if a crontab execution exceeds that
duration the lock can expire and allow another instance to acquire it while the first is still
running, defeating re-entry protection for long-running jobs.
Code

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs[R148-154]

+            acquired = await locker.LockAsync(lockKey, async () =>
+            {
+                lockAcquired = true;
+                _logger.LogInformation("Crontab: {0}, Distributed lock acquired, beginning execution...", item.Title);
+                await ExecuteTimeArrivedItem(item);
+            }, timeout: 600);
+
Evidence
CrontabService passes timeout: 600 to the locker. DistributedLocker uses this value as the Redis
lock expiry (option.Expiry(timeout)), and the code does not configure any renewal/extension, so
the lock can lapse mid-execution if the action runs longer than the expiry.

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs[148-154]
src/Infrastructure/BotSharp.Core/Infrastructures/DistributedLocker.cs[20-46]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The lock expiry is currently hard-coded (`timeout: 600`). If a job runs longer than 10 minutes, the lock can expire and another execution can start concurrently.
### Issue Context
`DistributedLocker.LockAsync` sets Redis lock expiry via `option.Expiry(timeout)` and does not show any renewal.
### Fix Focus Areas
- src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs[148-154]
- src/Infrastructure/BotSharp.Core/Infrastructures/DistributedLocker.cs[20-46]
### Implementation notes
- Make the expiry configurable or derived from `CrontabItem` (e.g., `ExpireSeconds` or a new `LockExpirySeconds`).
- Ensure the expiry comfortably exceeds maximum expected execution time (or implement renewal if the locking library supports it).
- While touching this code, consider fixing the misleading log text in the Redis-failure fallback path (it currently claims “re-entry protection disabled” when it’s actually enabled but unavailable due to Redis failure).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

Comment on lines +140 to +158
var lockKey = $"crontab:execution:{item.Title}";
using var scope = _services.CreateScope();
var locker = scope.ServiceProvider.GetRequiredService<IDistributedLocker>();
var acquired = false;
var lockAcquired = false;

try
{
acquired = await locker.LockAsync(lockKey, async () =>
{
lockAcquired = true;
_logger.LogInformation("Crontab: {0}, Distributed lock acquired, beginning execution...", item.Title);
await ExecuteTimeArrivedItem(item);
}, timeout: 600);

if (!acquired)
{
_logger.LogWarning("Crontab: {0}, Failed to acquire distributed lock, task is still executing, skipping this occurrence to prevent re-entry.", item.Title);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Lock key too coarse 🐞 Bug ≡ Correctness

ExecuteTimeArrivedItemWithReentryProtection builds the distributed lock key from only
CrontabItem.Title, so different crontab items (e.g., different ConversationId/AgentId) sharing the
same Title will block each other and cause skipped executions.
Agent Prompt
### Issue description
`ExecuteTimeArrivedItemWithReentryProtection` uses `crontab:execution:{item.Title}` as the distributed lock key. Since crontab items are persisted/upserted by `ConversationId` (and items also vary by `AgentId`/`UserId`), different jobs can share a title and incorrectly block each other, leading to skipped executions.

### Issue Context
- Crontab items in Mongo are upserted by `ConversationId`, not by `Title`, so `Title` is not a safe global unique identifier.
- The lock key should incorporate the job's real identity (at least `ConversationId`, and likely `AgentId` too).

### Fix Focus Areas
- src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabService.cs[132-158]

### Implementation notes
- Build the lock key from stable identifiers, e.g.:
  - `crontab:execution:{item.TriggerType}:{item.AgentId}:{item.ConversationId}:{item.Title}`
  - If some sources don’t have `ConversationId`, fall back to a safe placeholder like `global`.
- Keep the log message consistent with the new key (include key parts for debugging).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

aden.chen added 2 commits April 1, 2026 09:11
…te CrontabItemDocument to include TriggerType and ReentryProtection properties.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant