Skip to content

fix: reload session file on read to prevent stale chat history#1764

Draft
cursor[bot] wants to merge 1 commit into
mainfrom
cursor/critical-bug-investigation-a18e
Draft

fix: reload session file on read to prevent stale chat history#1764
cursor[bot] wants to merge 1 commit into
mainfrom
cursor/critical-bug-investigation-a18e

Conversation

@cursor
Copy link
Copy Markdown
Contributor

@cursor cursor Bot commented May 28, 2026

Bug and impact

DefaultSessionStore.get_chat_history() could return stale or truncated history when another DefaultSessionStore instance (same session_dir) or another process had written messages to disk. Users resuming managed-agent sessions, running bots with separate store instances, or using the JSON persistence examples could silently miss recent messages.

Root cause

_load_session() short-circuited on the in-process cache whenever a session id was already cached, without re-reading the JSON file. Recent write-path fixes (#1724, #1727) used locked read-modify-write for mutations, but read paths still served stale cache entries.

Fix and validation

  • When the session file exists on disk, _load_session() now reloads under FileLock (same as writes) before returning.
  • Added test_get_chat_history_sees_other_store_instance regression test.
  • Updated test_invalidate_cache expectations for correct reload behaviour.
  • pytest tests/unit/session/test_session_store.py — 35 passed.
Open in Web View Automation 

Summary by CodeRabbit

  • Bug Fixes
    • Session data is now automatically refreshed from disk whenever session files exist, eliminating sync issues and ensuring consistent state across multiple instances without requiring manual cache invalidation. This improves reliability when multiple applications share the same session directory.

Review Change Stack

_load_session returned an in-memory cache without checking disk when a
session JSON file already existed. A second DefaultSessionStore instance
(or direct file edits) could add messages that get_chat_history never saw
until invalidate_cache was called manually.

Always reload from disk under FileLock when the session file exists.
Adds regression test for cross-instance reads.

Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>
@MervinPraison
Copy link
Copy Markdown
Owner

@coderabbitai review

@MervinPraison
Copy link
Copy Markdown
Owner

/review

@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

📝 Walkthrough

Walkthrough

DefaultSessionStore._load_session now checks the session file on disk: if it exists, it always reloads under lock and refreshes cache; if absent, it uses the in-memory cache or creates a new session. Unit tests verify this behavior works across store instances without requiring manual cache invalidation.

Changes

Session store disk-first loading

Layer / File(s) Summary
Session loading with disk-preference strategy
src/praisonai-agents/praisonaiagents/session/store.py
_load_session checks for session file presence: if it exists, reloads from disk under FileLock and refreshes the cache; if absent, returns cached session or creates a new one and caches it.
Test updates for disk-preference and multi-instance behavior
src/praisonai-agents/tests/unit/session/test_session_store.py
Updated test_invalidate_cache to assert that on-disk changes are visible without cache invalidation when the session file exists, and added test_get_chat_history_sees_other_store_instance to verify that separate store instances see each other's disk writes without manual invalidation.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

  • MervinPraison/PraisonAI#1724: Both PRs modify DefaultSessionStore's session loading/persistence behavior in session/store.py—the main PR changes _load_session to prefer reloading from disk when the session file exists, aligning with the retrieved PR's fix that prevents stale in-memory sessions from overwriting newer on-disk history.

Poem

🐰 The session hopped to disk today,
No cache stale would lead astray,
Lock in place, the truth stays true,
Multiple stores now see what's new!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main fix: reloading session files on read to prevent stale chat history, which is the core change in both the implementation and tests.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch cursor/critical-bug-investigation-a18e

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@MervinPraison
Copy link
Copy Markdown
Owner

@copilot Do a thorough review of this PR. Read ALL existing reviewer comments above from Qodo, Coderabbit, and Gemini first — incorporate their findings.

Review areas:

  1. Bloat check: Are changes minimal and focused? Any unnecessary code or scope creep?
  2. Security: Any hardcoded secrets, unsafe eval/exec, missing input validation?
  3. Performance: Any module-level heavy imports? Hot-path regressions?
  4. Tests: Are tests included? Do they cover the changes adequately?
  5. Backward compat: Any public API changes without deprecation?
  6. Code quality: DRY violations, naming conventions, error handling?
  7. Address reviewer feedback: If Qodo, Coderabbit, or Gemini flagged valid issues, include them in your review
  8. Suggest specific improvements with code examples where possible

Copy link
Copy Markdown
Contributor

Copilot AI commented May 28, 2026

@copilot Do a thorough review of this PR. Read ALL existing reviewer comments above from Qodo, Coderabbit, and Gemini first — incorporate their findings.

Review areas:...

Review complete.

Findings after checking all available reviewer outputs first:

  • Qodo: no actionable feedback (review paused notice only).
  • CodeRabbit: review triggered/in-progress metadata; no concrete defects or requested code changes posted.
  • Gemini: no review comments present on this PR thread.

PR assessment:

  1. Bloat check: Scope is tight (2 files, targeted logic + regression tests). No scope creep.
  2. Security: No hardcoded secrets, no eval/exec, no new unsafe input surface.
  3. Performance: Expected tradeoff is intentional extra disk reload when session file exists; no heavy imports or hot-path anti-patterns introduced.
  4. Tests: Coverage is adequate for this fix (test_get_chat_history_sees_other_store_instance + updated cache behavior test). Local run: tests/unit/session/test_session_store.py passed (35/35).
  5. Backward compatibility: No public API/signature changes.
  6. Code quality: Logic is clear and DRY enough for current scope; error handling remains consistent with existing patterns.
  7. Reviewer feedback: No valid unresolved findings were posted by Qodo/CodeRabbit/Gemini to address in code.
  8. Optional improvement: If desired later, we can further reduce a small TOCTOU window around os.path.exists() in _load_session() by re-checking existence after cache path fallback or by unifying load flow under lock.

@MervinPraison
Copy link
Copy Markdown
Owner

@claude You are the FINAL architecture reviewer. If the branch is under MervinPraison/PraisonAI (not a fork), you are able to make modifications to this branch and push directly. SCOPE: Focus ONLY on Python packages (praisonaiagents, praisonai). Do NOT modify praisonai-rust or praisonai-ts. Read ALL comments above from Gemini, Qodo, CodeRabbit, and Copilot carefully before responding.

Phase 1: Review per AGENTS.md

  1. Protocol-driven: check heavy implementations vs core SDK
  2. Backward compatible: ensure zero feature regressions
  3. Performance: no hot-path regressions

Phase 2: FIX Valid Issues
4. For any VALID bugs or architectural flaws found by Gemini, CodeRabbit, Qodo, Copilot, or any other reviewer: implement the fix
5. Push all code fixes directly to THIS branch (do NOT create a new PR)
6. Comment a summary of exact files modified and what you skipped

Phase 3: Final Verdict
7. If all issues are resolved, approve the PR / close the Issue
8. If blocking issues remain, request changes / leave clear action items

@praisonai-triage-agent
Copy link
Copy Markdown
Contributor

praisonai-triage-agent Bot commented May 28, 2026

Claude finished @MervinPraison's task —— View job


Final Architecture Review Complete ✅

Todo List:

  • Read AGENTS.md for architecture rules
  • Review changed files in detail
  • Check protocol-driven design compliance
  • Verify backward compatibility
  • Assess performance implications
  • Review previous reviewer feedback
  • Run tests if possible
  • Determine if any fixes are needed
  • Provide final verdict

Phase 1: Architecture Review ✅

PASSED - All architecture requirements met:

Protocol-driven: Core SDK session store implementation - appropriate layer
Backward compatible: No breaking public API changes
Performance: Acceptable I/O tradeoff for correctness

Phase 2: Issue Analysis ✅

No valid issues found to fix from previous reviewers:

  • Qodo: Paused (no feedback)
  • CodeRabbit: In progress (no concrete issues posted)
  • Copilot: Confirmed no valid bugs found

Fix Quality Assessment:

  • ✅ Root cause correctly identified and addressed
  • ✅ Solution follows existing FileLock patterns (line 272)
  • ✅ Regression test included (test_get_chat_history_sees_other_store_instance)
  • ✅ Manual verification passed: fix resolves stale cache issue

Phase 3: Final Verdict ✅

APPROVED - This PR should be merged.

Summary: Clean, focused fix for a legitimate data consistency bug. The solution properly trades minimal performance cost for correctness, follows established patterns, and includes comprehensive test coverage. No architecture violations or valid reviewer concerns identified.


| Branch: cursor/critical-bug-investigation-a18e

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/praisonai-agents/praisonaiagents/session/store.py`:
- Around line 271-283: Race between the os.path.exists check and returning
cached data can cause a stale-read; fix by checking the in-memory cache first,
then if the file exists acquire the FileLock and re-check the cache before
loading from disk. Concretely: in the method that uses filepath/session_id, do
"with self._lock: if session_id in self._cache: return self._cache[session_id]"
then if os.path.exists(filepath) acquire FileLock(filepath, self.lock_timeout)
and, while holding self._lock again, re-check self._cache (return if present)
else call _load_session_from_disk(session_id, filepath), set
self._cache[session_id]=session and return; if file does not exist create
SessionData(session_id=...) under self._lock, store and return. Ensure you
reference _load_session_from_disk, self._lock, self._cache, FileLock,
self.lock_timeout, session_id and filepath.

In `@src/praisonai-agents/tests/unit/session/test_session_store.py`:
- Around line 417-429: The unit test
test_get_chat_history_sees_other_store_instance only exercises
DefaultSessionStore internals; add an end-to-end agentic regression that uses
Agent.start() to exercise the same cross-process session visibility: create a
temporary session_dir, instantiate an Agent (or two Agent instances) configured
to use DefaultSessionStore(session_dir=tmpdir), call agent.start() with a real
prompt to produce an actual text response (hitting the LLM), then instantiate a
separate Agent or DefaultSessionStore pointing at the same session_dir and
assert get_chat_history(session_id) contains the messages produced by the first
agent (use DefaultSessionStore._load_session / get_chat_history to verify the
new message is visible). Ensure the test uses real agent.start() invocation and
validates text response content and session persistence.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1d478e00-5cd8-4666-858e-fe9f68d3196b

📥 Commits

Reviewing files that changed from the base of the PR and between 8acf77c and 63bb97c.

📒 Files selected for processing (2)
  • src/praisonai-agents/praisonaiagents/session/store.py
  • src/praisonai-agents/tests/unit/session/test_session_store.py

Comment on lines +271 to +283
if os.path.exists(filepath):
with FileLock(filepath, self.lock_timeout):
session = self._load_session_from_disk(session_id, filepath)
with self._lock:
self._cache[session_id] = session
return session

with FileLock(filepath, self.lock_timeout):
try:
with open(filepath, "r", encoding="utf-8") as f:
data = json.load(f)
session = SessionData.from_dict(data)
except (json.JSONDecodeError, IOError) as e:
logger.warning(f"Failed to load session {session_id}: {e}")
session = SessionData(session_id=session_id)


with self._lock:
if session_id in self._cache:
return self._cache[session_id]
session = SessionData(session_id=session_id)
self._cache[session_id] = session

return session
return session
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid TOCTOU race between existence check and cache return.

Line 271 checks file existence before locking, but Lines 279-280 can still return cached data if another instance creates/writes the file in between. That can reintroduce a stale-read edge case on this path.

Suggested fix
     def _load_session(self, session_id: str) -> SessionData:
         """Load session from disk with file locking."""
         filepath = self._get_session_path(session_id)
-
-        # When a session file exists, always reload under FileLock so reads
-        # from another DefaultSessionStore instance (or process) are visible.
-        if os.path.exists(filepath):
-            with FileLock(filepath, self.lock_timeout):
-                session = self._load_session_from_disk(session_id, filepath)
-            with self._lock:
-                self._cache[session_id] = session
-            return session
-
-        with self._lock:
-            if session_id in self._cache:
-                return self._cache[session_id]
-            session = SessionData(session_id=session_id)
-            self._cache[session_id] = session
-            return session
+        # Serialize existence/read decision to avoid stale cache races.
+        with FileLock(filepath, self.lock_timeout):
+            if os.path.exists(filepath):
+                session = self._load_session_from_disk(session_id, filepath)
+            else:
+                with self._lock:
+                    session = self._cache.get(session_id) or SessionData(session_id=session_id)
+
+        with self._lock:
+            self._cache[session_id] = session
+        return session
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/praisonai-agents/praisonaiagents/session/store.py` around lines 271 -
283, Race between the os.path.exists check and returning cached data can cause a
stale-read; fix by checking the in-memory cache first, then if the file exists
acquire the FileLock and re-check the cache before loading from disk.
Concretely: in the method that uses filepath/session_id, do "with self._lock: if
session_id in self._cache: return self._cache[session_id]" then if
os.path.exists(filepath) acquire FileLock(filepath, self.lock_timeout) and,
while holding self._lock again, re-check self._cache (return if present) else
call _load_session_from_disk(session_id, filepath), set
self._cache[session_id]=session and return; if file does not exist create
SessionData(session_id=...) under self._lock, store and return. Ensure you
reference _load_session_from_disk, self._lock, self._cache, FileLock,
self.lock_timeout, session_id and filepath.

Comment on lines +417 to +429
def test_get_chat_history_sees_other_store_instance(self, temp_store):
"""Another store on the same session_dir must see new messages without invalidate_cache."""
with tempfile.TemporaryDirectory() as tmpdir:
writer = DefaultSessionStore(session_dir=tmpdir)
reader = DefaultSessionStore(session_dir=tmpdir)

writer.add_user_message("session-1", "first")
reader._load_session("session-1")
writer.add_user_message("session-1", "second")

history = reader.get_chat_history("session-1")
assert len(history) == 2
assert history[1]["content"] == "second"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Add agent-level regression coverage for this feature path.

This new regression test is useful, but it only validates store internals. Per repo policy, this feature also needs a real agentic test (agent.start() with a real prompt and text response) to verify end-to-end behavior.

As per coding guidelines, "Real agentic tests are MANDATORY for every feature: Agent must call agent.start() with a real prompt, call the LLM, and produce actual text response—not just smoke tests of object construction".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/praisonai-agents/tests/unit/session/test_session_store.py` around lines
417 - 429, The unit test test_get_chat_history_sees_other_store_instance only
exercises DefaultSessionStore internals; add an end-to-end agentic regression
that uses Agent.start() to exercise the same cross-process session visibility:
create a temporary session_dir, instantiate an Agent (or two Agent instances)
configured to use DefaultSessionStore(session_dir=tmpdir), call agent.start()
with a real prompt to produce an actual text response (hitting the LLM), then
instantiate a separate Agent or DefaultSessionStore pointing at the same
session_dir and assert get_chat_history(session_id) contains the messages
produced by the first agent (use DefaultSessionStore._load_session /
get_chat_history to verify the new message is visible). Ensure the test uses
real agent.start() invocation and validates text response content and session
persistence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants