fix: thread-safety in SQLiteSession and AdvancedSQLiteSession with shared lock (Finish) by KirobotDev · Pull Request #2831 · openai/openai-agents-python

KirobotDev · 2026-04-02T15:40:53Z

Thanks for the detailed feedback! I've updated the PR to address all the concerns:

Changes Made

Shared lock moved to base class (sqlite_session.py)

Added self._file_db_lock = threading.Lock() in SQLiteSession.__init__()
Updated all 4 base methods to use the shared lock:
- get_items()
- add_items()
- pop_item()
- clear_session()

AdvancedSQLiteSession now inherits the lock

Removed duplicate lock initialization
Now inherits _file_db_lock from base class
All 7 methods updated to use self._file_db_lock instead of threading.Lock()

Regression tests added

test_concurrent_file_db_access_regression(): Tests 30 concurrent writes from 3 tasks
test_concurrent_mixed_operations_file_db(): Tests interleaved add/get operations

Both tests use file-backed databases (not in-memory) to verify thread safety.

Problem Solved
Previously, threading.Lock() created fresh lock instances per method call → no actual locking. Now both base and derived classes use the same per-instance lock → proper thread safety for concurrent file DB access.

Ready for review!

seratch

Thanks for sharing this. I don't think this patch fully closes the thread-safety gap yet.

AdvancedSQLiteSession.add_items() calls SQLiteSession.add_items() first, and SQLiteSession still uses a fresh threading.Lock() for file-backed DBs. So concurrent writes can still interleave before the advanced-session-specific code takes _file_db_lock.

Could we move the shared file-DB lock to the base SQLiteSession path (or otherwise make both base and advanced code use the same per-instance lock), and add a regression test that exercises concurrent AdvancedSQLiteSession access against a file-backed DB?

KirobotDev · 2026-04-02T23:46:51Z

Thanks for the detailed feedback! You're absolutely right - my fix only addressed the symptom in AdvancedSQLiteSession while leaving the root cause in the base SQLiteSession untouched.

Issues identified:
SQLiteSession.add_items() uses threading.Lock() directly (line 185 in sqlite_session.py)
SQLiteSession.get_items() uses threading.Lock() directly (line 131)
SQLiteSession.pop_item() uses threading.Lock() directly (line 226)
SQLiteSession.clear_session() uses threading.Lock() directly (line 263)

This means concurrent calls to base methods like add_items() can still interleave with AdvancedSQLiteSession operations since they use different lock instances.

Proposed approach:
Move the self._file_db_lock to SQLiteSession.init() so both classes share the same per-instance lock. This requires:
Adding self._file_db_lock = threading.Lock() in base class
Updating all base class methods to use it
AdvancedSQLiteSession will inherit and use the same lock

I'll update the PR to :
Move lock initialization to base SQLiteSession
Update all base class methods using the pattern
Remove duplicate lock creation from AdvancedSQLiteSession
Add a regression test with concurrent file-backed DB access

Should I proceed with this approach, or would you prefer a different solution making the lock injectable or using asyncio.Lock throughout?

- Add shared _file_db_lock in base SQLiteSession class - Replace fresh threading.Lock() calls with shared instance lock - Ensure AdvancedSQLiteSession inherits the same lock - Add regression tests for concurrent file DB access

seratch · 2026-04-04T02:46:55Z

Thanks again for iterating on this. I took a closer look from the current session implementation and I think there’s a slightly different fix shape that addresses a more important race.

The shared per-instance file lock in this PR is an improvement over creating a fresh threading.Lock() in each method call, but AdvancedSQLiteSession.add_items() still does two separate phases: super().add_items(items) first, then _add_structure_metadata(items) in a second to_thread() call and a second lock acquisition. That means another concurrent writer can still insert rows for the same session between those two phases, and then _add_structure_metadata() may resolve the wrong "latest N" message IDs when it builds message_structure.

So I think the stronger fix is:

put a reusable lock/context primitive in the base SQLiteSession and make file-backed sessions that point at the same DB path share the same process-local RLock (not just a per-instance lock);
keep using threading.RLock rather than switching to asyncio.Lock for now, because all sqlite3 work is still executed in worker threads via asyncio.to_thread();
in AdvancedSQLiteSession.add_items(), perform the base message insert and structure metadata insert in one lock-protected critical section, so the selected message IDs and the inserted metadata stay aligned;
route the remaining AdvancedSQLiteSession DB reads/writes through the same lock helper and add a regression test that checks concurrent file-backed add_items() keeps agent_messages and message_structure in a 1:1 correspondence.

I have an alternative at #2843. Please feel free to share feedback if you have any.

github-actions bot added bug Something isn't working feature:extensions labels Apr 2, 2026

seratch requested changes Apr 2, 2026

View reviewed changes

seratch marked this pull request as draft April 2, 2026 23:37

seratch changed the title ~~Fix: Use instance-level threading.Lock for disk-based DBs in AdvancedSQLiteSession~~ fix: use instance-level threading.Lock for disk-based DBs in *SQLiteSession Apr 2, 2026

KirobotDev requested a review from seratch April 3, 2026 01:25

KirobotDev marked this pull request as ready for review April 3, 2026 01:26

Fix: shared threading.Lock for file-backed SQLite sessions

cac8b21

- Add shared _file_db_lock in base SQLiteSession class - Replace fresh threading.Lock() calls with shared instance lock - Ensure AdvancedSQLiteSession inherits the same lock - Add regression tests for concurrent file DB access

KirobotDev force-pushed the patch-1 branch from 8833b08 to cac8b21 Compare April 3, 2026 01:30

github-actions bot added the feature:sessions label Apr 3, 2026

seratch marked this pull request as draft April 3, 2026 05:59

KirobotDev marked this pull request as ready for review April 3, 2026 15:05

KirobotDev changed the title ~~fix: use instance-level threading.Lock for disk-based DBs in *SQLiteSession~~ fix: thread-safety in SQLiteSession and AdvancedSQLiteSession with shared lock Apr 3, 2026

KirobotDev changed the title ~~fix: thread-safety in SQLiteSession and AdvancedSQLiteSession with shared lock~~ fix: thread-safety in SQLiteSession and AdvancedSQLiteSession with shared lock (Finish) Apr 3, 2026

seratch closed this Apr 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: thread-safety in SQLiteSession and AdvancedSQLiteSession with shared lock (Finish)#2831

fix: thread-safety in SQLiteSession and AdvancedSQLiteSession with shared lock (Finish)#2831
KirobotDev wants to merge 1 commit intoopenai:mainfrom
KirobotDev:patch-1

KirobotDev commented Apr 2, 2026 •

edited

Loading

Uh oh!

seratch left a comment

Uh oh!

KirobotDev commented Apr 2, 2026

Uh oh!

seratch commented Apr 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

KirobotDev commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

seratch left a comment

Choose a reason for hiding this comment

Uh oh!

KirobotDev commented Apr 2, 2026

Uh oh!

seratch commented Apr 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

KirobotDev commented Apr 2, 2026 •

edited

Loading