fix(T11552): collision-safe brain_decisions.id — decision-store works under concurrent writes (P1 integrity)#901
Merged
… under concurrent writes (P1 integrity)

Root cause: storeDecision allocated the sequential Dnnn id via a MAX(id)+1 read in application code (nextDecisionId, async with await boundaries) and INSERTed later. Two agents writing in the same instant both read the same MAX(id), both proposed the same id, and the second INSERT collided on the id PRIMARY KEY, losing the decision and forcing CLEO_OWNER_OVERRIDE.

Fix: compute the next id with a MAX(...)+1 subquery INSIDE the INSERT statement (BrainDataAccessor.addDecisionWithSequentialId). node:sqlite runs it synchronously/atomically so concurrent callers cannot interleave between read and write. Also numeric (correct past D999→D1000) and legacy-id tolerant (GLOB 'D[0-9]*'). Bounded retry kept as cross-process defense; never INSERT OR IGNORE so no silent drop.

Regression test: 30 concurrent storeDecision inserts all persist with distinct ids + injected cross-process collision recovery. Both fail on the pre-fix code with the exact UNIQUE constraint failed: brain_decisions.id.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Fixes a P1 Living-Brain integrity defect (T11552): `cleo memory decision-store` hit a repeatable `UNIQUE constraint failed: brain_decisions.id` under concurrent writes, blocking the BRAIN decision-store and forcing the `CLEO_OWNER_OVERRIDE` fallback. Two design agents independently could not anchor a decision atom on 2026-06-01.

Root cause (proven)
`storeDecision` allocated the sequential `Dnnn` id with a `MAX(id)+1` read in application code (`nextDecisionId`, an `async` function with `await` boundaries) and only INSERTed the row later. The full path: `cleo memory decision-store` → `decision.store` op → `engine-compat.ts` → `storeDecision()` → `nextDecisionId()` → `INSERT INTO brain_decisions` (`id PRIMARY KEY`).

Two agents writing in the same instant both read the same `MAX(id)` (e.g. both see `D042`), both compute `D043`, and the second INSERT collides on the `id` primary key. Because concurrent writers always read the same MAX, the collision is deterministic and repeatable, which is exactly the observed symptom.

Fix (root-cause, not band-aid)
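For reference while reading the fix, the pre-fix read-then-write race can be modelled in a few lines. This is a minimal sketch, not the project's code: `RacyTable`, `storeDecisionRacy`, and `demoRace` are hypothetical names, and an in-memory `Map` stands in for the `brain_decisions` table and its primary key.

```typescript
// Hypothetical in-memory stand-in for brain_decisions (assumption, not the real store).
class RacyTable {
  private rows = new Map<string, string>();

  // Async read, like the pre-fix nextDecisionId(): the await lets other callers run.
  async maxNumericId(): Promise<number> {
    await Promise.resolve(); // simulated await boundary between read and write
    let max = 0;
    this.rows.forEach((_text, id) => {
      max = Math.max(max, parseInt(id.slice(1), 10));
    });
    return max;
  }

  // Mimics the PRIMARY KEY on id: a duplicate id is rejected.
  insert(id: string, text: string): void {
    if (this.rows.has(id)) {
      throw new Error("UNIQUE constraint failed: brain_decisions.id");
    }
    this.rows.set(id, text);
  }
}

// Read MAX first, INSERT later: the pattern described under "Root cause".
async function storeDecisionRacy(table: RacyTable, text: string): Promise<string> {
  const next = (await table.maxNumericId()) + 1;
  const id = "D" + String(next).padStart(3, "0");
  table.insert(id, text);
  return id;
}

// Launch several writers in the same instant and count the outcomes.
async function demoRace(writers: number): Promise<{ stored: number; failed: number }> {
  const table = new RacyTable();
  table.insert("D042", "seed decision");
  const outcomes = await Promise.all(
    Array.from({ length: writers }, (_unused, i) =>
      storeDecisionRacy(table, `decision ${i}`).then(() => true, () => false)
    )
  );
  const stored = outcomes.filter((ok) => ok).length;
  return { stored, failed: writers - stored };
}
```

In this model `demoRace(2)` resolves to `{ stored: 1, failed: 1 }`: both writers read `D042` before either inserts, both propose `D043`, and the second insert is rejected.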
The next id is now computed by a `MAX(CAST(substr(id,2) AS INTEGER)) + 1` subquery evaluated inside the INSERT statement (`BrainDataAccessor.addDecisionWithSequentialId`). node:sqlite executes the statement synchronously and atomically, so the id read and the row write are one indivisible operation that concurrent async callers cannot interleave.

- No `await` boundary splits read from write.
- Numeric ordering stays correct across `D999 → D1000` (the prior lexical `ORDER BY id` regressed there).
- `GLOB 'D[0-9]*'` ignores foreign id shapes like `D-abcd`.
- Never `INSERT OR IGNORE` (the exodus anti-pattern), so no decision is silently dropped. A bounded retry remains as defense-in-depth for the genuine cross-process case.

Proof
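As a miniature of what the regression test exercises, the sketch below models the fixed allocation within a single process. It is an illustration under stated assumptions, not the real accessor: `AtomicTable`, `insertWithSequentialId`, `storeDecisionFixed`, and `demoFix` are hypothetical names, an in-memory `Map` replaces the SQLite table, and a regular expression approximates `GLOB 'D[0-9]*'`. The point it shows is that when the MAX read and the row write happen in one synchronous step, concurrent async callers cannot propose the same id.

```typescript
// Approximates GLOB 'D[0-9]*': "D" followed by at least one digit (assumption).
const SEQUENTIAL_ID = /^D[0-9]/;

// Hypothetical in-memory stand-in for brain_decisions (not the real store).
class AtomicTable {
  private rows = new Map<string, string>();

  seed(id: string, text: string): void {
    this.rows.set(id, text);
  }

  // Read MAX and write the row in one synchronous step. There is no await
  // between the two, so no other async caller can run in between.
  insertWithSequentialId(text: string): string {
    let max = 0;
    this.rows.forEach((_text, id) => {
      if (!SEQUENTIAL_ID.test(id)) return; // skip foreign shapes such as D-abcd
      max = Math.max(max, parseInt(id.slice(1), 10)); // numeric, not lexical
    });
    const id = "D" + String(max + 1).padStart(3, "0");
    this.rows.set(id, text);
    return id;
  }
}

// Async caller: any awaits happen before the write, never inside it.
async function storeDecisionFixed(table: AtomicTable, text: string): Promise<string> {
  await Promise.resolve(); // unrelated async work ahead of the write
  return table.insertWithSequentialId(text);
}

// Fan out concurrent writers over a table that is about to cross D999.
async function demoFix(writers: number): Promise<string[]> {
  const table = new AtomicTable();
  table.seed("D998", "seed decision");
  table.seed("D-abcd", "legacy row with a foreign id shape");
  return Promise.all(
    Array.from({ length: writers }, (_unused, i) =>
      storeDecisionFixed(table, `decision ${i}`)
    )
  );
}
```

In this model `demoFix(30)` resolves to 30 distinct ids beginning `D999`, `D1000`. A lexical maximum would rank `D999` above `D1000` and propose `D1000` a second time, which is why the comparison is numeric.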
- Regression test (`packages/core/src/memory/__tests__/decisions.test.ts`): 30 concurrent `storeDecision` calls all persist with distinct ids (a fan-out that exhausts a naive read-then-write retry budget), plus an injected cross-process PRIMARY KEY collision the retry loop recovers from. Both tests fail against pre-fix `origin/main` with the exact `UNIQUE constraint failed: brain_decisions.id` symptom (verified by reverting the source fix).
- Concurrent `cleo memory decision-store` invocations against a migrated brain.db all succeed with distinct ids (`D002`/`D003`/`D004`), no override.

Scope
Only `packages/core/src/memory/decisions.ts`, `packages/core/src/store/memory-accessor.ts`, and the test. Does NOT touch the conduit layer (separate L3 agent owns it). Built on `origin/main` post-#899 (E6-L2 brain rewrite).

Note for follow-up (out of scope)
A separate concurrency defect was observed: two processes both running first-run `CREATE TABLE brain_attention` on a brand-new empty brain.db race on schema bootstrap. This is the first-run-migration path, not the decision-id bug, and is out of T11552's scope.

🤖 Generated with Claude Code