Fix concurrent API race conditions#1574
Open
neolynx wants to merge 11 commits into
Open
Conversation
7fd20e3 to
154615b
Compare
3 tasks
1609873 to
ddcdc79
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1574 +/- ##
==========================================
+ Coverage 77.22% 77.30% +0.08%
==========================================
Files 161 161
Lines 15085 15248 +163
==========================================
+ Hits 11649 11788 +139
Misses 2292 2292
- Partials 1144 1168 +24 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
b88e1f3 to
84fd844
Compare
9c82a70 to
cdc5610
Compare
resource locks need to be before the background task. creating same publish endpoint at the same time is unlikely...
This fixes a flaw in async apis, which loaded the published repo from the DB and mutated it outside the task closure, before the task lock was acquired. * perform collection.LoadComplete inside maybeRunTaskInBackground * have tasks use a fresh copy of taskCollectionFactory, taskCollection
Concurrent tasks were not properly locking their resources, leading to
inconsistent published indexes:
Task A: apiPublishUpdateSwitch loads published, reads source repo/snapshot
Request B: modifies same source repo or snapshot (add/remove packages, etc)
Task A: Update() + Publish() reads stale/modified source -> inconsistent
published index, or partial write if source deleted mid-task.
This changes introduces resource locking for publishing:
* SourceLocalRepo: iterate published.Sources (component -> source UUID), look up each local repo via localRepoCollection.ByUUID and append
string(repo.Key()) to resources
* SourceSnapshot: iterate b.Snapshots,look up each snapshot via snapshotCollection.ByName and append
string(snapshot.ResourceKey()) to resources.
This fixes a flaw in async apis, which loaded data from the
DB and mutated it outside the task closure, before the task lock was acquired.
* perform collection.LoadComplete inside maybeRunTaskInBackground
* have tasks use a fresh copy of taskCollectionFactory, taskCollection
This fixes a flaw in async apis, which loaded data from the DB and mutated it outside the task closure, before the task lock was acquired. * perform collection.LoadComplete inside maybeRunTaskInBackground * have tasks use a fresh copy of taskCollectionFactory, taskCollection
This fixes a flaw in async apis, which loaded data from the DB and mutated it outside the task closure, before the task lock was acquired. * perform collection.LoadComplete inside maybeRunTaskInBackground * have tasks use a fresh copy of taskCollectionFactory, taskCollection
Race condition iexisted where task State, err, and processReturnValue fields
were written by consumer goroutine and read by concurrent accessors without
proper synchronization, causing torn reads and data races.
Implemented single-lock model with optimal lock scope:
- Removed 8 accessor methods (direct field access is simpler)
- Lock only during brief state transitions (IDLE→RUNNING, RUNNING→SUCCEEDED/FAILED)
- Release lock during task.process() execution to enable full concurrency
- Readers hold list.Lock() only during atomic struct copy
- Moved State = RUNNING before goroutine spawn for clearer semantics
- task/list.go: RunTaskInBackground() copies *task before unlock,
returns the pre-made copy instead of dereferencing after unlock
* revert mutex on LinkFromPool * add tests
### api/snapshot.go — fix apiSnapshotsCreate The function was building the new snapshot's ref list only from b.PackageRefs, completely ignoring SourceSnapshots package contents (they were stored as provenance metadata only). The fix mirrors the approach from apiSnapshotsMerge: 1. Start with source snapshots: merge all freshSources[i].RefList() together using Merge(overrideMatching=true) 2. Layer explicit PackageRefs on top: only enter the package-loading loop if b.PackageRefs is non-empty, then merge the result into refList 3. Pass the combined refList to NewSnapshotFromRefList This means an empty snapshot (SourceSnapshots: [], PackageRefs: []) still correctly produces an empty ref list, single-source and multi-source snapshot-of-snapshot cases are now handled, and PackageRefs can still augment or override on top of the merged sources. ### system/t12_api/publish.py — fix PublishSwitchAPITestSnapshot - Removed the # FIXME comment - Changed check_not_exists → check_exists for pyspi-0.6.1-1.3.stripped.dsc after the publish-switch to snapshot2, which is now the correct expectation since snapshot2 inherits all packages from snapshot1
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #1570, #1467, #1449, #1376, #1305, #1196, #1163, #1125
Description of the Change
This fixes several race conditions when running API calls concurrently.
Checklist