Skip to content

Cleanup#1578

Closed
neolynx wants to merge 12 commits into
fix/task-racefrom
cleanup
Closed

Cleanup#1578
neolynx wants to merge 12 commits into
fix/task-racefrom
cleanup

Conversation

@neolynx

@neolynx neolynx commented Jun 7, 2026

Copy link
Copy Markdown
Member

Description of the Change

cleanup docs, Makefile

neolynx added 12 commits May 23, 2026 13:54
apiPublishRepoOrSnapshot appended published.Key() to resources inside
the task closure, after maybeRunTaskInBackground had already been called.
The task's locked-resource set is fixed at submission time, so that append
had no effect — the published repo key was never registered as a resource.

Two concurrent POST /api/publish/{prefix} requests for the same
prefix/distribution therefore did not conflict in the task queue: both
ran in parallel, each loaded an empty PublishedRepoCollection from the DB,
both passed CheckDuplicate, and the second Add silently overwrote the first.

Fix: compute the published repo key ("U{storagePrefix}>>{distribution}")
from the already-known storage/prefix/distribution values and append it to
resources before calling maybeRunTaskInBackground, so concurrent creates
for the same destination are serialised by the task queue.  The now-dead
append inside the closure is removed.
This fixes a flaw in async apis, which loaded the published repo from the
DB and mutated it outside the task closure, before the task lock was acquired.

* perform collection.LoadComplete inside maybeRunTaskInBackground
* have tasks use a fresh copy of taskCollectionFactory, taskCollection
…be pre-registered

When b.Distribution is empty, the pre-registered resource key
U<storage>:<prefix>>><distribution> cannot be constructed, so
concurrent POST requests to the same prefix are not serialized by the
task queue.  Add a log warning so operators are aware of the gap.
Concurrent tasks were not properly locking their resources, leading to
inconsistent published indexes:

  Task A: apiPublishUpdateSwitch loads published, reads source repo/snapshot
  Request B: modifies same source repo or snapshot (add/remove packages, etc)
  Task A: Update() + Publish() reads stale/modified source -> inconsistent
           published index, or partial write if source deleted mid-task.

This changes introduces resource locking for publishing:
* SourceLocalRepo: iterate published.Sources (component -> source UUID), look up each local repo via localRepoCollection.ByUUID and append
string(repo.Key()) to resources
* SourceSnapshot: iterate b.Snapshots,look up each snapshot via snapshotCollection.ByName and append
string(snapshot.ResourceKey()) to resources.
This fixes a flaw in async apis, which loaded data from the
DB and mutated it outside the task closure, before the task lock was acquired.

    * perform collection.LoadComplete inside maybeRunTaskInBackground
    * have tasks use a fresh copy of taskCollectionFactory, taskCollection
This fixes a flaw in async apis, which loaded data from the
DB and mutated it outside the task closure, before the task lock was acquired.

 * perform collection.LoadComplete inside maybeRunTaskInBackground
 * have tasks use a fresh copy of taskCollectionFactory, taskCollection
This fixes a flaw in async apis, which loaded data from the
DB and mutated it outside the task closure, before the task lock was acquired.

 * perform collection.LoadComplete inside maybeRunTaskInBackground
 * have tasks use a fresh copy of taskCollectionFactory, taskCollection
Race condition iexisted where task State, err, and processReturnValue fields
were written by consumer goroutine and read by concurrent accessors without
proper synchronization, causing torn reads and data races.

Implemented single-lock model with optimal lock scope:

- Removed 8 accessor methods (direct field access is simpler)
- Lock only during brief state transitions (IDLE→RUNNING, RUNNING→SUCCEEDED/FAILED)
- Release lock during task.process() execution to enable full concurrency
- Readers hold list.Lock() only during atomic struct copy
- Moved State = RUNNING before goroutine spawn for clearer semantics
- task/list.go: RunTaskInBackground() copies *task before unlock,
    returns the pre-made copy instead of dereferencing after unlock
* revert mutex on LinkFromPool
* add tests
@codecov

codecov Bot commented Jun 7, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 75.62077% with 108 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.31%. Comparing base (8dc61cf) to head (8d6215f).
⚠️ Report is 11 commits behind head on fix/task-race.

Files with missing lines Patch % Lines
api/publish.go 74.03% 32 Missing and 22 partials ⚠️
api/repos.go 74.48% 14 Missing and 11 partials ⚠️
api/snapshot.go 79.01% 9 Missing and 8 partials ⚠️
api/mirror.go 70.00% 6 Missing and 3 partials ⚠️
task/list.go 87.50% 3 Missing ⚠️
Additional details and impacted files
@@                Coverage Diff                @@
##           fix/task-race    #1578      +/-   ##
=================================================
+ Coverage          77.22%   77.31%   +0.09%     
=================================================
  Files                161      161              
  Lines              15085    15243     +158     
=================================================
+ Hits               11649    11785     +136     
+ Misses              2292     2291       -1     
- Partials            1144     1167      +23     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant