Skip to content

LocalStore: Implement addMultipleToStore()#480

Merged
edolstra merged 10 commits into
mainfrom
eelcodolstra/nix-408
Jun 5, 2026
Merged

LocalStore: Implement addMultipleToStore()#480
edolstra merged 10 commits into
mainfrom
eelcodolstra/nix-408

Conversation

@edolstra
Copy link
Copy Markdown
Collaborator

@edolstra edolstra commented Jun 2, 2026

Motivation

This speeds up nix copy into a local store.

Previously copyPaths() relied on the default addMultipleToStore(), which walks the closure in topological order calling addToStore() per path. This means that we can't start copying a NAR into the store until all its dependencies have been added, in order to preserve the closure variant. This may waste available network/CPU capacity.

So now we extract the NARs into the store first before registering the store paths as valid. Specifically:

  1. We acquire locks on all store paths in sorted order (to avoid deadlocks).

  2. We extract the NARs in order of descending NAR size (to minimize the makespan).

  3. We register all new paths as valid in a single transaction.

This speeds up

# nix copy --from https://cache.nixos.org --to /tmp/nix --no-check-sigs /nix/store/8xyk1qxxjfb8cm62g61yc59phwha92w7-kdenlive-25.08.3

from ~8.8s to ~5.4s. (Of course this depends a lot on network speed, I/O, CPU etc.)

Note: currently this doesn't benefit copying into a local store via the Nix daemon. (See RemoteStore::addMultipleToStore() which sends NARs in topological order.) It also doesn't yet benefit substitution, since it currently substitutes one path at a time. (It should just rely on copyPaths() to fetch entire missing closures.)

Context

Summary by CodeRabbit

  • Refactor
    • Add bulk parallel installs for faster multi-path additions.
    • Consolidated restore and validation flow for more reliable installs and reuse of already-unpacked content when safe.
  • Bug Fixes
    • Cleanup of stale temporary unpack markers to prevent reuse of corrupted content.
    • Stronger ownership, locking and signature checks to reduce invalid installs.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 37608690-1d23-42c6-b133-124defc42dcf

📥 Commits

Reviewing files that changed from the base of the PR and between 13bc576 and 3d720de.

📒 Files selected for processing (1)
  • src/libstore/local-store.cc
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/libstore/local-store.cc

📝 Walkthrough

Walkthrough

Adds LocalStore::addMultipleToStore() for parallel import of multiple NARs, introduces doAddToStore() to centralize per-path restore/verification/canonicalise/optimise/fsync, refactors addToStore() to call the helper, and makes GC remove stale .unpacked markers before deleting directories.

Changes

Bulk add-to-store with parallel extraction

Layer / File(s) Summary
Public interface and doAddToStore contract
src/libstore/include/nix/store/local-store.hh
Declares the addMultipleToStore() override and private doAddToStore() helper plus the .unpacked marker helper.
Store helper and ownership checks
src/libstore/local-store.cc
Adds threading/includes and an ownership-check helper; implements doAddToStore() to restore NARs with NAR/CA verification, canonicalisation, optimisation, and optional syncing.
Refactor addToStore to use helper
src/libstore/local-store.cc
Refactors addToStore() to delegate restore/verification to doAddToStore().
Parallel multi-path bulk import
src/libstore/local-store.cc
Implements addMultipleToStore() with optional signature validation, staged planning, lock acquisition, parallel NAR extraction via ThreadPool (tasks sorted by NAR size), conditional reuse of .unpacked content, clearing info.ultimate, single-step registerValidPaths(), and .unpacked marker cleanup.
GC unpacked-marker removal
src/libstore/gc.cc
Ensure collectGarbage() removes leftover .unpacked markers before deleting store directories.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant LocalStore
  participant ThreadPool
  participant Source
  participant Registrar

  Client->>LocalStore: addMultipleToStore(PathsSource, Activity, RepairFlag, CheckSigsFlag)
  LocalStore->>LocalStore: validate signatures (optional), plan work, lock targets
  LocalStore->>ThreadPool: submit doAddToStore tasks (sorted by NAR size)
  ThreadPool->>Source: extract/restore NAR for path
  Source-->>ThreadPool: restored contents + NAR hash/size
  ThreadPool->>LocalStore: doAddToStore completes (canonicalise/optimise/fsync)
  ThreadPool-->>LocalStore: signal task complete
  LocalStore->>Registrar: registerValidPaths(all new paths)
  LocalStore->>LocalStore: delete `.unpacked` markers
  LocalStore-->>Client: return
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • cole-h

Poem

🐰 I hopped through NARs with threads in tow,
Restoring each path in a speedy flow.
Hashes checked neat,
Markers cleared from the street,
Now the store hums happy — off we go!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'LocalStore: Implement addMultipleToStore()' directly and concisely describes the main change—adding a new method implementation to LocalStore—which aligns with the PR's primary objective.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch eelcodolstra/nix-408

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Infer (1.2.0)
src/libstore/local-store.cc

src/libstore/local-store.cc:1:10: fatal error: 'nix/store/local-store.hh' file not found
1 | #include "nix/store/local-store.hh"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
src/libstore/local-store.cc:1028:5-12: ERROR translating statement 'ReturnStmt'
Aborting translation of method 'nix::existsAndIsOwnedBySelf' in file 'src/libstore/local-store.cc': "Assert_failure src/clang/cAst_utils.ml:249:53"
Uncaught Internal Error: "Assert_failure src/clang/cAst_utils.ml:249:53"
Error backtrace:
Raised at ClangFrontend__CAst_utils.get_decl_from_typ_ptr in file "src/clang/cAst_utils.ml", line 249, characters 53-65
Called from ClangFrontend__CTrans.CTrans_funct.get_destructor_decl_ref in file "src/clang/cTrans.ml", line 658, characters 12-59
Called from ClangFrontend__CTrans.CTrans_funct.destructor_calls.(fun) in file "src/clang/cTrans.ml", line 2048, characters 12-69
Called from Base__List.rev_filter_map.loop in file "src/list.ml", line 944, characters 13-17
Called from Bas

... [truncated 2200 characters] ...

n file "src/clang/cTrans.ml", line 4784, characters 10-1023
Called from ClangFrontend__CTrans.CTrans_funct.instruction in file "src/clang/cTrans.ml" (inlined), line 4765, characters 38-71
Called from ClangFrontend__CTrans.CTrans_funct.exec_with_node_creation in file "src/clang/cTrans.ml" (inlined), line 104, characters 20-38
Called from ClangFrontend__CTrans.CTrans_funct.get_clang_stmt_trans in file "src/clang/cTrans.ml" (inlined), line 5395, characters 4-69
Called from ClangFrontend__CTrans.CTrans_funct.get_custom_stmt_trans in file "src/clang/cTrans.ml", line 5401, characters 8-55
Called from ClangFrontend__CTrans.CTrans_funct.exec_trans_instrs.exec_trans_instrs_rev in file "src/clang/cTrans.ml" (inlined), line 5365, characters 28-54
Called from ClangFrontend__CTrans.CTrans_funct.exec_tr


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

@github-actions github-actions Bot temporarily deployed to pull request June 2, 2026 19:56 Inactive
edolstra and others added 4 commits June 3, 2026 18:48
The `cleanup` mentioned was removed in f0143f5.
This speeds up `nix copy` into a local store.

Previously copyPaths() relied on the default addMultipleToStore(),
which walks the closure in topological order calling addToStore() per
path. This means that we can't start copying a NAR into the store
until all its dependencies have been added, in order to preserve the
closure variant. This may waste available network/CPU capacity.

So now we extract the NARs into the store first before registering the
store paths as valid. Specifically:

1. We acquire locks on all store paths in sorted order (to avoid
   deadlocks).

2. We extract the NARs in order of descending NAR size (to minimize
   the makespan).

3. We register all new paths as valid in a single transaction.

Note: currently this doesn't benefit copying into a local store via
the Nix daemon. (See RemoteStore::addMultipleToStore() which sends
NARs in topological order.) It also doesn't yet benefit substitution,
since it currently substitutes one path at a time. (It should just
rely on copyPaths() to fetch entire missing closures.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The NAR-extraction body (delete, restore, verify NAR hash/size, verify
CA, autoGC, canonicalise, optimise, fsync) was duplicated between
addToStore() and addMultipleToStore(). Extract it into a private
doAddToStore() helper that both call; it writes and verifies a path's
contents but does not take a lock or register validity, which remains
the caller's responsibility.

No functional change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@edolstra edolstra force-pushed the eelcodolstra/nix-408 branch from bebbb70 to 7df8aab Compare June 3, 2026 16:49
@github-actions github-actions Bot temporarily deployed to pull request June 3, 2026 16:54 Inactive
Previously, uf addMultipleToStore() is interrupted, all NARs extracted
so far were lost, since path validity is only registered in a single
transaction at the very end.

To avoid this, doAddToStore() now creates a file `<realPath>.unpacked`
to mark that <realPath> contains a completely unpacked NAR. If that
file exists, doAddToStore() can skip fetching.  The markers are
removed once the paths are registered as valid.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@edolstra edolstra marked this pull request as ready for review June 3, 2026 20:49
@github-actions github-actions Bot temporarily deployed to pull request June 3, 2026 20:55 Inactive
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/libstore/local-store.cc`:
- Around line 1037-1045: After deletePath(realPath) ensure you invoke autoGC()
before unpacking/writing the NAR so GC can free space; specifically, call
autoGC() (the same mechanism used by addToStoreFromDump) before creating
HashSink/TeeSource and before calling restorePath(realPath, wrapperSource,
localSettings.fsyncStorePaths) so the GC check runs prior to the write and
prevents ENOSPC during restorePath.
- Around line 1170-1219: The code currently collects duplicates from pathsToCopy
into maybeToWrite/toWrite, allowing multiple workers for the same store path;
before moving items into maybeToWrite (or before building toWrite) deduplicate
by the canonical StorePath/real path: track a std::set<std::string> (or
std::set<std::filesystem::path>) of the key (e.g. printStorePath(info.path) or
toRealPath(info.path)) and only push a PathWithSource* into maybeToWrite/toWrite
the first time that key is seen; ensure pathsToLock insertion uses the same
canonical key so lock coverage matches queued work (references: pathsToCopy,
PathWithSource, maybeToWrite, pathsToLock, toWrite, toRealPath).
- Around line 1024-1026: The resume-trust check wrongly rejects files based on
group ownership; in the non-Windows block inside existsAndIsOwnedBySelf (or the
function performing the st->st_uid/st->st_gid checks) remove the st_gid
comparison so the condition becomes only check st->st_uid != geteuid(), i.e.
drop the "|| st->st_gid != getegid()" part; leave the surrounding `#ifndef` _WIN32
and the uid check intact so setgid-managed directories (build-users) won't cause
false negatives for .unpacked markers.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ffa60980-e2a1-4f7c-bd98-009efaee57ff

📥 Commits

Reviewing files that changed from the base of the PR and between d3e4a6e and a4b946d.

📒 Files selected for processing (2)
  • src/libstore/include/nix/store/local-store.hh
  • src/libstore/local-store.cc

Comment thread src/libstore/local-store.cc
Comment thread src/libstore/local-store.cc
Comment thread src/libstore/local-store.cc
edolstra and others added 4 commits June 3, 2026 23:13
When collecting garbage, delete a path's `.unpacked` marker before
deleting the path itself. Deleting a directory is not atomic, so without
this ordering an interrupted GC could leave the marker next to a
partially-deleted path, causing addMultipleToStore() to later reuse the
corrupt path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The `<realPath>.unpacked` marker filename was constructed in three
places (the addMultipleToStore() reuse check and cleanup, and the GC
deletion). Factor it into a single LocalStore::unpackedMarkerFor()
helper.

No functional change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
On some OSes the group could be inherited from the store directory, so
it might be set to the build-users group, which is fine.
In doAddToStore(), call autoGC() right after deleting any pre-existing
path and before restorePath(), rather than after the NAR has already
been written. This lets the garbage collector free up space before the
write, preventing ENOSPC during restorePath(). This matches the
ordering already used by addToStoreFromDump().

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions github-actions Bot temporarily deployed to pull request June 3, 2026 21:52 Inactive
Comment on lines +1260 to +1267
if (!repair && existsAndIsOwnedBySelf(unpackedMarker) && existsAndIsOwnedBySelf(realPath)) {
notice("reusing previously unpacked path at '%s'", printStorePath(info.path));
act.setExpected(actCopyPath, bytesExpected -= info.narSize);
} else {
doAddToStore(info, *item->second, repair);
/* The path has been unpacked, so mark it as such. */
writeFile(unpackedMarker, "");
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if instead of using "file exists" as the sole validation check, we should record the fingerprint (and whatever other stuff), to ensure that we got the same bytes, NAR hash, ca (if applicable), etc? And then we decide validity based on "existence of file AND file contents (fingerprint) matches what we'd expect)"?

And then, in this "else" branch (i.e. "marker wasn't valid by having wrong contents or not existing"), we'd unconditionally remove the marker file (if it existed) and keep doing the thing.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a lot more expensive though. And it shouldn't be needed since nobody except GC can modify the store path.

Comment thread src/libstore/local-store.cc Outdated
@edolstra edolstra requested a review from cole-h June 4, 2026 19:49
@github-actions github-actions Bot temporarily deployed to pull request June 4, 2026 19:53 Inactive
@edolstra edolstra added this pull request to the merge queue Jun 5, 2026
Merged via the queue into main with commit fe36456 Jun 5, 2026
29 checks passed
@edolstra edolstra deleted the eelcodolstra/nix-408 branch June 5, 2026 16:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants