Skip to content

feat: add --external-only flag to cargo chef prepare#360

Open
josecelano wants to merge 8 commits into
LukeMathWalker:mainfrom
josecelano:359-external-only-flag
Open

feat: add --external-only flag to cargo chef prepare#360
josecelano wants to merge 8 commits into
LukeMathWalker:mainfrom
josecelano:359-external-only-flag

Conversation

@josecelano
Copy link
Copy Markdown

Fixes #359


Problem

In a large Cargo workspace the standard cargo-chef pattern has an unfortunate property: the cook layer is invalidated whenever any workspace manifest changes — adding a new internal crate, splitting one, adding a [[bin]] target, renaming a crate, etc. — even when no external (third-party) dependency changed at all.

The root cause is that recipe.json captures both external deps and the intra-workspace path = "..." dependencies. Any structural change produces a different JSON file, which Docker treats as a changed input and invalidates the entire RUN cargo chef cook layer, forcing a full recompile of all third-party crates.

In a real-world workspace with 26 internal crates (torrust-tracker) the cook cache is effectively always cold: every feature branch that touches any manifest recompiles hundreds of MB of third-party crates from source.

Related issues: #314, #75, #181, #305.


Solution

Add --external-only to cargo chef prepare:

# Stable layer — only invalidated when an external dep actually changes
cargo chef prepare --external-only --recipe-path recipe-thirdparty.json
cargo chef cook   --release       --recipe-path recipe-thirdparty.json

# Fast layer — on top of the stable layer, only stubs are (re)compiled
cargo chef prepare --recipe-path recipe.json
cargo chef cook   --release       --recipe-path recipe.json

The flag strips all intra-workspace path = "..." dependencies from the recipe before serialisation, producing a stable "third-party only" recipe that only changes when an external dependency is added, removed, or updated.


What was changed

File Change
src/skeleton/external_only.rs New module: strip_path_deps() and strip_local_packages_from_lock() operating on toml::Value (7 unit tests)
src/skeleton/mod.rs New public method Skeleton::strip_path_deps(&mut self)
src/recipe.rs external_only: bool parameter added to Recipe::prepare()
src/main.rs --external-only flag added to the Prepare subcommand
tests/skeletons/tests/external_only.rs 3 integration tests
tests/recipe.rs Updated call to Recipe::prepare with new parameter
CHANGELOG.md Entry under [Unreleased]

How it works

  1. Strips path-based dependency entries from every manifest using toml::Value (structural, not regex). Handles: top-level dep sections, [target.'cfg(...)'.dependencies], and [workspace.dependencies].

  2. Removes local (no-source) package entries from the lock file. External packages always have source = "registry+...". Removing local entries makes the recipe immune to workspace-membership changes.

  3. Handles workspace-inherited path deps ({ workspace = true }, Rust 1.64+) via a two-pass approach: first collect the names of [workspace.dependencies] entries with path =, then remove both the workspace declaration and any member { workspace = true } references to those names.

Breaking API change

Recipe::prepare is a pub function re-exported from lib.rs. Adding the positional external_only: bool parameter is technically a breaking change for any crate that calls it directly. cargo-chef has no documented API stability promise so this has zero real-world impact, but it is worth noting. If the API is stabilised in the future, a PrepareOptions struct would be the right approach to absorb future flags without further breakage.


Example

examples/workspace-split-cache/ contains a minimal 3-crate workspace (app → logic → model) that reproduces the problem and demonstrates both solutions:

  • Dockerfile — workaround using strip_path_deps.py (works with stock cargo-chef today)
  • Dockerfile.native — clean version using --external-only
  • README.md — step-by-step guide with measured timings

Measured improvement: cold third-party cook ≈ 5.8 s; subsequent full-recipe cook with warm third-party layer ≈ 0.07 s (83× faster).


Test summary

cargo test -- external_only

running 10 tests
test tests::external_only::external_only_strips_local_packages_from_lock_file ... ok
test tests::external_only::external_only_strips_path_deps_from_manifests ... ok
test tests::external_only::external_only_stable_across_workspace_internal_change ... ok
test skeleton::external_only::tests::keeps_external_deps_untouched ... ok
test skeleton::external_only::tests::strips_path_dep_from_dependencies ... ok
test skeleton::external_only::tests::strips_path_dep_from_target_specific_dependencies ... ok
test skeleton::external_only::tests::strips_path_dep_from_workspace_dependencies ... ok
test skeleton::external_only::tests::strips_workspace_true_refs_to_workspace_path_deps ... ok
test skeleton::external_only::tests::strips_all_path_deps_leaving_valid_manifest ... ok
test skeleton::external_only::tests::strips_local_packages_from_lock_file ... ok

test result: ok. 10 passed

The external_only_stable_across_workspace_internal_change integration test directly verifies the core property: two skeletons that differ only in their internal path deps produce identical recipes after strip_path_deps().

…ird-party cache split problem

Adds examples/workspace-split-cache/ — a minimal 3-crate workspace
(app → logic → model) that reproduces the issue where any structural
change to workspace manifests invalidates the entire cargo-chef cook
layer, forcing a full recompile of all third-party dependencies.

Includes:
- Workspace and three crates (app, logic, model)
- Dockerfile: workaround using strip_path_deps.py (works with stock cargo-chef)
- Dockerfile.native: clean version using the --external-only flag
- strip_path_deps.py: stdlib-only Python post-processor for the workaround
- README.md: step-by-step guide with real command output and timings

Measured improvement: cold third-party cook = 5.8s; subsequent full
recipe cook with warm third-party layer = 0.07s (83x faster).

Closes LukeMathWalker#359
Adds a new --external-only flag that strips all intra-workspace
path = "..." dependencies from the recipe before serialisation. The
resulting recipe is stable across any workspace-internal change (adding
or removing crates, renaming binaries, etc.) and only changes when an
external dependency is added or updated.

How it works:
1. Strips every [dependencies.*] entry that carries a path field from
   all manifests — top-level sections, [target.'cfg(...)'.dependencies],
   and [workspace.dependencies].
2. Two-pass approach for workspace-inherited deps: collects names of
   [workspace.dependencies] entries with path = first, then also removes
   { workspace = true } references to those names from member manifests.
3. Removes local (no-source) [[package]] entries from the lock file so
   workspace-membership changes don't bust the recipe either.

Typical Dockerfile usage:

  # Stable layer — only invalidated when external deps change
  RUN cargo chef prepare --external-only --recipe-path recipe-thirdparty.json
  RUN cargo chef cook --release --recipe-path recipe-thirdparty.json

  # Fast layer — invalidated on manifest changes, third-party already warm
  RUN cargo chef prepare --recipe-path recipe.json
  RUN cargo chef cook --release --recipe-path recipe.json

Files changed:
- src/skeleton/external_only.rs  (new module, 7 unit tests)
- src/skeleton/mod.rs            (new Skeleton::strip_path_deps method)
- src/recipe.rs                  (external_only param on Recipe::prepare)
- src/main.rs                    (--external-only CLI flag on Prepare)
- tests/skeletons/tests/external_only.rs  (3 integration tests)
- tests/recipe.rs                (updated call site)
- CHANGELOG.md

Note: adding external_only: bool to Recipe::prepare is a breaking change
for library users. cargo-chef has no documented API stability promise so
this has zero real-world impact.

Fixes LukeMathWalker#359
Copilot AI review requested due to automatic review settings June 5, 2026 06:24
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new --external-only mode for cargo chef prepare to generate a “third-party only” recipe by stripping path-based dependencies from manifests and removing local packages from the lockfile, improving Docker cache stability for large workspaces.

Changes:

  • Add --external-only CLI flag and plumb it through Recipe::prepare to post-process the derived Skeleton.
  • Implement path-deps + local-lock-entry stripping in src/skeleton/external_only.rs and expose it via Skeleton::strip_path_deps.
  • Add integration/unit tests and a new example showing the two-layer Docker caching strategy.

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/skeletons/tests/mod.rs Registers the new external_only integration test module.
tests/skeletons/tests/external_only.rs Adds integration tests for manifest/lock stripping and stability guarantees.
tests/recipe.rs Updates test helper to match the new Recipe::prepare signature.
src/skeleton/mod.rs Exposes Skeleton::strip_path_deps() and wires in the new module.
src/skeleton/external_only.rs Implements TOML-aware stripping for manifests + lockfile and adds unit tests.
src/recipe.rs Adds external_only: bool to Recipe::prepare and applies stripping.
src/main.rs Adds --external-only flag to the CLI and passes it to recipe preparation.
examples/workspace-split-cache/* Adds a full example (Dockerfiles + sample workspace + Python script) documenting the approach.
CHANGELOG.md Documents the new flag under Unreleased.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/recipe.rs Outdated
Comment on lines 54 to 64
pub fn prepare(
base_path: PathBuf,
member: Option<String>,
external_only: bool,
) -> Result<Self, anyhow::Error> {
let mut skeleton = Skeleton::derive(base_path, member)?;
if external_only {
skeleton.strip_path_deps();
}
Ok(Recipe { skeleton })
}
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit e13c398.

Introduced PrepareOptions { member: Option<String>, external_only: bool } with #[derive(Default)]. Recipe::prepare now takes PrepareOptions instead of two positional parameters. This keeps call sites readable and lets new options be added in the future without any signature breakage.

PrepareOptions is re-exported from lib.rs so library users can construct it. All internal call sites (src/main.rs, tests/recipe.rs) updated.

Comment thread src/skeleton/mod.rs Outdated
Comment on lines +84 to +85
/// Remove all intra-workspace `path = "…"` dependencies from every manifest and
/// all local (no-`source`) entries from the lock file.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit d8d249f.

Chose to update the docs to match the implementation rather than restricting the logic. Determining at runtime whether a path points "inside" the workspace or to an out-of-tree crate would require resolving the path relative to each manifest, which adds complexity and edge cases for no practical benefit — users invoking --external-only want to strip all local path deps.

Updated all three doc locations:

  • src/skeleton/external_only.rs module doc
  • src/skeleton/mod.rs strip_path_deps doc comment
  • src/main.rs CLI --external-only flag description

Also added a unit test strips_out_of_workspace_path_dep verifying that an absolute path dep (path = "/opt/some-other-project") is stripped as expected.

Comment thread src/skeleton/mod.rs Outdated
Comment on lines +94 to +96
pub fn strip_path_deps(&mut self) {
external_only::strip_path_deps(&mut self.manifests, &mut self.lock_file);
}
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit d8d249f (same as the other thread on this file — both comments address the same doc/impl mismatch).

Chose to update the docs to match the implementation rather than restricting the logic. All doc strings now say "all path = \"…\" dependencies" instead of "intra-workspace path dependencies". A unit test strips_out_of_workspace_path_dep was added to confirm the broader behaviour.

Comment on lines +277 to +290
// Assert: after stripping, both skeletons are identical (same manifests)
assert_eq!(
skeleton_a.manifests.len(),
skeleton_b.manifests.len(),
"manifest count should be equal"
);
for (ma, mb) in skeleton_a.manifests.iter().zip(skeleton_b.manifests.iter()) {
assert_eq!(
ma.contents,
mb.contents,
"manifest {} should be identical after strip_path_deps",
ma.relative_path.display()
);
}
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 66873fd.

Both skeleton_a.manifests and skeleton_b.manifests are now sorted by relative_path before the zip, so the comparison is always between the same manifest in both skeletons regardless of filesystem traversal order.

Comment on lines +174 to +183
assert!(
!lock_file.contains("\"app\""),
"local 'app' should be removed from lock:\n{}",
lock_file,
);
assert!(
!lock_file.contains("\"model\""),
"local 'model' should be removed from lock:\n{}",
lock_file,
);
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit a60ba5b.

Added toml as a dev-dependency and replaced all substring checks with proper TOML parsing:

  1. Deserialise the stripped lock file with toml::from_str.
  2. Collect the name fields of every [[package]] entry into a Vec<&str>.
  3. Assert against the vec — no substring ambiguity, and failure messages now print the full list of package names.

The test also serves as an implicit check that strip_local_packages_from_lock produces valid TOML, since the parse would fail otherwise.

…ons struct

Adding external_only: bool as a positional parameter makes the API fragile —
every future flag requires another positional bool and breaks existing callers.

Introduce PrepareOptions { member, external_only } with #[derive(Default)].
This keeps the call site readable and allows new options to be added without
further signature breakage.

Update all call sites (src/main.rs, tests/recipe.rs) and re-export
PrepareOptions from lib.rs so library users can construct it.
…ra-workspace

The implementation removes every dependency entry that has a 'path' field,
regardless of whether the path points to a workspace member or to an
out-of-tree crate. The old docs said 'intra-workspace path dependencies'
which was inaccurate.

Update:
- src/skeleton/external_only.rs module doc
- src/skeleton/mod.rs strip_path_deps doc comment
- src/main.rs --external-only CLI flag description

Add a new unit test 'strips_out_of_workspace_path_dep' that verifies an
absolute path dep (e.g. path = "/opt/some-other-project") is also removed.
skeleton.manifests ordering depends on filesystem traversal order, which
is not guaranteed to be deterministic.  Zipping two unsorted vecs could
compare manifests from different packages against each other, making the
test flaky without any stripping logic being broken.

Sort both manifest slices by relative_path before the zip so the
comparison is always between the same manifest in both skeletons.
…test

The previous assertions used substring matching (lock_file.contains("\"app\""))
which can produce false positives if another package name or dependency
string happens to contain the same substring.

Add toml as a dev-dependency and replace the substring checks with proper
TOML parsing: deserialise the stripped lock file, collect the 'name' fields
of every [[package]] entry, and assert against the resulting Vec<&str>.
This is both more robust and produces clearer failure messages.
@josecelano josecelano requested a review from Copilot June 5, 2026 06:50
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 22 out of 23 changed files in this pull request and generated 4 comments.

Comment thread src/skeleton/external_only.rs Outdated
Comment on lines +106 to +107
let mut value: toml::Value = toml::from_str(contents)
.expect("cargo-chef produced invalid TOML for a manifest — this is a bug");
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Both strip_from_manifest_contents and strip_local_packages_from_lock now return anyhow::Result<String> instead of calling .expect(). TOML parse and serialisation failures are converted to anyhow::Error via .context(...) and propagated up through strip_path_depsSkeleton::strip_path_deps (now -> anyhow::Result<()>) → Recipe::prepare (already Result, uses ?).

Three new unit tests cover the error paths:

  • returns_error_on_invalid_manifest_toml — verifies Err is returned (not a panic) for malformed manifest TOML
  • returns_error_on_invalid_lock_file_toml — same for the lock file
  • strip_path_deps_propagates_manifest_parse_error — verifies the public entry-point propagates manifest errors correctly

Comment on lines +162 to +166
let mut lock: toml::Value = toml::from_str(lock_str)
.expect("cargo-chef produced invalid TOML for the lock file — this is a bug");
if let Some(packages) = lock.get_mut("package").and_then(|p| p.as_array_mut()) {
// Keep only packages that have a `source` field — those are external.
packages.retain(|pkg| pkg.get("source").is_some());
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed alongside the manifest case (same commit). strip_local_packages_from_lock now returns anyhow::Result<String> — the .expect() on both toml::from_str and toml::to_string are replaced with ? and .context(...). The error surfaces through strip_path_depsSkeleton::strip_path_depsRecipe::prepare rather than panicking.

A dedicated unit test returns_error_on_invalid_lock_file_toml confirms malformed lock file TOML yields Err instead of a panic.

Comment thread src/skeleton/mod.rs Outdated
Comment on lines +98 to +101
/// See [`external_only`] for details.
pub fn strip_path_deps(&mut self) {
external_only::strip_path_deps(&mut self.manifests, &mut self.lock_file);
}
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Removed the /// See [\external_only`] for details.line from theSkeleton::strip_path_depsdoc comment. The privatemod external_only` is an implementation detail — the relevant behaviour is already fully described in the preceding doc comment paragraphs, so no public link is needed.

Comment thread CHANGELOG.md Outdated

### Added

- Add `--external-only` flag to `cargo chef prepare` to strip intra-workspace `path = "..."` dependencies from the recipe, enabling a stable Docker cache layer for third-party dependencies that is not invalidated by workspace-internal changes ([#359](https://github.com/LukeMathWalker/cargo-chef/issues/359))
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. The changelog entry now reads:

Add --external-only flag to cargo chef prepare to strip all path = "..." dependency entries from the recipe (including intra-workspace path deps and any other local path deps), enabling a stable Docker cache layer for third-party dependencies that is not invalidated by workspace-internal changes

The original "intra-workspace" wording was inaccurate — the implementation strips all entries with a path field, not just those pointing at workspace members.

- Make strip_from_manifest_contents and strip_local_packages_from_lock
  return anyhow::Result instead of calling .expect(), eliminating panics
  on malformed TOML input
- Propagate errors through strip_path_deps, Skeleton::strip_path_deps,
  and Recipe::prepare (all already fallible or now updated)
- Add three new unit tests for the error cases:
  returns_error_on_invalid_manifest_toml,
  returns_error_on_invalid_lock_file_toml,
  strip_path_deps_propagates_manifest_parse_error
- Remove broken intra-doc link to private module 'external_only' in
  Skeleton::strip_path_deps doc comment
- Fix CHANGELOG wording from 'intra-workspace path deps' to
  'all path = "..." dependency entries (including intra-workspace)'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add --external-only flag to cargo chef prepare to split third-party and workspace-internal cache layers

2 participants