Reorganize dependency installation for better squashing #1523

develra · 2026-01-24T01:37:48Z

I'll leave it up to y'all to decide if the changes/risks here are worth the reduction in image size. Thanks!

Reduced image size

Metric	Original	New	Reduction
Image Size	60.1 GB	48.2 GB	11.9 GB (20%)
Filesystem Size	49 GB	44 GB	5 GB (10%)

Note: Image size includes all layers; filesystem size is the actual disk usage inside the container.
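The reduction figures in the table can be reproduced from the reported sizes. A minimal sketch using only the numbers above (the helper name `reduction` is illustrative, not part of this PR):

```python
def reduction(original_gb: float, new_gb: float) -> tuple:
    """Return (absolute reduction in GB, reduction as a percentage of the original)."""
    saved = original_gb - new_gb
    return saved, 100.0 * saved / original_gb


# Sizes taken from the table in this description.
for label, original, new in [
    ("Image Size", 60.1, 48.2),
    ("Filesystem Size", 49.0, 44.0),
]:
    saved, pct = reduction(original, new)
    print(f"{label}: {saved:.1f} GB ({pct:.0f}%)")
```

The percentages are relative to the original size and rounded to the nearest whole percent.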

Added --no-cache to uv pip install (Safe)
The uv cache is only useful for repeated installs in the same environment. In a Docker build each install step runs once in a fresh layer, so the cache provides no benefit and only takes up space in the layer.
Removed Intel MKL numpy (Less sure)

Removed the Intel MKL numpy install from Intel's Anaconda channel. Intel's channel only has numpy 1.26.4 (numpy 1.x), but the base image has numpy 2.0.2. Installing Intel's numpy would downgrade numpy and break packages compiled against the numpy 2.x ABI.

The base image's numpy 2.0.2 uses OpenBLAS optimizations and is compatible with all installed packages.

Removed preprocessing package (Less sure)
The package is unmaintained (last release in 2017) and requires nltk==3.2.4, which is incompatible with Python 3.11 because inspect.formatargspec was removed. It therefore cannot function on Python 3.11.
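The removal of `inspect.formatargspec` can be confirmed directly on the interpreter in the image. A small sketch (the function name `has_formatargspec` is hypothetical):

```python
import inspect
import sys


def has_formatargspec() -> bool:
    """True if this interpreter still provides inspect.formatargspec."""
    return hasattr(inspect, "formatargspec")


# On Python 3.11 and later this is expected to report False, since the
# function was removed in 3.11; any code that calls it fails with an
# AttributeError.
print(sys.version_info[:2], "inspect.formatargspec present:", has_formatargspec())
```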
Updated scikit-learn to 1.5.2 (Less sure)
Changed from scikit-learn==1.2.2 to scikit-learn==1.5.2. scikit-learn 1.2.2 binary wheels are incompatible with numpy 2.x ABI, causing "numpy.dtype size changed" errors. scikit-learn 1.5.x maintains API compatibility with 1.2.x. The original pin was for eli5/learntools compatibility, which should work with 1.5.x.
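One way to smoke-test this change in the built image is to report the installed versions and then import scikit-learn, since a binary mismatch typically surfaces when its compiled modules are imported. This is a sketch under that assumption; the helper names `installed_version` and `major_minor` are illustrative:

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version: str) -> tuple:
    """Parse the leading 'X.Y' of a version string into a tuple of ints."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


for name in ("numpy", "scikit-learn"):
    print(name, installed_version(name))

np_version = installed_version("numpy")
if np_version is not None:
    print("numpy major version:", major_minor(np_version)[0])

# Reading metadata alone does not load any compiled code, so also try the import.
try:
    import sklearn  # noqa: F401
    print("sklearn import OK")
except Exception as exc:  # ImportError or ValueError, depending on the failure
    print("sklearn import failed:", exc)
```

A clean import does not prove that eli5 or learntools work with 1.5.x; those would need their own checks.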
Added uv cache cleanup to clean-layer.sh (Safe)
Added /root/.cache/uv/* to the cleanup script. Previously the script cleaned only the pip cache, not the uv cache. The cleanup script runs after package installs, and the cache is not needed at runtime.
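To check that the cleanup takes effect, the remaining cache size can be measured inside the built image. A minimal sketch: `/root/.cache/uv` comes from this PR, while `/root/.cache/pip` is an assumed location for the pip cache and `dir_size_bytes` is a hypothetical helper:

```python
import os


def dir_size_bytes(path: str) -> int:
    """Total size of regular files under path; 0 if the path does not exist."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


# After clean-layer.sh has run, both of these should be at or near zero.
for cache_dir in ("/root/.cache/pip", "/root/.cache/uv"):
    print(cache_dir, dir_size_bytes(cache_dir), "bytes")
```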


develra requested review from calderjo and djherbis January 24, 2026 01:37

develra force-pushed the optimize-layers-for-better-squashing branch from 9773e95 to 8abc702 Compare January 24, 2026 01:48
