
ci(cuda): limit parallel jobs to avoid OOM during CUDA build#840

Closed
doringeman wants to merge 1 commit into docker:main from doringeman:llamacpp-cuda-build

Conversation

@doringeman
Contributor

No description provided.

Signed-off-by: Dorin Geman <dorin.geman@docker.com>
@doringeman doringeman closed this Apr 6, 2026
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've left some high-level feedback:

  • Using -j$(nproc --ignore=2) assumes the base image’s nproc supports --ignore; consider either verifying this for all target environments or using a more portable pattern (e.g., a fixed max or a small shell calculation) to avoid build-time failures.
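A portable version of the job-count calculation the reviewer suggests might look like the sketch below. It is a minimal example, not code from this PR: the fallback of 4 jobs is an arbitrary conservative default, and it assumes a POSIX shell where unsupported `nproc` options cause a non-zero exit.

```shell
# Compute a parallel-job count that works even when `nproc` is missing
# or its `--ignore` flag is unsupported (e.g. minimal/BusyBox images).
# The fallback value of 4 is an assumed conservative default.
JOBS=$(nproc --ignore=2 2>/dev/null || nproc 2>/dev/null || echo 4)
# Guard against empty or non-numeric output.
case "$JOBS" in ''|*[!0-9]*) JOBS=4 ;; esac
echo "cmake --build build --config Release -j${JOBS}"
```

This keeps the Dockerfile working on base images whose `nproc` predates or lacks `--ignore`, while still scaling with core count where the flag is available.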


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request modifies `llamacpp/native/cuda.Dockerfile` to parallelize the build using `nproc`. Feedback indicates that using `nproc` inside a container can lead to out-of-memory (OOM) failures on high-core hosts, because it reflects host resources rather than container limits. A fixed, conservative job limit is recommended to ensure stability across different environments.

```diff
 -S ." > cmake-flags
 RUN cmake $(cat cmake-flags)
-RUN cmake --build build --config Release
+RUN cmake --build build --config Release -j$(nproc --ignore=2)
```
Contributor


critical

Using `nproc` inside a Docker container typically reports the host's CPU count rather than the container's allocated resources or available memory. On high-core hosts, `nproc --ignore=2` will still spawn a large number of parallel jobs, likely leading to the exact OOM issues this PR aims to avoid. To ensure build stability across various CI environments, it is safer to use a fixed, conservative limit (e.g., `-j 4`), consistent with the approach taken in `llamacpp/native/generic.Dockerfile`.

```dockerfile
RUN cmake --build build --config Release -j 4
```
References
  1. Pragmatism: The solution should match the complexity of the problem and avoid failure modes like OOM on large hosts. A fixed limit is more robust in containerized environments.
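A middle ground between a fixed `-j 4` and raw `nproc` is to cap jobs by both CPU count and available memory. The sketch below is hypothetical and assumes roughly 4 GB of RAM per CUDA compile job and a Linux `/proc/meminfo` with a `MemAvailable` field; neither figure comes from this PR.

```shell
# Sketch: cap parallel jobs by both CPU count and available memory.
# Assumes ~4 GB RAM per nvcc job -- a rough heuristic, not a measured value.
CPUS=$(nproc 2>/dev/null || echo 4)
MEM_KB=$(awk '/MemAvailable/ {print $2}' /proc/meminfo 2>/dev/null || echo 16777216)
[ -n "$MEM_KB" ] || MEM_KB=16777216           # fallback: assume 16 GB
MEM_JOBS=$((MEM_KB / 4194304))                # 4 GB per job, expressed in kB
JOBS=$CPUS
[ "$MEM_JOBS" -lt "$JOBS" ] && JOBS=$MEM_JOBS
[ "$JOBS" -lt 1 ] && JOBS=1
echo "cmake --build build --config Release -j${JOBS}"
```

Note that `MemAvailable` still reports host memory, not cgroup limits, so inside a memory-limited container a fixed cap remains the simplest robust choice.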

