ci(cuda): limit parallel jobs to avoid OOM during CUDA build #840
doringeman wants to merge 1 commit into docker:main from
Conversation
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Hey - I've left some high level feedback:
- Using `-j$(nproc --ignore=2)` assumes the base image's `nproc` supports `--ignore`; consider either verifying this for all target environments or using a more portable pattern (e.g., a fixed max or a small shell calculation) to avoid build-time failures.
Code Review
This pull request modifies `llamacpp/native/cuda.Dockerfile` to parallelize the build process using `nproc`. Feedback indicates that using `nproc` inside a container can lead to Out-Of-Memory (OOM) issues on high-core hosts because it reflects host resources rather than container limits. It is recommended to use a fixed, conservative job limit to ensure stability across different environments.
```diff
     -S ." > cmake-flags
 RUN cmake $(cat cmake-flags)
-RUN cmake --build build --config Release
+RUN cmake --build build --config Release -j$(nproc --ignore=2)
```
Using `nproc` inside a Docker container typically reports the host's CPU count rather than the container's allocated resources or available memory. On high-core hosts, `nproc --ignore=2` will still spawn a large number of parallel jobs, likely leading to the exact OOM issues this PR aims to avoid. To ensure build stability across various CI environments, it is safer to use a fixed, conservative limit (e.g., `-j 4`), consistent with the approach taken in `llamacpp/native/generic.Dockerfile`.
```dockerfile
RUN cmake --build build --config Release -j 4
```
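If a dynamic job count is still preferred, the "small shell calculation" mentioned in the review could be sketched portably, without relying on the `--ignore` flag. This is an illustrative sketch only; the cap of 4 is an assumption chosen to match the fixed limit suggested above:

```shell
# Sketch: cap the parallel job count at a conservative maximum (4 here)
# without using the non-portable `nproc --ignore` flag.
JOBS=$(nproc 2>/dev/null || echo 1)   # fall back to 1 if nproc is unavailable
if [ "$JOBS" -gt 4 ]; then JOBS=4; fi
echo "Building with -j$JOBS"
```

The trade-off is predictability: a hard-coded `-j 4` behaves identically on every runner, while the calculation above adapts to small hosts but still needs a cap to stay safe on large ones.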
References
- Pragmatism: The solution should match the complexity of the problem and avoid failure modes like OOM on large hosts. A fixed limit is more robust in containerized environments. (link)
No description provided.