2 changes: 1 addition & 1 deletion llamacpp/native/cuda.Dockerfile
@@ -37,7 +37,7 @@ RUN echo "-B build \
-GNinja \
-S ." > cmake-flags
RUN cmake $(cat cmake-flags)
-RUN cmake --build build --config Release
+RUN cmake --build build --config Release -j$(nproc --ignore=2)
Contributor

critical

nproc run inside a Docker container typically reports the host's CPU count, not the container's CPU quota or available memory. On a high-core host, nproc --ignore=2 will therefore still spawn a large number of parallel compile jobs, likely triggering the exact OOM failures this PR aims to avoid. To keep builds stable across CI environments, it is safer to use a fixed, conservative limit (e.g., -j 4), consistent with the approach taken in llamacpp/native/generic.Dockerfile.

RUN cmake --build build --config Release -j 4
References
  1. Pragmatism: The solution should match the complexity of the problem and avoid failure modes like OOM on large hosts. A fixed limit is more robust in containerized environments. (link)
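
A possible middle ground (a sketch, not part of this PR or the reviewer's suggestion) is to take nproc's answer but cap it at a fixed ceiling, so small hosts still build serially or lightly parallel while large hosts stay bounded; the cap of 4 below mirrors the reviewer's suggested limit:

```shell
# Sketch only: cap build parallelism at 4 jobs regardless of host size.
# nproc --ignore=2 subtracts two cores from the count but never reports
# fewer than 1, so JOBS always ends up between 1 and 4.
JOBS=$(nproc --ignore=2)
if [ "$JOBS" -gt 4 ]; then JOBS=4; fi
echo "$JOBS"
```

In the Dockerfile this could be folded into the build step, e.g. RUN JOBS=$(nproc --ignore=2); [ "$JOBS" -gt 4 ] && JOBS=4; cmake --build build --config Release -j "$JOBS", though the fixed -j 4 the reviewer proposes is simpler and achieves the same bound.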

RUN cmake --install build --config Release --prefix install

RUN rm install/bin/*.py