feat(build): optimize max_workers based on dependency graph parallelism #881
base: main
Conversation
Calculate the optimal max_workers by analyzing the dependency graph to find the maximum number of packages that can be built in parallel. Use the smaller of the CPU-based default and the graph-based maximum to avoid allocating idle worker threads.

Closes python-wheel-build#880

Co-Authored-By: Claude <claude@anthropic.com>
Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com>
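The idea, in rough form (a sketch only; `max_parallelism`, `optimal_workers`, and the shape of the dependency map are illustrative assumptions, not the PR's actual code):

```python
# Sketch: assume `graph` maps each package name to the set of package names
# it depends on. Names here are illustrative, not taken from the PR.
import os
from typing import Mapping, Set


def max_parallelism(graph: Mapping[str, Set[str]]) -> int:
    """Widest topological 'level', i.e. the most packages buildable at once."""
    remaining = dict(graph)
    widest = 0
    while remaining:
        # Packages whose remaining dependencies have all been built already.
        ready = [pkg for pkg, deps in remaining.items() if deps.isdisjoint(remaining)]
        if not ready:
            raise ValueError("dependency cycle detected")
        widest = max(widest, len(ready))
        for pkg in ready:
            del remaining[pkg]
    return widest


def optimal_workers(graph: Mapping[str, Set[str]]) -> int:
    # Same formula ThreadPoolExecutor uses for its default max_workers.
    cpu_default = min(32, (os.cpu_count() or 1) + 4)
    return min(cpu_default, max_parallelism(graph))
```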
    cpu_default = min(32, (os.cpu_count() or 1) + 4)
    optimal_workers = min(cpu_default, max_parallelism)
    logger.info(
        "graph allows max %i parallel builds, using %i workers (cpu default: %i)",
I don't know that I would understand this log message if I was reading it without having seen the code. Could you log each value separately with a short description before going through this if statement, then log the actual selected worker pool size being returned? Something like the messages below, for example.
You could also add messages like "batch size from graph exceeds CPU count" to help with debugging, but just having each number logged separately would make it easier to understand (a rough sketch follows the list below).
"batch size from graph: %d"
"CPU count: %d"
"minimum pool size: %d" (from cpu_default)
"user requested pool size: %d"
"optimal worker pool size: %d"
    Analyzes the dependency graph to determine the maximum number of packages
    that can be built in parallel at any point. Uses this to optimize the
    number of worker threads, avoiding wasteful allocation of idle workers.
I'm not 100% sure I understand this change. I think you're saying the batch size from the graph should be factored in because there's no point in setting up a worker pool larger than the number of jobs we would try to run at one time. Is that it?
@tiran had an idea at one point to continuously add tasks to the pool as packages finished building. I don't know if we ever implemented that. If we did, we might want more workers than the maximum batch size, because finishing one item in a batch can unlock items that this logic would place in a later batch.
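For context, that scheduling style would look roughly like the sketch below; `build_package` and the shape of `graph` are hypothetical, not code from this repository. With this approach the useful ceiling is the graph's overall maximum parallelism rather than the size of any single batch.

```python
# Rough sketch of "submit as soon as dependencies finish" scheduling,
# rather than building in fixed batches. Names are hypothetical.
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def build_all(graph, build_package, max_workers):
    remaining = {pkg: set(deps) for pkg, deps in graph.items()}
    done = set()
    in_flight = {}  # future -> package name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or in_flight:
            # Submit every package whose dependencies have all finished.
            ready = [pkg for pkg, deps in remaining.items() if deps <= done]
            if not ready and not in_flight:
                raise ValueError("dependency cycle detected")
            for pkg in ready:
                del remaining[pkg]
                in_flight[pool.submit(build_package, pkg)] = pkg
            finished, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in finished:
                future.result()  # re-raise build failures
                done.add(in_flight.pop(future))
```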
> I think you're saying the batch size from the graph should be factored in because there's no point in setting up a worker pool larger than the number of jobs we would try to run at one time. Is that it?

Yes, if we can only run 2 parallel builds (given the shape of the graph) but we set 6 workers, the extra workers are not useful. It is not a critical issue, just a code improvement.
Does Python's worker pool implementation actually create all of the workers based on the size passed in, or does it just limit the number of workers to the size given?
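For reference, in CPython, ThreadPoolExecutor spawns worker threads lazily as tasks are submitted, never more than max_workers, so an oversized limit does not by itself create idle threads. A minimal way to observe this (peeking at the private `_threads` attribute purely for inspection):

```python
# Observe lazy thread creation in ThreadPoolExecutor (CPython).
# `_threads` is a private attribute, used here only to inspect the pool.
import time
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=32) as pool:
    print(len(pool._threads))   # 0: no worker threads spawned yet
    futures = [pool.submit(time.sleep, 0.2) for _ in range(4)]
    print(len(pool._threads))   # 4: one thread per busy task, not 32
    for f in futures:
        f.result()
```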