Scale H5 pipeline to 50 workers at 1 CPU each #684
Conversation
The H5 build work is single-threaded numpy, so 4 CPUs per worker was wasted capacity, and 8 workers meant each processed ~60 items serially.

- Increase default workers from 8 to 50 (~10 items per worker)
- Drop worker CPU from 4 to 1 (saves 75% of per-worker CPU cost)
- Add max_containers=50 as a safety cap
- Wall-clock time drops from ~60 min to ~12 min
- Total CPU cost drops: 8 workers × 60 min × 4 CPUs → 50 workers × 12 min × 1 CPU

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
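A quick back-of-envelope check of the cost claim in the commit message (numbers taken directly from the PR description; this is just the arithmetic, not project code):

```python
# Total CPU cost before and after, in CPU-minutes.
old_cost = 8 * 60 * 4   # 8 workers x 60 min x 4 CPUs = 1920 CPU-minutes
new_cost = 50 * 12 * 1  # 50 workers x 12 min x 1 CPU =  600 CPU-minutes

savings = 1 - new_cost / old_cost
print(old_cost, new_cost, f"{savings:.0%} cheaper")
```

Note the 75% figure in the bullet above is the per-worker CPU rate drop (4 → 1); the total-cost savings, after accounting for more workers and shorter wall-clock time, works out to roughly 69%.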
Hey Anthony, nice change — the 4 CPU → 1 CPU drop makes sense for single-threaded numpy work, and parallelizing across 50 workers should cut wall-clock time significantly. Two things to flag: 1. Minor:
@baogorek figures, that double-pipeline issue is the same one we hit live yesterday in a meeting with @juaristi22, when the docs build ran twice. Meeting with you soon; we'll catch up and decide how to handle it.
Update: the pipeline just failed on the post-merge run. Two issues to address in this PR or a companion: 1.
Fixes #683
Summary
- `max_containers=50` safety cap on the Modal function decorator
- Updated `coordinate_publish`, `run_pipeline`, CLI entrypoints, and the `pipeline.yaml` workflow

Test plan
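For reference, a minimal sketch of what the decorator change looks like. The `cpu` and `max_containers` parameters come from the PR itself; the app name and function name are hypothetical placeholders, and this is a configuration sketch, not the repo's actual code:

```python
import modal

app = modal.App("h5-pipeline")  # app name is an assumption

@app.function(
    cpu=1.0,            # down from 4: the numpy build work is single-threaded
    max_containers=50,  # safety cap so fan-out cannot exceed 50 containers
)
def build_h5_item(item):
    # single-threaded numpy H5 build work for one item
    ...
```

With ~500 items fanned out via `.map()` across 50 containers, each worker handles ~10 items, matching the ~12 min wall-clock estimate above.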
🤖 Generated with Claude Code