
Severe negative MPI scaling with the latest dev branch #68

@gthyagi

Description

Summary

Recent timing results from stokes_timing.txt show that the BH92 no-fault benchmark suffers severe performance degradation as the number of MPI ranks increases.


Additional Note

I believe this issue is not specific to the BH92 setup and can likely be replicated with any Stokes solver configuration.

In this case, I was running linear solves only (no nonlinear iterations), so the scaling degradation appears to occur at the linear solver / MPI / preconditioner level rather than due to nonlinear solve complexity.


The measured PETSc Time (sec) (Max column) is:

| ncpus | PETSc Time (sec, Max) |
|------:|----------------------:|
| 1     | 35.91                 |
| 2     | 32.02                 |
| 4     | 55.96                 |
| 6     | 509.5                 |
| 8     | 1397                  |
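For reference, the speedup and parallel efficiency implied by these measurements can be computed directly (a small sketch; the timings are exactly the values listed above):

```python
# Speedup and parallel efficiency from the measured max PETSc times
# (ncpus -> seconds), as reported in this issue.
times = {1: 35.91, 2: 32.02, 4: 55.96, 6: 509.5, 8: 1397.0}

t1 = times[1]
for p in sorted(times):
    speedup = t1 / times[p]      # ideal would be ~p
    efficiency = speedup / p     # ideal would be ~1.0
    print(f"ncpus={p}: speedup={speedup:.3f}, efficiency={efficiency:.3f}")
```

At 8 ranks the speedup is well below 1 (i.e. slower than serial), which is the clearest sign that this is a scaling pathology rather than ordinary communication overhead.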

Expected Behaviour

For this problem size, we expect:

  • Monotonic decrease in runtime with increasing MPI ranks, or
  • Mild saturation due to communication overhead
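The expectation above can be made quantitative with a simple Amdahl-style model (a sketch; the serial fraction `s = 0.10` is an illustrative assumption, not a measured value):

```python
def amdahl_time(t1, p, s=0.10):
    """Amdahl's-law runtime model: serial fraction s plus parallelizable remainder."""
    return t1 * (s + (1 - s) / p)

# Using the measured single-rank time, even a generous 10% serial fraction
# predicts a monotone decrease in runtime, never an increase.
for p in (1, 2, 4, 6, 8):
    print(f"ncpus={p}: predicted {amdahl_time(35.91, p):.2f} s")
```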

Observed Behaviour

  • Runtime increases significantly beyond 2 MPI ranks
  • Severe scaling breakdown between 4 → 6 → 8 ranks
  • ~40× slowdown at 8 ranks compared to 1 rank (1397 s vs. 35.91 s)

This suggests a parallel scaling or MPI-related performance issue.
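One way to localize the problem (a sketch; `stokes_model.py` is a placeholder for the actual benchmark driver, and `-bind-to` assumes MPICH's Hydra launcher): PETSc's standard `-log_view` option prints per-event timing, so comparing events such as `MatMult`, `VecScatter`, and `PCApply` across rank counts shows whether the blow-up comes from communication or from preconditioner setup/apply. Ranks being oversubscribed onto the same cores is another common cause of this signature, which explicit core binding rules out.

```shell
# Enable PETSc per-event profiling and explicit core binding.
# stokes_model.py stands in for the actual driver script used here.
mpiexec -n 8 -bind-to core python stokes_model.py \
    -log_view -ksp_converged_reason
```

If `-log_view` shows `PCSetUp`/`PCApply` dominating at higher rank counts, the preconditioner (rather than raw MPI traffic) is the first suspect.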


Environment

  • Underworld3 branch: development (local working branch with write_timestep / XDMF updates)
  • Python: 3.12 (amr-dev pixi environment)
  • MPI: MPICH 4.3.2 (ch4:ofi)
  • PETSc (from logs): 3.24.5
  • Date observed: March 3, 2026

Timing Plot

*(timing plot image attached to the issue)*
