123 | 123 | <div class="textblock"><p><a class="anchor" id="autotoc_md58"></a> MFC has been extensively benchmarked on CPUs and GPU devices. A summary of these results follows.</p> |
124 | 124 | <h1><a class="anchor" id="autotoc_md59"></a> |
125 | 125 | Expected time-steps/hour</h1> |
126 | | -<p>The following table outlines expected performance in terms of the number of time steps per hour (rounded to the nearest hundred) for various problem sizes (grid cells) and hardware for an inviscid, 6-equation (<code>'model_eqns' : 3</code>), 3D simulation. CPU results utilize an entire die.</p> |
| 126 | +<p>The following table outlines expected performance in terms of the number of time steps per hour (rounded to the nearest hundred; higher is better). A 3D, inviscid, 6-equation (<code>'model_eqns' : 3</code>) problem is solved for various problem sizes (grid cells) and hardware. CPU results utilize an entire processor die.</p> |
127 | 127 | <table class="markdownTable"> |
128 | 128 | <tr class="markdownTableHead"> |
129 | | -<th class="markdownTableHeadRight">Hardware </th><th class="markdownTableHeadCenter"># Ranks </th><th class="markdownTableHeadCenter">1M Cells </th><th class="markdownTableHeadCenter">4M Cells </th><th class="markdownTableHeadCenter">8M Cells </th><th class="markdownTableHeadCenter">Compiler </th><th class="markdownTableHeadLeft">Computer </th></tr> |
| 129 | +<th class="markdownTableHeadRight">Hardware </th><th class="markdownTableHeadCenter"># Cores </th><th class="markdownTableHeadCenter">Steps/Hr (1M pts) </th><th class="markdownTableHeadCenter">Steps/Hr (4M pts) </th><th class="markdownTableHeadCenter">Steps/Hr (8M pts) </th><th class="markdownTableHeadCenter">Compiler </th><th class="markdownTableHeadLeft">Computer </th></tr> |
130 | 130 | <tr class="markdownTableRowOdd"> |
131 | | -<td class="markdownTableBodyRight">NVIDIA V100 </td><td class="markdownTableBodyCenter">1 </td><td class="markdownTableBodyCenter">88.5k </td><td class="markdownTableBodyCenter">18.7k </td><td class="markdownTableBodyCenter">N/A </td><td class="markdownTableBodyCenter">NVHPC 22.11 </td><td class="markdownTableBodyLeft">PACE Phoenix </td></tr> |
| 131 | +<td class="markdownTableBodyRight">NVIDIA V100 </td><td class="markdownTableBodyCenter">1 (device) </td><td class="markdownTableBodyCenter">88.5k </td><td class="markdownTableBodyCenter">18.7k </td><td class="markdownTableBodyCenter">N/A </td><td class="markdownTableBodyCenter">NVHPC 22.11 </td><td class="markdownTableBodyLeft">PACE Phoenix </td></tr> |
132 | 132 | <tr class="markdownTableRowEven"> |
133 | | -<td class="markdownTableBodyRight">NVIDIA V100 </td><td class="markdownTableBodyCenter">1 </td><td class="markdownTableBodyCenter">78.8k </td><td class="markdownTableBodyCenter">18.8k </td><td class="markdownTableBodyCenter">N/A </td><td class="markdownTableBodyCenter">NVHPC 22.11 </td><td class="markdownTableBodyLeft">OLCF Summit </td></tr> |
| 133 | +<td class="markdownTableBodyRight">NVIDIA V100 </td><td class="markdownTableBodyCenter">1 (device) </td><td class="markdownTableBodyCenter">78.8k </td><td class="markdownTableBodyCenter">18.8k </td><td class="markdownTableBodyCenter">N/A </td><td class="markdownTableBodyCenter">NVHPC 22.11 </td><td class="markdownTableBodyLeft">OLCF Summit </td></tr> |
134 | 134 | <tr class="markdownTableRowOdd"> |
135 | | -<td class="markdownTableBodyRight">NVIDIA A100 </td><td class="markdownTableBodyCenter">1 </td><td class="markdownTableBodyCenter">114.4k </td><td class="markdownTableBodyCenter">34.6k </td><td class="markdownTableBodyCenter">16.5k </td><td class="markdownTableBodyCenter">NVHPC 23.5 </td><td class="markdownTableBodyLeft">Wingtip </td></tr> |
| 135 | +<td class="markdownTableBodyRight">NVIDIA A100 </td><td class="markdownTableBodyCenter">1 (device) </td><td class="markdownTableBodyCenter">114.4k </td><td class="markdownTableBodyCenter">34.6k </td><td class="markdownTableBodyCenter">16.5k </td><td class="markdownTableBodyCenter">NVHPC 23.5 </td><td class="markdownTableBodyLeft">Wingtip </td></tr> |
136 | 136 | <tr class="markdownTableRowEven"> |
137 | | -<td class="markdownTableBodyRight">AMD MI250X </td><td class="markdownTableBodyCenter">1 </td><td class="markdownTableBodyCenter">77.5k </td><td class="markdownTableBodyCenter">22.3k </td><td class="markdownTableBodyCenter">11.2k </td><td class="markdownTableBodyCenter">CCE 16.0.1 </td><td class="markdownTableBodyLeft">OLCF Frontier </td></tr> |
| 137 | +<td class="markdownTableBodyRight">AMD MI250X </td><td class="markdownTableBodyCenter">1 (GCD) </td><td class="markdownTableBodyCenter">77.5k </td><td class="markdownTableBodyCenter">22.3k </td><td class="markdownTableBodyCenter">11.2k </td><td class="markdownTableBodyCenter">CCE 16.0.1 </td><td class="markdownTableBodyLeft">OLCF Frontier </td></tr> |
138 | 138 | <tr class="markdownTableRowOdd"> |
139 | | -<td class="markdownTableBodyRight">Intel Xeon Gold 6226 </td><td class="markdownTableBodyCenter">12 </td><td class="markdownTableBodyCenter">2.5k </td><td class="markdownTableBodyCenter">0.7k </td><td class="markdownTableBodyCenter">0.4k </td><td class="markdownTableBodyCenter">GNU 10.3.0 </td><td class="markdownTableBodyLeft">PACE Phoenix </td></tr> |
| 139 | +<td class="markdownTableBodyRight">Intel Xeon Gold 6226 </td><td class="markdownTableBodyCenter">12 (cores) </td><td class="markdownTableBodyCenter">2.5k </td><td class="markdownTableBodyCenter">0.7k </td><td class="markdownTableBodyCenter">0.4k </td><td class="markdownTableBodyCenter">GNU 10.3.0 </td><td class="markdownTableBodyLeft">PACE Phoenix </td></tr> |
140 | 140 | <tr class="markdownTableRowEven"> |
141 | | -<td class="markdownTableBodyRight">Apple Silicon M2 </td><td class="markdownTableBodyCenter">6 </td><td class="markdownTableBodyCenter">2.8k </td><td class="markdownTableBodyCenter">0.6k </td><td class="markdownTableBodyCenter">0.2k </td><td class="markdownTableBodyCenter">GNU 13.2.0 </td><td class="markdownTableBodyLeft">N/A </td></tr> |
| 141 | +<td class="markdownTableBodyRight">Apple Silicon M2 </td><td class="markdownTableBodyCenter">6 (cores) </td><td class="markdownTableBodyCenter">2.8k </td><td class="markdownTableBodyCenter">0.6k </td><td class="markdownTableBodyCenter">0.2k </td><td class="markdownTableBodyCenter">GNU 13.2.0 </td><td class="markdownTableBodyLeft">N/A </td></tr> |
142 | 142 | </table> |
143 | | -<p>If <code>'model_eqns' : 3</code> is replaced by <code>'model_eqns' : 2</code>, an inviscid 5-equation model is used. The following table outlines expected performance in terms of the number of time-steps per hour (rounded to the nearest hundred) for various problem sizes and hardware for an inviscid, 5-equation, 3D simulation. CPU results utilize an entire die.</p> |
| 143 | +<p>If <code>'model_eqns' : 3</code> is replaced by <code>'model_eqns' : 2</code>, an inviscid 5-equation model is used. The table below shows expected performance for this 3D case as the number of time steps per hour (rounded to the nearest hundred) for various problem sizes and hardware. CPU results use an entire processor die.</p> |
144 | 144 | <table class="markdownTable"> |
145 | 145 | <tr class="markdownTableHead"> |
146 | | -<th class="markdownTableHeadRight">Hardware </th><th class="markdownTableHeadCenter"># Ranks </th><th class="markdownTableHeadCenter">1M Cells </th><th class="markdownTableHeadCenter">4M Cells </th><th class="markdownTableHeadCenter">8M Cells </th><th class="markdownTableHeadCenter">Compiler </th><th class="markdownTableHeadLeft">Computer </th></tr> |
| 146 | +<th class="markdownTableHeadRight">Hardware </th><th class="markdownTableHeadCenter"># Cores </th><th class="markdownTableHeadCenter">Steps/Hr (1M pts) </th><th class="markdownTableHeadCenter">Steps/Hr (4M pts) </th><th class="markdownTableHeadCenter">Steps/Hr (8M pts) </th><th class="markdownTableHeadCenter">Compiler </th><th class="markdownTableHeadLeft">Computer </th></tr> |
147 | 147 | <tr class="markdownTableRowOdd"> |
148 | | -<td class="markdownTableBodyRight">NVIDIA V100 </td><td class="markdownTableBodyCenter">1 </td><td class="markdownTableBodyCenter">113.4k </td><td class="markdownTableBodyCenter">26.2k </td><td class="markdownTableBodyCenter">13.0k </td><td class="markdownTableBodyCenter">NVHPC 22.11 </td><td class="markdownTableBodyLeft">PACE Phoenix </td></tr> |
| 148 | +<td class="markdownTableBodyRight">NVIDIA V100 </td><td class="markdownTableBodyCenter">1 (device) </td><td class="markdownTableBodyCenter">113.4k </td><td class="markdownTableBodyCenter">26.2k </td><td class="markdownTableBodyCenter">13.0k </td><td class="markdownTableBodyCenter">NVHPC 22.11 </td><td class="markdownTableBodyLeft">PACE Phoenix </td></tr> |
149 | 149 | <tr class="markdownTableRowEven"> |
150 | | -<td class="markdownTableBodyRight">NVIDIA V100 </td><td class="markdownTableBodyCenter">1 </td><td class="markdownTableBodyCenter">107.7k </td><td class="markdownTableBodyCenter">26.3k </td><td class="markdownTableBodyCenter">13.1k </td><td class="markdownTableBodyCenter">NVHPC 22.11 </td><td class="markdownTableBodyLeft">OLCF Summit </td></tr> |
| 150 | +<td class="markdownTableBodyRight">NVIDIA V100 </td><td class="markdownTableBodyCenter">1 (device) </td><td class="markdownTableBodyCenter">107.7k </td><td class="markdownTableBodyCenter">26.3k </td><td class="markdownTableBodyCenter">13.1k </td><td class="markdownTableBodyCenter">NVHPC 22.11 </td><td class="markdownTableBodyLeft">OLCF Summit </td></tr> |
151 | 151 | <tr class="markdownTableRowOdd"> |
152 | | -<td class="markdownTableBodyRight">NVIDIA A100 </td><td class="markdownTableBodyCenter">1 </td><td class="markdownTableBodyCenter">153.5k </td><td class="markdownTableBodyCenter">48.0k </td><td class="markdownTableBodyCenter">22.5k </td><td class="markdownTableBodyCenter">NVHPC 23.5 </td><td class="markdownTableBodyLeft">Wingtip </td></tr> |
| 152 | +<td class="markdownTableBodyRight">NVIDIA A100 </td><td class="markdownTableBodyCenter">1 (device) </td><td class="markdownTableBodyCenter">153.5k </td><td class="markdownTableBodyCenter">48.0k </td><td class="markdownTableBodyCenter">22.5k </td><td class="markdownTableBodyCenter">NVHPC 23.5 </td><td class="markdownTableBodyLeft">Wingtip </td></tr> |
153 | 153 | <tr class="markdownTableRowEven"> |
154 | | -<td class="markdownTableBodyRight">AMD MI250X </td><td class="markdownTableBodyCenter">1 </td><td class="markdownTableBodyCenter">104.2k </td><td class="markdownTableBodyCenter">31.0k </td><td class="markdownTableBodyCenter">14.8k </td><td class="markdownTableBodyCenter">CCE 16.0.1 </td><td class="markdownTableBodyLeft">OLCF Frontier </td></tr> |
| 154 | +<td class="markdownTableBodyRight">AMD MI250X </td><td class="markdownTableBodyCenter">1 (GCD) </td><td class="markdownTableBodyCenter">104.2k </td><td class="markdownTableBodyCenter">31.0k </td><td class="markdownTableBodyCenter">14.8k </td><td class="markdownTableBodyCenter">CCE 16.0.1 </td><td class="markdownTableBodyLeft">OLCF Frontier </td></tr> |
155 | 155 | <tr class="markdownTableRowOdd"> |
156 | | -<td class="markdownTableBodyRight">Intel Xeon Gold 6226 </td><td class="markdownTableBodyCenter">12 </td><td class="markdownTableBodyCenter">5.4k </td><td class="markdownTableBodyCenter">1.6k </td><td class="markdownTableBodyCenter">0.8k </td><td class="markdownTableBodyCenter">GNU 10.3.0 </td><td class="markdownTableBodyLeft">PACE Phoenix </td></tr> |
| 156 | +<td class="markdownTableBodyRight">Intel Xeon Gold 6226 </td><td class="markdownTableBodyCenter">12 (cores) </td><td class="markdownTableBodyCenter">5.4k </td><td class="markdownTableBodyCenter">1.6k </td><td class="markdownTableBodyCenter">0.8k </td><td class="markdownTableBodyCenter">GNU 10.3.0 </td><td class="markdownTableBodyLeft">PACE Phoenix </td></tr> |
157 | 157 | <tr class="markdownTableRowEven"> |
158 | | -<td class="markdownTableBodyRight">Apple Silicon M2 </td><td class="markdownTableBodyCenter">6 </td><td class="markdownTableBodyCenter">3.7k </td><td class="markdownTableBodyCenter">11.0k </td><td class="markdownTableBodyCenter">0.3k </td><td class="markdownTableBodyCenter">GNU 13.2.0 </td><td class="markdownTableBodyLeft">N/A </td></tr> |
| 158 | +<td class="markdownTableBodyRight">Apple Silicon M2 </td><td class="markdownTableBodyCenter">6 (cores) </td><td class="markdownTableBodyCenter">3.7k </td><td class="markdownTableBodyCenter">11.0k </td><td class="markdownTableBodyCenter">0.3k </td><td class="markdownTableBodyCenter">GNU 13.2.0 </td><td class="markdownTableBodyLeft">N/A </td></tr> |
159 | 159 | </table> |
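The steps-per-hour figures in the tables above can be turned into a rough wall-clock estimate for a planned run, or into a per-point cost that is easier to compare across problem sizes. The following is a minimal sketch of that arithmetic only; the function names `hours_for_steps` and `grind_time_ns` are hypothetical helpers for illustration and are not part of MFC, and the sample inputs are the A100, 1M-point, 6-equation entry from the first table.

```python
def hours_for_steps(n_steps: int, steps_per_hour: float) -> float:
    """Estimate wall-clock hours needed to advance n_steps time steps.

    Hypothetical helper (not an MFC function): it simply inverts a
    tabulated steps-per-hour figure.
    """
    if steps_per_hour <= 0:
        raise ValueError("steps_per_hour must be positive")
    return n_steps / steps_per_hour


def grind_time_ns(steps_per_hour: float, n_points: int) -> float:
    """Convert steps per hour into nanoseconds per grid point per time step.

    Hypothetical helper (not an MFC function); lower values are better.
    """
    if steps_per_hour <= 0 or n_points <= 0:
        raise ValueError("steps_per_hour and n_points must be positive")
    seconds_per_step = 3600.0 / steps_per_hour
    return seconds_per_step / n_points * 1e9


if __name__ == "__main__":
    # Tabulated value: NVIDIA A100, 1M points, 6-equation model, 114.4k steps/hr.
    a100_steps_per_hour = 114_400.0
    print(hours_for_steps(10_000, a100_steps_per_hour))
    print(grind_time_ns(a100_steps_per_hour, 1_000_000))
```

Because the tabulated values are rounded to the nearest hundred and depend on compiler and machine, estimates derived this way are approximate.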
160 | 160 | <h1><a class="anchor" id="autotoc_md60"></a> |
161 | 161 | Weak scaling</h1> |