Automatically select apropriate parallel slice#3094
Automatically select apropriate parallel slice#3094dschwoerer wants to merge 25 commits intofci-auto-with-debug-higher-orderfrom
Conversation
Useful for Field3D with parallel slices, so we can write `f[i.yp()]` instead of `f.yup()[i.yp()]`
This is needed wo we can both write `f[i.yp()]` while allowing old code to call `f.yup()[i.yp()]` as well.
Interpolation methods might need that, but in general it is unlike to be of use.
The tests are probably not meant to work with parallel slices.
|
This is really neat, but my major worry is having any conditionals in |
|
Adding a new type would require a lot of new code. Do you have some benchmarks that suggest that this a significant overhead? |
|
We did a lot of benchmarking when we first made |
|
Do you have a suggestion for a performance test case? |
|
I just realised with this and can be simplified to if which we probably have already :-) |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
| } | ||
|
|
||
| inline SpecificInd xp(int dx = 1) const { return {ind + (dx * ny * nz), ny, nz}; } | ||
| inline SpecificInd xp(int dx = 1) const { |
There was a problem hiding this comment.
warning: function 'xp' has inline specifier but is implicitly inlined [readability-redundant-inline-specifier]
| inline SpecificInd xp(int dx = 1) const { | |
| SpecificInd xp(int dx = 1) const { |
| return {ind + (dy * nz), ny, nz}; | ||
| return {ind + (dy * nz), ny, nz, yoffset + dy}; | ||
| } | ||
| inline SpecificInd yp_no_parallel_shift(int dy = 1) const { |
There was a problem hiding this comment.
warning: function 'yp_no_parallel_shift' has inline specifier but is implicitly inlined [readability-redundant-inline-specifier]
| inline SpecificInd yp_no_parallel_shift(int dy = 1) const { | |
| SpecificInd yp_no_parallel_shift(int dy = 1) const { |
| // Add region for parallel slices | ||
| for (int i = 1; i <= ystart; ++i) { | ||
| for (const int offset : {-i, i}) { | ||
| const auto region = fmt::format("RGN_YPAR_{:+d}", offset); |
There was a problem hiding this comment.
warning: no header providing "fmt::format" is directly included [misc-include-cleaner]
src/mesh/mesh.cxx:12:
- #include "impls/bout/boutmesh.hxx"
+ #include "fmt/core.h"
+ #include "impls/bout/boutmesh.hxx"| #include "bout/assert.hxx" | ||
| #include "bout/bout_types.hxx" | ||
| #include "bout/boutexception.hxx" | ||
| #include "bout/field.hxx" |
There was a problem hiding this comment.
warning: included header boutexception.hxx is not used directly [misc-include-cleaner]
| #include "bout/field.hxx" | |
| #include "bout/field.hxx" |
include/bout/field3d.hxx
Outdated
| ASSERT2(d.yoffset == yoffset); | ||
| } | ||
| } | ||
| return data[d.ind]; |
There was a problem hiding this comment.
warning: expected member name or ';' after declaration specifiers [clang-diagnostic-error]
return data[d.ind];
^| DirectionTypes directions_in = {YDirectionType::Standard, | ||
| ZDirectionType::Average}, | ||
| std::optional<size_t> region = {}); | ||
| std::optional<size_t> region = {}, int yoffset = 0); |
There was a problem hiding this comment.
warning: no header providing "size_t" is directly included [misc-include-cleaner]
include/bout/field2d.hxx:26:
- class Field2D;
+ #include <cstddef>
+ class Field2D;| DirectionTypes directions_in = {YDirectionType::Standard, | ||
| ZDirectionType::Average}, | ||
| std::optional<size_t> region = {}); | ||
| std::optional<size_t> region = {}, int yoffset = 0); |
There was a problem hiding this comment.
warning: no header providing "std::optional" is directly included [misc-include-cleaner]
include/bout/field2d.hxx:26:
- class Field2D;
+ #include <optional>
+ class Field2D;| DirectionTypes directions_in = {YDirectionType::Standard, | ||
| ZDirectionType::Standard}, | ||
| std::optional<size_t> regionID = {}); | ||
| std::optional<size_t> regionID = {}, int yoffset = 0); |
There was a problem hiding this comment.
warning: no header providing "size_t" is directly included [misc-include-cleaner]
include/bout/field3d.hxx:22:
- class Field3D;
+ #include <cstddef>
+ class Field3D;| return data[d.ind]; | ||
| } | ||
| const BoutReal& operator[](const Ind3D& d) const { | ||
| return (*const_cast<Field3D*>(this))[d]; |
There was a problem hiding this comment.
warning: do not use const_cast to remove const qualifier [cppcoreguidelines-pro-type-const-cast]
return (*const_cast<Field3D*>(this))[d];
^| Options* track(const BoutReal& change, std::string operation); | ||
|
|
||
| public: | ||
| int yoffset{0}; |
There was a problem hiding this comment.
warning: member variable 'yoffset' has public visibility [cppcoreguidelines-non-private-member-variables-in-classes]
int yoffset{0};
^|
|
||
| FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in, | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID)) | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID), |
There was a problem hiding this comment.
warning: no header providing "UNUSED" is directly included [misc-include-cleaner]
src/field/fieldperp.cxx:25:
- #include <bout/boutcomm.hxx>
+ #include "bout/unused.hxx"
+ #include <bout/boutcomm.hxx>|
|
||
| FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in, | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID)) | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID), |
There was a problem hiding this comment.
warning: no header providing "size_t" is directly included [misc-include-cleaner]
src/field/fieldperp.cxx:29:
+ #include <cstddef>|
|
||
| FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in, | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID)) | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID), |
There was a problem hiding this comment.
warning: no header providing "std::optional" is directly included [misc-include-cleaner]
src/field/fieldperp.cxx:29:
+ #include <optional>| { | ||
| auto indc = ind; | ||
| indc.yoffset = 1; | ||
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = |
There was a problem hiding this comment.
warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = | |
| data[(stripe_size * ind.ind) + static_cast<int>(Offset::Byup)] = coords->Bxy.yup()[indc]; |
| indc.yoffset = 1; | ||
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = | ||
| coords->Bxy.yup()[indc]; | ||
| indc.yoffset = -1; |
There was a problem hiding this comment.
warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]
| indc.yoffset = -1; | |
| data[(stripe_size * ind.ind) + static_cast<int>(Offset::Bydown)] = |
|
|
||
| FieldPerp::FieldPerp(Mesh* localmesh, CELL_LOC location_in, int yindex_in, | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID)) | ||
| DirectionTypes directions, std::optional<size_t> UNUSED(regionID), |
There was a problem hiding this comment.
warning: no header providing "DirectionTypes" is directly included [misc-include-cleaner]
DirectionTypes directions, std::optional<size_t> UNUSED(regionID),
^| { | ||
| auto indc = ind; | ||
| indc.yoffset = 1; | ||
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = |
There was a problem hiding this comment.
warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = | |
| data[(stripe_size * ind.ind) + static_cast<int>(Offset::Byup)] = |
| data[stripe_size * ind.ind + static_cast<int>(Offset::Byup)] = | ||
| coords->Bxy.yup()[indc]; | ||
| indc.yoffset = -1; | ||
| data[stripe_size * ind.ind + static_cast<int>(Offset::Bydown)] = |
There was a problem hiding this comment.
warning: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses]
| data[stripe_size * ind.ind + static_cast<int>(Offset::Bydown)] = | |
| data[(stripe_size * ind.ind) + static_cast<int>(Offset::Bydown)] = |
|
Profiled with hyperfine: Unit tests in Release with CHECKS=0:
Unit tests in Debug with CHECKS=2:
test-drift-instability integration test in Release with CHECKS=0:
test-drift-instability integration test in Debug with CHECKS=2:
|
|
Amazing, thanks @tomc271! So it seems there is quite some overhead (8%) without checks. Looking at the absolute numbers for I am slightly worried about the 8% - that is probably not great. It would make the code however more readable, which also a benefit for non-FCI users. Maybe we could make make a compile time switch to enable parallel slices, so that users that do not use it, can keep it disabled. However, then we either cannot start using this feature and clean up the code, or we are kind of forced to stay with it, even if we cannot come up with a faster implementation ... |
|
|
||
| BoutReal& operator[](const Ind3D& d) { return data[d.ind]; } | ||
| const BoutReal& operator[](const Ind3D& d) const { return data[d.ind]; } | ||
| BoutReal& operator[](const Ind3D& d) { |
There was a problem hiding this comment.
It has been a long time since I've looked at this code but could this be return ynext(d.yoffset - yoffset)[d]? One could then still include some of the consistency checking but it would move some of the logic into CHECK regions that disappear in production runs?
There was a problem hiding this comment.
return ynext(d.yoffset - yoffset)[d] would always call ynext - even if this already the right field, and thus would fail.
The issue is that both the main field, which contains parallel fields, as well as the parallel fields, are of type Field3D.
We could (together with PR #3149) only enable this for Field3DParallel. There might be subtle bugs, where it depends on the type (Field3DParallel/Field3D) on whether it uses the correct indexing. But we could add those with higher CHECK level, so that should be fine.
| return ynext(d.yoffset)[d]; | ||
| } | ||
| #if CHECK >= 2 | ||
| if (isFci()) { // We probably should assert here that this is field aligned |
There was a problem hiding this comment.
Could this be if ( ! hasParallelSlices() && isFci()) and placed before the return line so the return doesn't need to be guarded?
There was a problem hiding this comment.
I think it would still need to be guarded - because for FA there are no parallel slices, and thus we need to use the return at the end of the function.
This allows to replace
f.yup()[i.yp()]withf[i.yp()].That should allow to simplify some code, and will make more code fci-correct.