Replace xtensor with internal Tensor/View classes#3805

Merged
paulromano merged 53 commits intoopenmc-dev:developfrom
jtramm:no_xtensor_claude
Feb 17, 2026

Conversation

@jtramm
Contributor

@jtramm jtramm commented Feb 12, 2026

Motivation

OpenMC currently depends on the xtensor and xtl header-only libraries for multi-dimensional array support. While xtensor was originally a reliable and stable header library, the dependency has recently caused a number of ongoing build issues (particularly with LLVM, e.g. #3183). It has also undergone some drastic API changes (xtensor-stack/xtensor#2829) that break compatibility and cause ongoing headaches. This is made worse because CMake sometimes finds other versions of xtensor already present in a user's environment that conflict with the newest API changes, so the pain continues even after we update to the new API. An additional problem is that xtensor is not GPU-friendly, so leaving it in place would cause problems as we work on adding GPU offloading capabilities to the main branch of OpenMC.

For the above reasons, many have wanted to remove the xtensor dependency and replace it with our own "home cooked" version. Notably, OpenMC is only a relatively pedestrian user of xtensor and doesn't utilize all its features, so our own implementation can be much simpler than xtensor. With AI coding tools (I used Claude Code CLI with Opus 4.6), this type of refactoring task is quite straightforward, so it was a lot more approachable than it has been to date.

Strategy

There were several ways to approach the removal. The options I considered:

  1. A direct like-for-like replacement of the xtensor API, such that essentially nothing in the OpenMC .cpp files needed to change except which header to include. This worked, but the resulting class had to carry forward a lot of heavyweight abstractions that we likely didn't need for our use cases and that bloated the class significantly. If you're interested in how this looks, you can check out earlier commits on this branch, as this was the route I first tried.

  2. A set of lighter weight tensor and view classes to accomplish the basic things that OpenMC needed without going overboard. Capabilities like broadcasting are dropped in favor of a few extra for loops here and there in the .cpp files, with the savings that the tensor classes can be much simpler.

  3. (Suggested by Paul Romano) There is a community proposal to add a std::mdarray to the C++ standard (perhaps for C++29?) that would serve as the container, along with the C++23 std::mdspan class used as the view. I haven't tried coding this route up, but it might be nice since in the long term this is the interface C++ developers would be used to. The downside is that it's not formalized yet (and even once accepted, it would be a long time until C++ compilers actually implement it). Additionally, the C++ STL is not supported on GPUs, so it is unlikely we could ever replace our home cooked version with the real STL. My read is also that the std::mdarray semantics may be heavier weight than what we need, but if people are really interested in this route I would be happy to code it up for comparison.

I ended up pursuing option (2). The goal was to find the right level of abstraction: enough structure that multi-dimensional array code reads cleanly, but not so much machinery that the implementation itself becomes hard to understand. I also wanted to avoid more advanced C++ features where possible to ensure that the class works fine with GPU compilers down the line. With that in mind, I am still open to ideas. I generally tried to keep the class as slim as possible, but it may be that folks would like to use a different syntax, or perhaps add some extra capabilities that are not yet in it. Feel free to chime in.

Design

Types

There are three types, down from xtensor's eight-plus (xtensor, xarray, xadapt, xview, xfunction, xscalar, xtensor_fixed, ...):

Tensor<T> — the primary workhorse. A dynamic-rank owning container that stores elements contiguously in row-major order using openmc::vector<storage_type<T>>. Shape is stored as a vector<size_t>. This single type replaces both xt::xtensor<T, N> (fixed compile-time rank) and xt::xarray<T> (dynamic rank). Notably, the tensor is fully dynamic. By comparison, xtensor allowed the rank to be a template parameter (even though the container was still dynamically sized). My feeling was that since you still have to dynamically allocate the data storage, saving an indirection/allocation on the shape was not much of a savings, and it complicates things a lot.

View<T> — a lightweight non-owning reference into a Tensor's storage. Returned by the slice() method and flat(). Holds a pointer, shape, and strides (to support non-contiguous views like columns). Supports assignment, compound arithmetic, iteration, and a sum() reduction — just enough to cover OpenMC's view usage patterns.

StaticTensor2D<T, R, C> — a minimal stack-allocated 2D array used only for simulation::global_tallies (a 3x3 matrix of doubles). Uses a plain C array internally. This replaces xt::xtensor_fixed with its xt::xshape<> template machinery. It is questionable whether we want to include this, since it is only used in one spot (for the global scalar tallies). An alternative would be to axe it and make a small class that holds the global tally data and provides accessors. On the flip side, given that it is static, it might have some utility for the GPU effort, so it may find other uses.

Key API choices

A single variadic slice() method instead of a free view() function. Where xtensor uses xt::view(arr, i, xt::all(), xt::range(a, b)) with positional sentinel arguments, we use arr.slice(i, all, range(a, b)) — a member function where each argument corresponds to one axis. Arguments can be a plain integer (fixes that axis, rank-reducing), all (keeps entire axis), or range(start, end) (keeps a sub-range). Trailing axes not listed are implicitly all, so arr.slice(i) on a 2D tensor returns row i (matching numpy's arr[i]). arr.flat() provides a flat 1D view of all elements.

Reductions are member functions returning scalars. arr.sum() returns a T, not a proxy object that requires () to evaluate. arr.sum(axis) returns a lower-rank Tensor<T>. There is no lazy evaluation and no proxy types of the sort xtensor uses.

Element-wise math functions are non-member functions in the tensor namespace: tensor::log(), tensor::abs(), tensor::where(), etc. These follow the pattern of std:: math functions and are easy to extend — adding a new one is just a few lines following the existing pattern.

Tensor<bool> uses unsigned char storage. std::vector<bool> is a notoriously broken specialization that returns proxy objects instead of real references. A storage_type trait maps bool to unsigned char so that Tensor<bool> has normal reference semantics. We don't use any bool tensors currently, but handling this now is useful because mapping bit-packed vectors of bools to GPUs is difficult.

The Tensor(View) constructor is explicit. Since Views are non-owning references, implicitly converting one to a Tensor would silently allocate and copy data. Making the constructor explicit forces call sites to be clear about when they intend to materialize a view into an owned tensor.

What's not included

We deliberately excluded things xtensor provides that OpenMC doesn't need:

  • Expression templates and lazy evaluation
  • Compile-time rank as a template parameter (this could in theory impact performance, so I did a little testing and provide performance data below)
  • Broadcasting

This keeps tensor.h small, understandable, and maintainable. If a new operation is needed in the future, the patterns for adding one are straightforward — element-wise functions are ~6 lines each, reductions are ~10.

Migration patterns

The mechanical changes across the codebase follow a small number of patterns:

| xtensor | tensor.h |
| --- | --- |
| `xt::xtensor<T, N>` | `tensor::Tensor<T>` |
| `xt::xarray<T>` | `tensor::Tensor<T>` |
| `xt::view(arr, i, xt::all())` | `arr.slice(i)` |
| `xt::view(arr, xt::all(), j)` | `arr.slice(all, j)` |
| `xt::view(arr, i, xt::range(a, b))` | `arr.slice(i, range(a, b))` |
| `xt::view(arr, xt::range(a, b))` | `arr.slice(range(a, b))` |
| `xt::sum(arr)()` | `arr.sum()` |
| `xt::sum(arr, {axis})` | `arr.sum(axis)` |
| `xt::zeros<T>(shape)` | `tensor::zeros<T>(shape)` |
| `xt::adapt(ptr, n, tag, shape)` | `Tensor<T>(ptr, n)` or allocate directly |
| `xt::empty<T>(shape)` | `Tensor<T>(shape)` |
| `.dimension()` | `.ndim()` |

More CTest Unit Tests

A new tests/cpp_unit_tests/test_tensor.cpp exercises all tensor.h functionality with 62 Catch2 test cases covering constructors, indexing across ranks, assignment, mutation, slicing (single-axis, multi-axis, ranges, all), flat views, reductions, operators, mixed-type arithmetic, non-member functions, StaticTensor2D, const correctness, and the is_tensor type trait.

Build impact

  • xtensor and xtl submodules are no longer needed
  • cmake no longer searches for or configures xtensor
  • brew install xtensor is no longer required on macOS
  • No new external dependencies are introduced
  • Serial build time (for a default build with tests etc.) on my laptop is reduced from 133 seconds to 112 seconds. This is due to the new template header being much simpler, which is key as it's included in dozens of files.

(Optional) Upgrading an existing clone

If you have an existing clone with initialized xtensor/xtl submodules, pulling this branch will produce harmless warnings:

warning: unable to rmdir 'vendor/xtensor': Directory not empty
warning: unable to rmdir 'vendor/xtl': Directory not empty

The build will work fine despite these warnings — CMake no longer looks for xtensor. But the leftover directories and cached git metadata are unnecessary clutter. If you want to clean them up, you can do this from the openmc top directory:

# Remove the leftover working tree directories
rm -rf vendor/xtensor vendor/xtl

# Remove the cached submodule data
rm -rf .git/modules/vendor/xtensor .git/modules/vendor/xtl

# Remove the submodule entries from your local git config
git config --remove-section submodule.vendor/xtensor
git config --remove-section submodule.vendor/xtl

Note: git submodule deinit does not work here because the submodule entries have already been removed from .gitmodules by this PR. The manual steps above are required.

A fresh clone (git clone --recurse-submodules) will not have this issue — it will simply never initialize xtensor or xtl since they are no longer listed in .gitmodules.

Fixes #3652, #3432, #3183. Closes #3793.

Checklist

  • I have performed a self-review of my own code
  • I have run clang-format (version 15) on any C++ source files (if applicable)
  • I have followed the style guidelines for Python source files (if applicable)
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)

jtramm added 30 commits February 9, 2026 14:06
@jtramm
Contributor Author

jtramm commented Feb 12, 2026

Forgot to add the performance results: I ran the HM large benchmark on my laptop and didn't see any difference in inactive or active tracking rates. If anything, the new feature branch seemed a little faster, but that may have just been noise.

Contributor

@paulromano paulromano left a comment


Amazing! It's really cool to see this come to fruition almost a year after we had initially started talking about this being possible with AI tools. I'm starting to go through this in full but one immediate question is on the tensor namespace. Was there a reason you introduced a namespace specifically for this class? We haven't done that with other classes for the most part.

@jtramm
Contributor Author

jtramm commented Feb 13, 2026

Great question! Yeah, this is partially vestigial from how xtensor had originally set things up, and partially useful I think. There are some functions like log(my_tensor) that read a little nicer than my_tensor.log(), but perhaps the latter could be preferable. Also there are the range and "all" classes that we have for slicing operations. These are sort of awkward to have floating around in the larger openmc namespace.

That said, we could always try it as just the three classes in the top level namespace and see how it looks if you like. Pretty easy to try it out with the agent!

Contributor

@paulromano paulromano left a comment


Nicely done with this PR @jtramm! As you said in the description, I think this hits just the right level of abstraction and will help us with our GPU efforts down the line. The implementation is very clean and the Tensor/View classes are easy to understand (unlike xtensor which I found difficult to wade through). From a design standpoint, I think the only downside in my mind is that the dimension for tensors is not explicit, so it requires a little more thought from a reader. If we make it a habit to include comments where we're declaring tensors to include the number of dimensions / shape, that would help.

Thinking a little more about the namespacing, it is definitely desirable to have tensor::all and tensor::range for clarity. tensor::View is also nice because just plain View would be a bit vague. So given that, maybe it does make sense to just keep it as tensor::Tensor... or we can bring just that class into the main openmc namespace for brevity. Your call!

I suspect given the number of source files changed, this will create some conflicts with existing PRs but we'll just have to work through those 🤷‍♂️

A few very minor comments below for you to consider but I'll mark this as approved and will plan on merging early next week to give others a chance to look at it if they wish.

@paulromano paulromano changed the title Remove xtensor and xtl Dependencies Replace xtensor with internal Tensor/View classes Feb 13, 2026
@paulromano paulromano merged commit 977ade7 into openmc-dev:develop Feb 17, 2026
16 checks passed