diff --git a/CLAUDE.md b/CLAUDE.md index 7a3a5356..72b2945f 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -220,11 +220,20 @@ Underworld development team with AI support from Claude Code - Requires complete rebuild (~1 hour) if relocated ### Rebuild After Source Changes -**After modifying source files, always run `pixi run underworld-build`!** +**After modifying source files, always run `./uw build`!** - Underworld3 is installed as a package in the pixi environment - Changes go to `.pixi/envs/default/lib/python3.12/site-packages/underworld3/` - Verify with `uw.model.__file__` +**⚠️ STALE BUILD CACHE**: If `./uw build` succeeds but Python still uses old code +(e.g. a new parameter is "unknown"), pip's wheel cache is stale. Fix with: +```bash +rm -rf build/lib.* build/bdist.* +pixi run -e default pip install --no-build-isolation --force-reinstall --no-deps . +``` +This is the most common build issue — `./uw build` reuses cached wheels when the +version number hasn't changed. Always verify changes are installed before debugging. + ### Test Quality Principles **New tests must be validated before making code changes to fix them!** - Validate test correctness before changing main code @@ -534,10 +543,10 @@ When working on specific subsystems, these documents provide detailed guidance. 
## Quick Reference -### Pixi Commands +### Build & Test Commands ```bash -pixi run underworld-build # Rebuild after source changes -pixi run underworld-test # Run test suite +./uw build # Rebuild after source changes (preferred) +./uw test # Run test suite pixi run -e default python # Run Python in environment ``` diff --git a/docs/beginner/create_xdmf.md b/docs/beginner/create_xdmf.md index f5f141bc..b01f1099 100644 --- a/docs/beginner/create_xdmf.md +++ b/docs/beginner/create_xdmf.md @@ -4,7 +4,7 @@ title: "Create XDMF from PETSc HDF5 Output" # Create XDMF from PETSc HDF5 Output -This guide explains how to generate ParaView-ready XDMF/HDF5 outputs with `mesh.write_timestep(create_xdmf=True)`, why this is needed with newer PETSc HDF5 layouts, and how tensors are converted for correct ParaView interpretation. +This guide explains the ParaView-ready XDMF/HDF5 outputs generated by `mesh.write_timestep()` (enabled by default via `create_xdmf=True`), why this is needed with newer PETSc HDF5 layouts, and how tensors are converted for correct ParaView interpretation. This note explains the output-format issue seen with newer PETSc versions and the fix now available in `Mesh.write_timestep()`. @@ -79,13 +79,13 @@ So, `/fields` is useful raw data, but it is not always directly visualization-re ## 3. What changed in `write_timestep()` and why -`Mesh.write_timestep()` now supports: +`Mesh.write_timestep()` generates XDMF by default (`create_xdmf=True`): ```python -mesh.write_timestep(..., create_xdmf=True) +mesh.write_timestep(...) # XDMF enabled by default ``` -When `create_xdmf=True`: +When `create_xdmf=True` (the default): 1. 
 It writes compatibility datasets per variable:
   - node-like vars -> `/vertex_fields/coordinates` and `/vertex_fields/<var>_<var>`
@@ -120,27 +120,44 @@ Then:
 - `is_cell == True` -> write `/cell_fields/<var>_<var>` and XDMF `Center="Cell"`
 - `is_cell == False` -> write `/vertex_fields/<var>_<var>` and XDMF `Center="Node"`
 
-### How `/fields` data is remapped (KDTree)
+### How vertex data is obtained
 
-When a variable is vertex-centered but `/fields` does not already match mesh vertex count:
+The compatibility group writer uses PETSc's `createInterpolation` to project
+data to mesh vertices:
 
-1. Source coordinates/values are read from `/fields/coordinates` and `/fields/<var>`.
-2. Packed high-order layouts are unpacked into point-wise coordinate/value rows.
-3. A nearest-neighbor search is done with `uw.kdtree.KDTree` to map source points to mesh vertices.
-4. The mapped values are written to `/vertex_fields/<var>_<var>`.
+- **P1 continuous variables**: DOFs match mesh vertices exactly, so the owned
+  global-vector data is written directly to `/vertex_fields/<var>_<var>` — no interpolation needed.
+- **P2+ continuous variables**: a scratch DM at degree 1 is created, PETSc builds
+  the interpolation matrix, and the projected global vector is written. This gives
+  exact polynomial-consistent values at vertex positions.
+- **Cell (discontinuous / degree-0) variables**: owned global-vector data is written
+  directly to `/cell_fields/<var>_<var>`.
 
-For cell-centered variables, values are written into `/cell_fields/<var>_<var>` using row slices.
 
-### Parallel execution details
+### Tensor repacking
 
+Tensor variables are repacked from UW3's internal `_data_layout` storage order to
+ParaView's 9-component row-major 3x3 format before writing to the compatibility
+groups. The checkpoint data in `/fields/<var>` is always stored in UW3's native ordering.
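For the 2D symmetric case, the repacking amounts to scattering three stored components into a zero-padded 3x3 row. A minimal NumPy sketch of that mapping (the function name is illustrative, not part of the UW3 API; the component orderings are those stated for `_data_layout`):

```python
import numpy as np

def repack_sym2d_for_paraview(data):
    """Repack (N, 3) 2D symmetric-tensor rows stored as [xx, yy, xy]
    into ParaView's (N, 9) row-major 3x3 layout
    [xx, xy, xz, yx, yy, yz, zx, zy, zz], zero-padding the z terms."""
    out = np.zeros((data.shape[0], 9), dtype=data.dtype)
    out[:, 0] = data[:, 0]  # xx
    out[:, 4] = data[:, 1]  # yy
    out[:, 1] = data[:, 2]  # xy
    out[:, 3] = data[:, 2]  # yx = xy (symmetry)
    return out

# A row [xx=1, yy=2, xy=3] becomes [1, 3, 0, 3, 2, 0, 0, 0, 0]
print(repack_sym2d_for_paraview(np.array([[1.0, 2.0, 3.0]])))
```

The same scatter-into-9-slots pattern extends to the other cases in the table that follows.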
+ +UW3 `_data_layout` storage (different from ParaView's expected order): -This remapping/writing path is parallel-aware: +| Type | 2D components | 3D components | +|------|--------------|--------------| +| TENSOR | `[xx, xy, yx, yy]` | `[xx, xy, xz, yx, yy, yz, zx, zy, zz]` (same as ParaView) | +| SYM_TENSOR | `[xx, yy, xy]` | `[xx, yy, zz, xy, xz, yz]` | -- work is partitioned by MPI rank (each rank owns a slice of vertices / rows) -- each rank computes mapping for its local slice -- output datasets are written by per-rank slices (no overlapping writes) -- with MPI-enabled `h5py`, HDF5 writes use parallel I/O; otherwise a serial fallback is used +```{note} +The original PR #69 assumed 2D TENSOR stored `[xx, yy, xy, yx]`. +The corrected implementation uses the actual `_data_layout` row-major +ordering `[xx, xy, yx, yy]`. +``` + +### Parallel execution details -So the compatibility-field generation is computed and written in parallel when MPI-HDF5 is available. +All I/O uses PETSc's parallel `ViewerHDF5`. Data is written as standalone Vecs +(not DM-associated) with `setBlockSize()` to preserve the `(N, ncomp)` dataset +shape. This is necessary because DM-associated Vecs ignore `pushGroup()` — the +DMPlex HDF5 writer internally redirects to `/fields/`. ## 4. Tensor representation for ParaView (2D and 3D) diff --git a/src/underworld3/discretisation/discretisation_mesh.py b/src/underworld3/discretisation/discretisation_mesh.py index a8e16e58..925bdab9 100644 --- a/src/underworld3/discretisation/discretisation_mesh.py +++ b/src/underworld3/discretisation/discretisation_mesh.py @@ -1739,8 +1739,7 @@ def write_timestep( meshVars: Optional[list] = [], swarmVars: Optional[list] = [], meshUpdates: bool = False, - create_xdmf: bool = False, - vertex_cell_chunk_size: Optional[int] = None, + create_xdmf: bool = True, ): """ Write mesh and selected variables for visualisation output. 
@@ -1751,16 +1750,9 @@ def write_timestep( - optional proxy files for swarm variables - optional XDMF file linking all output files - When ``create_xdmf=True``, variable files also include - compatibility groups: - - ``/vertex_fields`` for node-like variables - - ``/cell_fields`` for cell-like variables - - and ``checkpoint_xdmf()`` is called on rank 0. - - ``vertex_cell_chunk_size`` controls source-point chunking for - node-like remapping. Default is automatic. Set a positive integer - to manually override the auto-chosen chunk size. + When ``create_xdmf=True`` (the default), variable files also include + ParaView-compatible groups (``/vertex_fields`` or ``/cell_fields``), + and an XDMF file is generated on rank 0. """ options = PETSc.Options() @@ -1803,15 +1795,7 @@ def write_timestep( save_location = output_base_name + f".mesh.{var.clean_name}.{index:05}.h5" var.write(save_location) if create_xdmf: - is_cell = (not getattr(var, "continuous")) or (getattr(var, "degree") == 0) - _write_vertex_cell_groups( - mesh_h5=mesh_file, - var_h5=save_location, - var_name=var.clean_name, - var_obj=var, - is_cell=is_cell, - vertex_cell_chunk_size=vertex_cell_chunk_size, - ) + _write_compat_groups(self, var, save_location) if swarmVars is not None: for svar in swarmVars: @@ -3136,272 +3120,55 @@ def mesh_update_callback(array, change_context): ## Simplified to allow us to decide how we want to checkpoint -def _write_vertex_cell_groups( - mesh_h5: str, - var_h5: str, - var_name: str, - var_obj=None, - is_cell: bool = False, - vertex_cell_chunk_size: Optional[int] = None, -): - """ - Write vertex/cell compatibility groups into a variable file. +def _write_compat_groups(mesh, var, var_h5_path): + """Write ``/vertex_fields/`` or ``/cell_fields/`` compatibility groups. 
-    Source layout:
-    - ``/fields/<var>``
-    - ``/fields/coordinates`` (for node-style mapping)
+    Uses ``uw.function.write_vertices_to_viewer`` (PETSc interpolation +
+    ViewerHDF5) for continuous variables, and
+    ``uw.function.write_cell_field_to_viewer`` for cell/DG-0 variables.
+    PETSc handles all parallel I/O natively.
 
-    Output layout:
-    - cell-like variables -> ``/cell_fields/<var>_<var>``
-    - node-like variables -> ``/vertex_fields/coordinates`` and
-      ``/vertex_fields/<var>_<var>``
+    Vertex coordinates are also written to ``/vertex_fields/coordinates``
+    for XDMF compatibility.
 
-    Tensor-like node variables are written as flattened 3x3 tensors
-    (N, 9) in ``/vertex_fields/<var>_<var>`` for ParaView
-    compatibility.
+    Parameters
+    ----------
+    mesh : Mesh
+        The parent mesh.
+    var : MeshVariable
+        The variable whose data has already been written to *var_h5_path*
+        by ``var.write()`` (so ``var._gvec`` is up-to-date).
+    var_h5_path : str
+        Path to the HDF5 file (already contains ``/fields/<var>``).
     """
-
-    import h5py
-    import numpy as np
     import underworld3 as uw
 
-    comm = uw.mpi.comm
-    rank = uw.mpi.rank
-    size = uw.mpi.size
-
-    def _chunk_bounds(total_count, proc_rank, proc_size):
-        base = total_count // proc_size
-        rem = total_count % proc_size
-        start = proc_rank * base + min(proc_rank, rem)
-        stop = start + base + (1 if proc_rank < rem else 0)
-        return start, stop
-
-    def _auto_chunk_size(
-        total_source_rows: int,
-        coordinate_columns: int,
-        field_columns: int,
-        dtype_itemsize: int,
-    ) -> int:
-        # Approximate per-rank transient memory for source coordinates + values.
-        target_bytes_per_rank = 192 * 1024 * 1024
-        bytes_per_source_row = max(
-            dtype_itemsize * max(1, coordinate_columns + field_columns), 8
-        )
-        chunk_from_memory = max(10_000, target_bytes_per_rank // bytes_per_source_row)
+    is_cell = (not var.continuous) or (var.degree == 0)
+    group = "cell_fields" if is_cell else "vertex_fields"
 
-        # Keep enough chunks to spread work and reduce tail effects as rank count grows.
- chunk_from_ranks = max( - 10_000, (total_source_rows + (size * 8) - 1) // max(size * 8, 1) - ) - - return int(min(total_source_rows, max(10_000, min(chunk_from_memory, chunk_from_ranks)))) - - def _flatten_source_points(source_coordinate_chunk, source_field_chunk, dim): - if source_coordinate_chunk.ndim != 2: - raise RuntimeError( - f"Unsupported coordinate rank for {var_name}: {source_coordinate_chunk.ndim}" - ) - - if source_coordinate_chunk.shape[1] == dim: - coords = source_coordinate_chunk - values = source_field_chunk - if values.ndim == 1: - values = values.reshape(-1, 1) - return coords, values - - if source_coordinate_chunk.shape[1] % dim != 0: - raise RuntimeError( - f"Coordinate dimension mismatch: mesh={dim}, field={source_coordinate_chunk.shape[1]}" - ) - - dof_per_row = source_coordinate_chunk.shape[1] // dim - coords = source_coordinate_chunk.reshape(-1, dim) - values = source_field_chunk - - if values.ndim == 1: - values = values.reshape(-1, 1) - values = np.repeat(values, dof_per_row, axis=0) - return coords, values - - if values.shape[1] == dof_per_row: - values = values.reshape(-1, 1) - return coords, values - - if values.shape[1] % dof_per_row == 0: - ncomp_eff = values.shape[1] // dof_per_row - values = values.reshape(values.shape[0], dof_per_row, ncomp_eff).reshape(-1, ncomp_eff) - return coords, values + # Some PETSc versions (3.21+) write /vertex_fields/ or /cell_fields/ + # automatically during var.write(). Remove any pre-existing group so + # that our compat writer can create it afresh (otherwise PETSc error 76 + # on duplicate dataset). 
+ import h5py - raise RuntimeError( - f"Cannot unpack field layout for {var_name}: " - f"values={values.shape}, coords={source_coordinate_chunk.shape}, dim={dim}" - ) + if uw.mpi.rank == 0: + with h5py.File(var_h5_path, "a") as f: + if group in f: + del f[group] + uw.mpi.barrier() - is_tensor = False - if var_obj is not None and hasattr(var_obj, "vtype"): - is_tensor = var_obj.vtype in ( - uw.VarType.TENSOR, - uw.VarType.SYM_TENSOR, - uw.VarType.MATRIX, - ) + viewer = PETSc.ViewerHDF5().create( + var_h5_path, "a", comm=PETSc.COMM_WORLD, + ) - parallel_h5 = bool(getattr(h5py.get_config(), "mpi", False)) - if parallel_h5: - open_kwargs = {"driver": "mpio", "comm": comm} + if is_cell: + uw.function.write_cell_field_to_viewer(var, viewer) else: - # Fallback: single-writer path when MPI-HDF5 is unavailable. - if rank != 0: - return - open_kwargs = {} - - with h5py.File(mesh_h5, "r", **open_kwargs) as mesh_file: - mesh_vertices_dataset = mesh_file["geometry/vertices"] - num_vertices = mesh_vertices_dataset.shape[0] - dim = mesh_vertices_dataset.shape[1] - local_vertex_start, local_vertex_stop = _chunk_bounds(num_vertices, rank, size) - local_vertices = mesh_vertices_dataset[local_vertex_start:local_vertex_stop, :] - - with h5py.File(var_h5, "r+", **open_kwargs) as variable_file: - if "fields" not in variable_file: - raise RuntimeError(f"Missing group '/fields' in {var_h5}") - if var_name not in variable_file["fields"]: - raise RuntimeError(f"Missing dataset '/fields/{var_name}' in {var_h5}") - - field_dataset = variable_file["fields"][var_name] - field_dataset_dtype = field_dataset.dtype - output_dataset_name = f"{var_name}_{var_name}" - - if is_cell: - num_rows = field_dataset.shape[0] - local_row_start, local_row_stop = _chunk_bounds(num_rows, rank, size) - local_field_chunk = field_dataset[local_row_start:local_row_stop] - if local_field_chunk.ndim == 1: - local_field_chunk = local_field_chunk.reshape(-1, 1) - output_components = local_field_chunk.shape[1] - - 
cell_group = variable_file.require_group("cell_fields") - if rank == 0 and output_dataset_name in cell_group: - del cell_group[output_dataset_name] - if parallel_h5: - comm.Barrier() - output_dataset = cell_group.require_dataset( - output_dataset_name, shape=(num_rows, output_components), dtype=field_dataset_dtype - ) - output_dataset[local_row_start:local_row_stop, :] = local_field_chunk - if parallel_h5: - comm.Barrier() - return - - if "coordinates" not in variable_file["fields"]: - raise RuntimeError(f"Missing dataset '/fields/coordinates' in {var_h5}") - - coordinate_dataset = variable_file["fields"]["coordinates"] - num_source_rows = coordinate_dataset.shape[0] - coordinate_columns = coordinate_dataset.shape[1] - if field_dataset.ndim == 1: - field_columns = 1 - else: - field_columns = field_dataset.shape[1] - - if vertex_cell_chunk_size is None or vertex_cell_chunk_size <= 0: - effective_chunk_size = _auto_chunk_size( - total_source_rows=num_source_rows, - coordinate_columns=coordinate_columns, - field_columns=field_columns, - dtype_itemsize=field_dataset_dtype.itemsize, - ) - else: - effective_chunk_size = int(vertex_cell_chunk_size) - - local_vertex_count = max(local_vertex_stop - local_vertex_start, 0) - local_best_distance_sq = np.full(local_vertex_count, np.inf, dtype=np.float64) - local_best_values = None - - for source_start in range(0, num_source_rows, effective_chunk_size): - source_stop = min(source_start + effective_chunk_size, num_source_rows) - source_coordinates = coordinate_dataset[source_start:source_stop, :] - source_values = field_dataset[source_start:source_stop] - source_coordinates, source_values = _flatten_source_points( - source_coordinates, source_values, dim - ) - - if local_best_values is None: - local_best_values = np.zeros( - (local_vertex_count, source_values.shape[1]), dtype=field_dataset_dtype - ) - - source_tree = uw.kdtree.KDTree(source_coordinates) - nearest_distance, nearest_index = source_tree.query( - local_vertices, k=1, 
sqr_dists=False - ) - nearest_distance = np.asarray(nearest_distance).reshape(-1) - nearest_index = np.asarray(nearest_index).reshape(-1) - nearest_distance_sq = nearest_distance * nearest_distance - improved = nearest_distance_sq < local_best_distance_sq - - if np.any(improved): - local_best_distance_sq[improved] = nearest_distance_sq[improved] - local_best_values[improved, :] = source_values[nearest_index[improved], :] - - if local_best_values is None: - # Degenerate case: empty source - local_best_values = np.zeros((local_vertex_count, 1), dtype=field_dataset_dtype) - - if is_tensor: - local_component_count = local_best_values.shape[1] - tensor_values = np.zeros((local_vertex_count, 9), dtype=field_dataset_dtype) - - if dim == 2 and local_component_count == 4: - tensor_values[:, 0] = local_best_values[:, 0] - tensor_values[:, 4] = local_best_values[:, 1] - tensor_values[:, 1] = local_best_values[:, 2] - tensor_values[:, 3] = local_best_values[:, 3] - elif dim == 2 and local_component_count == 3: - tensor_values[:, 0] = local_best_values[:, 0] - tensor_values[:, 4] = local_best_values[:, 1] - tensor_values[:, 1] = local_best_values[:, 2] - tensor_values[:, 3] = local_best_values[:, 2] - elif dim == 3 and local_component_count == 9: - tensor_values[:, :] = local_best_values[:, :] - elif dim == 3 and local_component_count == 6: - tensor_values[:, 0] = local_best_values[:, 0] - tensor_values[:, 4] = local_best_values[:, 1] - tensor_values[:, 8] = local_best_values[:, 2] - tensor_values[:, 1] = local_best_values[:, 3] - tensor_values[:, 3] = local_best_values[:, 3] - tensor_values[:, 5] = local_best_values[:, 4] - tensor_values[:, 7] = local_best_values[:, 4] - tensor_values[:, 2] = local_best_values[:, 5] - tensor_values[:, 6] = local_best_values[:, 5] - else: - tensor_values = None - - if tensor_values is not None: - local_best_values = tensor_values - - output_components = local_best_values.shape[1] - vertex_group = variable_file.require_group("vertex_fields") - 
if rank == 0: - if "coordinates" in vertex_group: - del vertex_group["coordinates"] - if output_dataset_name in vertex_group: - del vertex_group[output_dataset_name] - if parallel_h5: - comm.Barrier() - - coordinates_output = vertex_group.require_dataset( - "coordinates", shape=(num_vertices, dim), dtype=local_vertices.dtype - ) - values_output = vertex_group.require_dataset( - output_dataset_name, - shape=(num_vertices, output_components), - dtype=field_dataset_dtype, - ) - coordinates_output[local_vertex_start:local_vertex_stop, :] = local_vertices - values_output[local_vertex_start:local_vertex_stop, :] = local_best_values + uw.function.write_vertices_to_viewer(var, viewer) + uw.function.write_coordinates_to_viewer(mesh, viewer) - if parallel_h5: - comm.Barrier() + viewer.destroy() def checkpoint_xdmf( diff --git a/src/underworld3/function/__init__.py b/src/underworld3/function/__init__.py index eb09d37e..2f39ff2c 100644 --- a/src/underworld3/function/__init__.py +++ b/src/underworld3/function/__init__.py @@ -85,6 +85,15 @@ interpolate_gradients_at_coords, ) +# Field projection utilities (scratch-DM based, no MeshVariable creation) +from .field_projection import ( + project_to_degree, + project_to_vertices, + write_vertices_to_viewer, + write_coordinates_to_viewer, + write_cell_field_to_viewer, +) + def with_units(sympy_expr, name=None, units=None): """ diff --git a/src/underworld3/function/field_projection.py b/src/underworld3/function/field_projection.py new file mode 100644 index 00000000..28fc4809 --- /dev/null +++ b/src/underworld3/function/field_projection.py @@ -0,0 +1,352 @@ +"""Project or prolong MeshVariable data to a different polynomial degree. + +Uses a scratch PETSc DM with ``createInterpolation`` — no MeshVariable +is created, the mesh DM is never modified, and all scratch objects are +destroyed before returning. + +Typical use cases +----------------- +* Down-sample a P2 field to P1 vertex values for visualisation / XDMF output. 
+* Prolong a P1 field to P2 DOF values for initialisation. +* Obtain vertex values (degree-1) from any higher-order variable. +* Write vertex values directly to an HDF5 file via PETSc ViewerHDF5. +""" + +from __future__ import annotations + +from typing import TYPE_CHECKING + +import numpy as np +from petsc4py import PETSc + +if TYPE_CHECKING: + from underworld3.discretisation import MeshVariable + + +def project_to_degree( + mesh_var: "MeshVariable", + target_degree: int = 1, + continuous: bool = True, + include_ghosts: bool = True, +) -> np.ndarray: + """Project a MeshVariable to a different polynomial degree. + + Parameters + ---------- + mesh_var + Source MeshVariable (any degree, scalar/vector/tensor). + target_degree + Polynomial degree of the target space (default 1 = vertex values). + continuous + Whether the target space is continuous (default ``True``). + include_ghosts + If ``True`` (default), return the full local vector including ghost + DOFs. If ``False``, return only the owned partition (suitable for + parallel HDF5 writing where each rank writes its own slice). + + Returns + ------- + np.ndarray + Projected values with shape ``(n_dofs, num_components)``. + + Notes + ----- + This creates a transient scratch DM, builds the PETSc interpolation + matrix, applies it, and destroys everything. The mesh DM and all + existing MeshVariables are completely untouched. + + For ``target_degree == mesh_var.degree`` the interpolation matrix is + the identity and the result matches the source data exactly. 
+ """ + + mesh = mesh_var.mesh + nc = mesh_var.num_components + + # --- scratch DM with a single field at the target degree --- + options = PETSc.Options() + prefix = "_fieldproj_" + options.setValue(f"{prefix}petscspace_degree", target_degree) + options.setValue(f"{prefix}petscdualspace_lagrange_continuity", continuous) + options.setValue(f"{prefix}petscdualspace_lagrange_node_endpoints", False) + + fe_target = PETSc.FE().createDefault( + mesh.dim, nc, mesh.isSimplex, mesh.qdegree, prefix, PETSc.COMM_SELF, + ) + + dm_scratch = mesh.dm.clone() + dm_scratch.addField(fe_target) + dm_scratch.createDS() + + # --- source sub-DM for this variable's field --- + iset, subdm_src = mesh.dm.createSubDM(mesh_var.field_id) + + # --- interpolation matrix --- + # PETSc: source_subdm.createInterpolation(target_dm) returns a matrix + # whose mult() maps source global vec → target global vec. + interp, _scale = subdm_src.createInterpolation(dm_scratch) + + # --- apply --- + g_src = subdm_src.createGlobalVec() + subdm_src.localToGlobal(mesh_var._lvec, g_src) + + g_dst = dm_scratch.createGlobalVec() + interp.mult(g_src, g_dst) + + if include_ghosts: + l_dst = dm_scratch.createLocalVec() + dm_scratch.globalToLocal(g_dst, l_dst) + result = l_dst.array.reshape(-1, nc).copy() + l_dst.destroy() + else: + result = g_dst.array.reshape(-1, nc).copy() + + # --- cleanup --- + for obj in (interp, g_src, g_dst, dm_scratch, fe_target, subdm_src, iset): + obj.destroy() + if _scale is not None: + _scale.destroy() + + return result + + +def project_to_vertices(mesh_var: "MeshVariable") -> np.ndarray: + """Shorthand: project any MeshVariable to P1 (vertex) values. + + Parameters + ---------- + mesh_var + Source MeshVariable. + + Returns + ------- + np.ndarray + Values at mesh vertices, shape ``(n_vertices, num_components)``. 
+ """ + return project_to_degree(mesh_var, target_degree=1, continuous=True) + + +def _repack_tensor_to_paraview(data, vtype, dim): + """Repack UW3 tensor data to ParaView 9-component (3x3) format. + + The checkpoint path (``/fields/``) stores tensors in UW3's internal + ``_data_layout`` ordering. The visualisation path + (``/vertex_fields/``) must repack to ParaView's expected row-major + 3x3 layout: ``[xx, xy, xz, yx, yy, yz, zx, zy, zz]``. + + UW3 ``_data_layout`` ordering: + + - TENSOR 2D: ``[xx, xy, yx, yy]`` (row-major, 4 components) + - SYM_TENSOR 2D: ``[xx, yy, xy]`` (3 unique components) + - TENSOR 3D: ``[xx, xy, xz, yx, yy, yz, zx, zy, zz]`` + (row-major, already ParaView-compatible) + - SYM_TENSOR 3D: ``[xx, yy, zz, xy, xz, yz]`` (6 unique components) + + .. note:: + + The original PR #69 assumed 2D TENSOR stored ``[xx, yy, xy, yx]``. + This corrected version uses the actual ``_data_layout`` row-major + ordering ``[xx, xy, yx, yy]``. + """ + import underworld3 as uw + + n = data.shape[0] + ncomp = data.shape[1] + out = np.zeros((n, 9), dtype=data.dtype) + + if vtype == uw.VarType.TENSOR: + if dim == 2 and ncomp == 4: + # [xx, xy, yx, yy] → [xx, xy, 0, yx, yy, 0, 0, 0, 0] + out[:, 0] = data[:, 0] # xx + out[:, 1] = data[:, 1] # xy + out[:, 3] = data[:, 2] # yx + out[:, 4] = data[:, 3] # yy + elif dim == 3 and ncomp == 9: + out[:, :] = data[:, :] + else: + return data # unknown layout, pass through + elif vtype == uw.VarType.SYM_TENSOR: + if dim == 2 and ncomp == 3: + # [xx, yy, xy] → [xx, xy, 0, xy, yy, 0, 0, 0, 0] + out[:, 0] = data[:, 0] # xx + out[:, 1] = data[:, 2] # xy + out[:, 3] = data[:, 2] # xy (symmetric) + out[:, 4] = data[:, 1] # yy + elif dim == 3 and ncomp == 6: + # [xx, yy, zz, xy, xz, yz] → full 3x3 + out[:, 0] = data[:, 0] # xx + out[:, 4] = data[:, 1] # yy + out[:, 8] = data[:, 2] # zz + out[:, 1] = data[:, 3] # xy + out[:, 3] = data[:, 3] # yx = xy + out[:, 2] = data[:, 4] # xz + out[:, 6] = data[:, 4] # zx = xz + out[:, 5] = data[:, 5] 
# yz + out[:, 7] = data[:, 5] # zy = yz + else: + return data + else: + # MATRIX or unknown — pass through + return data + + return out + + +def _write_vec_to_group(viewer, data_array, name, group, comm): + """Write a numpy array as a standalone PETSc Vec under an HDF5 group. + + DM-associated Vecs ignore ``pushGroup`` because the DM's HDF5 writer + pushes its own ``/fields/`` prefix. This helper creates a plain Vec + (no DM) so ``pushGroup`` is respected. + + When *data_array* is 2D ``(N, ncomp)``, the Vec block size is set to + *ncomp* so the HDF5 dataset is written as ``(N, ncomp)`` rather than + flat ``(N*ncomp,)``. + + Parameters + ---------- + viewer + An open ``PETSc.ViewerHDF5``. + data_array : np.ndarray + Local (owned) data to write — shape ``(n_local,)`` or + ``(n_local, ncomp)``. + name : str + Dataset name in the HDF5 group. + group : str + HDF5 group path (e.g. ``/vertex_fields``). + comm + MPI communicator. + """ + vec = PETSc.Vec().createWithArray(data_array.ravel(), comm=comm) + if data_array.ndim == 2 and data_array.shape[1] > 1: + vec.setBlockSize(data_array.shape[1]) + vec.setName(name) + viewer.pushGroup(group) + viewer(vec) + viewer.popGroup() + vec.destroy() + + +def write_vertices_to_viewer( + mesh_var: "MeshVariable", + viewer: "PETSc.ViewerHDF5", + group: str = "/vertex_fields", + name: str | None = None, +) -> None: + """Project a MeshVariable to P1 vertex values and write via PETSc ViewerHDF5. + + For P1 continuous variables, the existing global vector data is + written directly (DOFs already match mesh vertices). For P2+ + continuous variables, a scratch DM and PETSc interpolation matrix + project to degree 1. + + Tensor variables (TENSOR, SYM_TENSOR) are repacked from UW3's + internal ``_data_layout`` ordering to ParaView's 9-component + row-major 3x3 format. The checkpoint data in ``/fields/`` is + unchanged — only the visualisation copy is repacked. 
+
+    Data is written as a standalone Vec (no DM) so that ``pushGroup``
+    is respected — DM-associated Vecs would be redirected to ``/fields/``
+    by the DMPlex HDF5 writer.
+
+    The vector is written under *group* (default ``/vertex_fields``)
+    with the dataset name *name* (default ``<var>_<var>``
+    to match the existing XDMF convention).
+
+    Parameters
+    ----------
+    mesh_var
+        Source MeshVariable (any degree, scalar/vector/tensor).
+    viewer
+        An open ``PETSc.ViewerHDF5`` in append or write mode.
+    group
+        HDF5 group path to write into.
+    name
+        Dataset name. Defaults to ``<var>_<var>``.
+    """
+
+    if name is None:
+        name = f"{mesh_var.clean_name}_{mesh_var.clean_name}"
+
+    mesh = mesh_var.mesh
+    nc = mesh_var.num_components
+    is_p1 = mesh_var.continuous and mesh_var.degree == 1
+
+    if is_p1:
+        # DOFs = vertices — use gvec data directly (already owned-only)
+        mesh_var._sync_lvec_to_gvec()
+        data = mesh_var._gvec.array.reshape(-1, nc).copy()
+    else:
+        # include_ghosts=False → owned partition only, suitable for
+        # parallel HDF5 writing where each rank writes its own slice.
+        data = project_to_degree(
+            mesh_var, target_degree=1, continuous=True, include_ghosts=False,
+        )
+
+    # Repack tensors to ParaView 9-component format for visualisation.
+    # The projected data uses the same _data_layout as the source variable.
+    import underworld3 as uw
+
+    is_tensor = hasattr(mesh_var, "vtype") and mesh_var.vtype in (
+        uw.VarType.TENSOR,
+        uw.VarType.SYM_TENSOR,
+    )
+    if is_tensor:
+        data = _repack_tensor_to_paraview(data, mesh_var.vtype, mesh.dim)
+
+    _write_vec_to_group(viewer, data, name, group, PETSc.COMM_WORLD)
+
+
+def write_coordinates_to_viewer(
+    mesh,
+    viewer: "PETSc.ViewerHDF5",
+    group: str = "/vertex_fields",
+    name: str = "coordinates",
+) -> None:
+    """Write mesh vertex coordinates via PETSc ViewerHDF5.
+
+    Parameters
+    ----------
+    mesh
+        Source mesh.
+    viewer
+        An open ``PETSc.ViewerHDF5`` in append or write mode.
+    group
+        HDF5 group path to write into.
+    name
+        Dataset name (default ``coordinates``).
+    """
+    coord_gvec = mesh.dm.getCoordinates()
+    coords = coord_gvec.array.reshape(-1, mesh.dim).copy()
+    _write_vec_to_group(viewer, coords, name, group, PETSc.COMM_WORLD)
+
+
+def write_cell_field_to_viewer(
+    mesh_var: "MeshVariable",
+    viewer: "PETSc.ViewerHDF5",
+    group: str = "/cell_fields",
+    name: str | None = None,
+) -> None:
+    """Write a cell (discontinuous/DG-0) variable via PETSc ViewerHDF5.
+
+    Data is written as a standalone Vec (no DM) so that ``pushGroup``
+    is respected.
+
+    Parameters
+    ----------
+    mesh_var
+        Source MeshVariable (discontinuous or degree 0).
+    viewer
+        An open ``PETSc.ViewerHDF5`` in append or write mode.
+    group
+        HDF5 group path to write into.
+    name
+        Dataset name. Defaults to ``<var>_<var>``.
+    """
+
+    if name is None:
+        name = f"{mesh_var.clean_name}_{mesh_var.clean_name}"
+
+    nc = mesh_var.num_components
+    mesh_var._sync_lvec_to_gvec()
+    data = mesh_var._gvec.array.reshape(-1, nc).copy()
+    _write_vec_to_group(viewer, data, name, group, PETSc.COMM_WORLD)
diff --git a/tests/test_0005_xdmf_compat.py b/tests/test_0005_xdmf_compat.py
new file mode 100644
index 00000000..72ba4439
--- /dev/null
+++ b/tests/test_0005_xdmf_compat.py
@@ -0,0 +1,290 @@
+"""Test XDMF/HDF5 compatibility groups written by write_timestep.
+
+Validates that /vertex_fields/ and /cell_fields/ groups are created
+correctly, with proper component counts and data values.
+""" + +import os +import re + +import h5py +import numpy as np +import pytest + +import underworld3 as uw + + +# --------------------------------------------------------------------------- +# Helpers +# --------------------------------------------------------------------------- + + +def _check_h5_group_exists(h5_path, group_path): + """Return True if group_path exists in the HDF5 file.""" + with h5py.File(h5_path, "r") as f: + return group_path in f + + +def _read_h5_dataset(h5_path, dataset_path): + """Read and return an HDF5 dataset as a numpy array.""" + with h5py.File(h5_path, "r") as f: + return f[dataset_path][:] + + +def _check_xdmf_refs(xdmf_path, tmp_dir): + """Verify all XDMF entity references point to real HDF5 datasets.""" + with open(xdmf_path, "r") as f: + content = f.read() + + doctype_match = re.search(r"", content, re.DOTALL) + assert doctype_match, "No DOCTYPE entity block found in XDMF file" + + entity_block = doctype_match.group(1) + entities = dict(re.findall(r'', entity_block)) + + # Check both vertex_fields and cell_fields references + refs = re.findall(r"&(\w+);:(/(vertex_fields|cell_fields)/[A-Za-z0-9_]+)", content) + errors = [] + for entity_name, dataset_path, _ in refs: + h5_file = entities.get(entity_name) + if not h5_file: + errors.append(f"Entity {entity_name} not found") + continue + h5_full = os.path.join(tmp_dir, h5_file) + if not os.path.exists(h5_full): + errors.append(f"File {h5_file} not found") + continue + with h5py.File(h5_full, "r") as f: + if dataset_path.lstrip("/") not in f: + errors.append(f"{h5_file}: {dataset_path} missing") + + assert not errors, "XDMF reference errors:\n" + "\n".join(errors) + + +# --------------------------------------------------------------------------- +# Test: P1 scalar + P2 vector (2D) +# --------------------------------------------------------------------------- + + +def test_xdmf_compat_2d(tmp_path): + """write_timestep creates correct compat groups for 2D mesh variables.""" + + mesh = 
uw.meshing.StructuredQuadBox(elementRes=(4, 4)) + + # P1 scalar, P2 vector + p_var = uw.discretisation.MeshVariable("p", mesh, 1, degree=1) + v_var = uw.discretisation.MeshVariable("v", mesh, mesh.dim, degree=2) + + # Initialise with known values + x, y = mesh.X + p_var.data[:, 0] = mesh._coords[:, 0] # p = x coordinate + v_var.data[:, 0] = 1.0 + v_var.data[:, 1] = 2.0 + + mesh.write_timestep( + "test", index=0, outputPath=str(tmp_path), meshVars=[p_var, v_var] + ) + + # Check XDMF file was created + xdmf_file = os.path.join(str(tmp_path), "test.mesh.00000.xdmf") + assert os.path.exists(xdmf_file), "XDMF file not created" + + # Check P1 scalar: /vertex_fields/p_p should exist with correct shape + p_h5 = os.path.join(str(tmp_path), "test.mesh.p.00000.h5") + assert _check_h5_group_exists(p_h5, "vertex_fields/p_p") + assert _check_h5_group_exists(p_h5, "vertex_fields/coordinates") + + p_compat = _read_h5_dataset(p_h5, "vertex_fields/p_p") + # PETSc writes 1-component scalars as 1D (N,); accept both (N,) and (N,1) + effective_ncomp = p_compat.shape[1] if p_compat.ndim == 2 else 1 + assert effective_ncomp == 1, f"P1 scalar should have 1 component, got {effective_ncomp}" + + # P1 scalar: compat values should match var.data exactly + p_original = _read_h5_dataset(p_h5, "fields/p") + np.testing.assert_allclose( + p_compat.ravel(), p_original.ravel(), atol=1e-10, + err_msg="P1 compat data should match original field data exactly" + ) + + # Check P2 vector: /vertex_fields/v_v should exist with dim components + v_h5 = os.path.join(str(tmp_path), "test.mesh.v.00000.h5") + assert _check_h5_group_exists(v_h5, "vertex_fields/v_v") + + v_compat = _read_h5_dataset(v_h5, "vertex_fields/v_v") + # Standalone Vec writes as 1D — infer components from total size / vertex count + n_verts = mesh._coords.shape[0] + v_total = v_compat.size + v_ncomp = v_total // n_verts if n_verts > 0 else 0 + assert v_ncomp == mesh.dim, ( + f"P2 vector should have {mesh.dim} components, got {v_ncomp}" + ) + 
+ # P2 vector: vertex count should match mesh vertices, not P2 DOFs + coords_compat = _read_h5_dataset(v_h5, "vertex_fields/coordinates") + coords_n_verts = coords_compat.size // mesh.dim + assert coords_n_verts == mesh._coords.shape[0], ( + "Vertex count mismatch between compat coords and mesh" + ) + + # Verify XDMF references are valid + _check_xdmf_refs(xdmf_file, str(tmp_path)) + + del mesh + + +# --------------------------------------------------------------------------- +# Test: 3D mesh +# --------------------------------------------------------------------------- + + +def test_xdmf_compat_3d(tmp_path): + """write_timestep creates correct compat groups for 3D mesh.""" + + mesh = uw.meshing.StructuredQuadBox( + elementRes=(3, 3, 3), + minCoords=(0.0, 0.0, 0.0), + maxCoords=(1.0, 1.0, 1.0), + ) + + s_var = uw.discretisation.MeshVariable("s", mesh, 1, degree=1) + s_var.data[:, 0] = mesh._coords[:, 2] # s = z + + mesh.write_timestep( + "test3d", index=0, outputPath=str(tmp_path), meshVars=[s_var] + ) + + s_h5 = os.path.join(str(tmp_path), "test3d.mesh.s.00000.h5") + assert _check_h5_group_exists(s_h5, "vertex_fields/s_s") + + s_compat = _read_h5_dataset(s_h5, "vertex_fields/s_s") + s_original = _read_h5_dataset(s_h5, "fields/s") + np.testing.assert_allclose(s_compat.ravel(), s_original.ravel(), atol=1e-10) + + # Verify XDMF + xdmf_file = os.path.join(str(tmp_path), "test3d.mesh.00000.xdmf") + _check_xdmf_refs(xdmf_file, str(tmp_path)) + + del mesh + + +# --------------------------------------------------------------------------- +# Test: Cell (discontinuous) variable +# --------------------------------------------------------------------------- + + +def test_xdmf_compat_cell_variable(tmp_path): + """Cell variables go to /cell_fields/ group.""" + + mesh = uw.meshing.StructuredQuadBox(elementRes=(4, 4)) + + # Discontinuous (cell-centred) variable + c_var = uw.discretisation.MeshVariable( + "c", mesh, 1, degree=0, continuous=False + ) + c_var.data[:, 0] = 42.0 + + 
mesh.write_timestep( + "testcell", index=0, outputPath=str(tmp_path), meshVars=[c_var] + ) + + c_h5 = os.path.join(str(tmp_path), "testcell.mesh.c.00000.h5") + assert _check_h5_group_exists(c_h5, "cell_fields/c_c"), ( + "/cell_fields/c_c group should exist for discontinuous variable" + ) + + c_compat = _read_h5_dataset(c_h5, "cell_fields/c_c") + c_effective_ncomp = c_compat.shape[1] if c_compat.ndim == 2 else 1 + assert c_effective_ncomp == 1 + + del mesh + + +# --------------------------------------------------------------------------- +# Test: read_timestep round-trip still works (checkpoint integrity) +# --------------------------------------------------------------------------- + + +def test_xdmf_checkpoint_roundtrip(tmp_path): + """Compat groups don't corrupt the /fields/ data used by read_timestep.""" + + mesh = uw.meshing.StructuredQuadBox(elementRes=(4, 4)) + + p1 = uw.discretisation.MeshVariable("p1", mesh, 1, degree=1) + p1.data[:, 0] = mesh._coords[:, 0] + mesh._coords[:, 1] + + mesh.write_timestep( + "roundtrip", index=0, outputPath=str(tmp_path), meshVars=[p1] + ) + + # Read back + p1_check = uw.discretisation.MeshVariable("p1check", mesh, 1, degree=1) + p1_check.read_timestep("roundtrip", "p1", 0, outputPath=str(tmp_path)) + + np.testing.assert_allclose(p1.data, p1_check.data, atol=1e-10) + + del mesh + + +# --------------------------------------------------------------------------- +# Test: create_xdmf=False skips compat groups +# --------------------------------------------------------------------------- + + +def test_no_xdmf_when_disabled(tmp_path): + """create_xdmf=False should not create compat groups or XDMF file.""" + + mesh = uw.meshing.StructuredQuadBox(elementRes=(4, 4)) + s = uw.discretisation.MeshVariable("s", mesh, 1, degree=1) + s.data[:, 0] = 1.0 # initialise so gvec exists + + mesh.write_timestep( + "noxdmf", index=0, outputPath=str(tmp_path), + meshVars=[s], create_xdmf=False, + ) + + # XDMF file should not be generated. 
+ # Note: we do NOT check HDF5 groups because some PETSc versions (3.21+) + # write /vertex_fields/ automatically during var.write() — that is PETSc + # behaviour, not ours. + xdmf_file = os.path.join(str(tmp_path), "noxdmf.mesh.00000.xdmf") + assert not os.path.exists(xdmf_file), "XDMF file should not exist" + + del mesh + + +# --------------------------------------------------------------------------- +# Test: Tensor variable repacking +# --------------------------------------------------------------------------- + + +def test_tensor_variable_repacking(tmp_path): + """Tensor variables are repacked to 9 components (3x3) in vertex_fields.""" + + mesh = uw.meshing.StructuredQuadBox(elementRes=(4, 4)) + + # 2D full tensor (4 components internally) + T = uw.discretisation.MeshVariable( + "T", mesh, mesh.dim, degree=1, vtype=uw.VarType.TENSOR + ) + T.data[:, 0] = 1.0 # xx + T.data[:, 1] = 0.1 # xy + T.data[:, 2] = 0.2 # yx + T.data[:, 3] = 2.0 # yy + + mesh.write_timestep( + "tensor", index=0, outputPath=str(tmp_path), meshVars=[T] + ) + + t_h5 = os.path.join(str(tmp_path), "tensor.mesh.T.00000.h5") + assert _check_h5_group_exists(t_h5, "vertex_fields/T_T") + + t_compat = _read_h5_dataset(t_h5, "vertex_fields/T_T") + n_verts = mesh._coords.shape[0] + t_ncomp = t_compat.size // n_verts + assert t_ncomp == 9, f"Tensor should be repacked to 9 components, got {t_ncomp}" + + # Verify XDMF + xdmf_file = os.path.join(str(tmp_path), "tensor.mesh.00000.xdmf") + _check_xdmf_refs(xdmf_file, str(tmp_path)) + + del mesh
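---

A note on what `_check_xdmf_refs` is parsing: the XDMF files reference their heavy data through DTD entities declared in the DOCTYPE block, and the tests resolve `&entity;:/group/dataset` references back to HDF5 files. A minimal stdlib-only sketch of that parsing logic, run against an illustrative XDMF fragment (the filename and entity name here are made up for the example, not taken from real output):

```python
import re

# Illustrative XDMF header of the general shape write_timestep emits;
# "HeavyData" and the .h5 filename are hypothetical stand-ins.
xdmf_text = """<?xml version="1.0" ?>
<!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" [
<!ENTITY HeavyData "test.mesh.p.00000.h5">
]>
<Xdmf>
  <Attribute Name="p" Center="Node">
    <DataItem Format="HDF">&HeavyData;:/vertex_fields/p_p</DataItem>
  </Attribute>
</Xdmf>
"""

# 1. Pull out the DOCTYPE internal subset (the [...] block).
doctype = re.search(r"<!DOCTYPE[^\[]*\[(.*?)\]>", xdmf_text, re.DOTALL)

# 2. Map each ENTITY name to the HDF5 file it points at.
entities = dict(re.findall(r'<!ENTITY\s+(\w+)\s+"([^"]+)"', doctype.group(1)))

# 3. Collect every &entity;:/group/dataset reference in the body.
refs = re.findall(
    r"&(\w+);:(/(vertex_fields|cell_fields)/[A-Za-z0-9_]+)", xdmf_text
)

print(entities)      # {'HeavyData': 'test.mesh.p.00000.h5'}
print(refs[0][:2])   # ('HeavyData', '/vertex_fields/p_p')
```

In the real helper, step 3's results are then checked against the files on disk: each entity must resolve to an existing `.h5` file, and each dataset path must exist inside it.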