diff --git a/docs/conf.py b/docs/conf.py index debbc39b1..b00701094 100755 --- a/docs/conf.py +++ b/docs/conf.py @@ -533,7 +533,7 @@ def linkcode_resolve(domain, info): # -- Options for autoapi -------------------------------------------------- autoapi_dirs = ["../src/parcels"] -autoapi_add_toctree_entry = False +# autoapi_add_toctree_entry = False autoapi_root = "reference" autoapi_options = [ "members", diff --git a/docs/development/policies.md b/docs/development/policies.md index 6cc751240..43d6b9c8d 100644 --- a/docs/development/policies.md +++ b/docs/development/policies.md @@ -10,7 +10,7 @@ We accept that Parcels developers have their own motivation for using (or not us Remember that reviews are done by human maintainers - asking us to review code that an AI wrote but you don't understand isn't kind to these maintainers. -The [CLAUDE.md](https://github.com/Parcels-code/Parcels/blob/HEAD/CLAUDE.md) file in the repository has additional instructions for AI agents to follow when contributing to Parcels. +The [CLAUDE.md](/CLAUDE.md) file in the repository has additional instructions for AI agents to follow when contributing to Parcels. ## Versioning diff --git a/docs/getting_started/explanation_concepts.md b/docs/getting_started/explanation_concepts.md index b58ddb61e..b4fbe3a23 100644 --- a/docs/getting_started/explanation_concepts.md +++ b/docs/getting_started/explanation_concepts.md @@ -17,7 +17,7 @@ A Parcels simulation is generally built up from four different components: 3. [**Kernels**](#3-kernels). Kernels perform some specific operation on the particles every time step (e.g. advect the particles with the three-dimensional flow; or interpolate the temperature field to the particle location). 4. [**Execute**](#4-execute). Execute the simulation. The core method which integrates the operations defined in Kernels for a given runtime and timestep, and writes output to a ParticleFile. -We discuss each component in more detail below. The subsections titled **"Learn how to"** link to more detailed [how-to guide notebooks](../user_guide/index.md) and more detailed _explanations_ of Parcels functionality are included under **"Read more about"** subsections. The full list of classes and methods is in the [API reference](../reference/parcels/index.rst). If you want to learn by doing, check out the [quickstart tutorial](./tutorial_quickstart.md) to start creating your first Parcels simulation. +We discuss each component in more detail below. The subsections titled **"Learn how to"** link to more detailed [how-to guide notebooks](../user_guide/index.md) and more detailed _explanations_ of Parcels functionality are included under **"Read more about"** subsections. The full list of classes and methods is in the [API reference](../reference.md). If you want to learn by doing, check out the [quickstart tutorial](./tutorial_quickstart.md) to start creating your first Parcels simulation. 
```{figure} ../_static/concepts_diagram.png :alt: Parcels concepts diagram @@ -190,6 +190,6 @@ There are many ways to analyze particle output, and although we provide [a short ```{admonition} 🖥️ Learn how to run a simulation :class: seealso -- [Choose an appropriate timestep and integrator](../user_guide/examples/tutorial_dt_integrators.ipynb) +- [Choose an appropriate timestep and integrator](../user_guide/examples/tutorial_numerical_accuracy.ipynb) - [Work with Parcels output](./tutorial_output.ipynb) ``` diff --git a/docs/getting_started/installation.md b/docs/getting_started/installation.md index 62ca73c51..ab07152e6 100644 --- a/docs/getting_started/installation.md +++ b/docs/getting_started/installation.md @@ -33,7 +33,7 @@ conda activate parcels The next time you start a terminal and want to work with Parcels, activate the environment with `conda activate parcels`. ``` -**Step 4:** Create a Jupyter Notebook or Python script to set up your first Parcels simulation! The [quickstart tutorial](tutorial_quickstart.md) is a great way to get started immediately. You can also first read about the core [Parcels concepts](./explanation_concepts.md) to familiarize yourself with the classes and methods you will use. +**Step 4:** Create a Jupyter Notebook or Python script to set up your first Parcels simulation! The [quickstart tutorial](tutorial_quickstart.md) is a great way to get started immediately. You can also first read about the core [Parcels concepts](concepts_overview.md) to familiarize yourself with the classes and methods you will use. ## Installation for developers diff --git a/docs/getting_started/tutorial_quickstart.md b/docs/getting_started/tutorial_quickstart.md index 038302c22..9635b9607 100644 --- a/docs/getting_started/tutorial_quickstart.md +++ b/docs/getting_started/tutorial_quickstart.md @@ -9,7 +9,7 @@ kernelspec: Welcome to the **Parcels** quickstart tutorial, in which we will go through all the necessary steps to run a simulation. The code in this notebook can be used as a starting point to run Parcels in your own environment. Along the way we will familiarize ourselves with some specific classes and methods. If you are ever confused about one of these and want to -read more, we have a [concepts overview](./explanation_concepts.md) discussing them in more detail. Let's dive in! +read more, we have a [concepts overview](concepts_overview.md) discussing them in more detail. Let's dive in! ## Imports diff --git a/docs/user_guide/examples/tutorial_dt_integrators.ipynb b/docs/user_guide/examples/tutorial_dt_integrators.ipynb deleted file mode 100644 index 828ade76b..000000000 --- a/docs/user_guide/examples/tutorial_dt_integrators.ipynb +++ /dev/null @@ -1,844 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "0", - "metadata": {}, - "source": [ - "# 🖥️ Choosing a timestep and integration method\n", - "\n", - "In Parcels, we can simulate virtual particles through time. Kernels which compute a process as a function of time, such as advection, can do so by numerically integrating the process over each timestep `dt`. 
In this guide we explore how to control the numerical accuracy of a Parcels simulation by choosing a timestep `dt` and an integration method.\n", - "\n", - "```{note}\n", - "This notebook is based on the [workshop on writing custom Kernels](https://github.com/Parcels-code/10year-anniversary-session2/blob/main/advection_and_windage.ipynb) from the 10-year Parcels event.\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "1", - "metadata": {}, - "source": [ - "## Timestep\n", - "\n", - "The choice of timestep affects the accuracy and duration of our simulation. The required accuracy of the integration ultimately depends on the desired application of the particle trajectories. In general, we want our timestep to be small enough to smoothly resolve the time-evolving process of interest, and not so small that our simulations take forever. Here, we will first look at how to estimate a priori what an appropriate timestep could be, and then at how to test a Parcels simulation to check the accuracy." - ] - }, - { - "cell_type": "markdown", - "id": "2", - "metadata": {}, - "source": [ - "### A priori estimation of `dt`\n", - "\n", - "In this example, we will estimate the appropriate timestep to compute advection of a virtual particle through an ocean current field. \n", - "\n", - "Advection, the change in position $\\mathbf{x}(t) = (lon(t), lat(t))$ at time $t$, can be described by the equation:\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "\\frac{\\text{d}\\mathbf{x}(t)}{\\text{d}t} = \\mathbf{v}(\\mathbf{x}(t),t),\n", - "\\end{aligned}\n", - "$$\n", - "\n", - "where $\\mathbf{v}(\\mathbf{x},t) = (u(\\mathbf{x},t), v(\\mathbf{x},t))$ describes the ocean velocity field at position $\\mathbf{x}$ at time $t$.\n", - "\n", - "To estimate the timescale that we want to resolve, we can think about the scales at which advection varies. Here we use the daily velocity fields at 1/12th degree horizontal resolution from the Copernicus Marine Service. This means that the velocity will vary in time at scales >= 24 hours, and in space at scales >= 1/12th degree." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3", - "metadata": {}, - "outputs": [], - "source": [ - "import matplotlib.pyplot as plt\n", - "import numpy as np\n", - "import pandas as pd\n", - "import xarray as xr\n", - "\n", - "import parcels\n", - "\n", - "# Load the CopernicusMarine data in the Agulhas region from the example_datasets\n", - "example_dataset_folder = parcels.download_example_dataset(\n", - " \"CopernicusMarine_data_for_Argo_tutorial\"\n", - ")\n", - "\n", - "ds_fields = xr.open_mfdataset(f\"{example_dataset_folder}/*.nc\", combine=\"by_coords\")\n", - "ds_fields.load() # load the dataset into memory\n", - "\n", - "fieldset = parcels.FieldSet.from_copernicusmarine(ds_fields)\n", - "\n", - "# Check field resolution in time and space\n", - "print(\n", - " f\"dt = {np.unique(np.diff(ds_fields.time.values).astype('timedelta64[h]'))} hours\"\n", - ")\n", - "print(f\"dlat = {np.unique(np.diff(ds_fields.latitude.values))} degrees\")\n", - "print(f\"dlon = {np.unique(np.diff(ds_fields.longitude.values))} degrees\")" - ] - }, - { - "cell_type": "markdown", - "id": "4", - "metadata": {}, - "source": [ - "In our Parcels simulation, we must ensure that:\n", - "- `dt` < 24 hours\n", - "- `particles.dlon`/`particles.dlat` < 1/12th degree. 
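Before deriving this formally, the sketch below shows how the two conditions combine into a single upper bound on `dt`. This is illustrative code, not part of the original notebook; the helper name `estimate_dt_bound` is made up, and the speed value anticipates the maximum we compute below.

```python
def estimate_dt_bound(u_max_deg_per_s, dx_deg, field_dt_s):
    """Upper bound on dt (in seconds) implied by the two conditions above."""
    dt_space = dx_deg / u_max_deg_per_s  # keep the displacement below one grid cell
    return min(field_dt_s, dt_space)


# with the maximum speed derived below (~1.71e-5 degrees s-1),
# 1/12 degree grid spacing, and daily (86400 s) velocity fields:
print(estimate_dt_bound(1.71e-5, 1 / 12, 86400))  # ~4.9e3 seconds
```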
\n", - "\n", - "We can rewrite the second condition using the scale relation:\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "\\text{d}lon \\propto U \\text{d}t,\n", - "\\end{aligned}\n", - "$$\n", - "\n", - "to express the condition as a function of `dt`:\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "U \\text{d}t < 1/12 \\text{ (degrees lon)}\n", - "\\end{aligned}\n", - "$$\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "\\text{d}t < \\frac{1}{12 U}\n", - "\\end{aligned}\n", - "$$\n", - "\n", - "We can use the maximum velocity (in degrees s-1) in our flow field as a conservative estimate for $U$:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5", - "metadata": {}, - "outputs": [], - "source": [ - "# Use the maximum velocity at the surface to scale dlon/dlat\n", - "U_max_surface = np.nanmax(np.hypot(ds_fields.uo, ds_fields.vo))\n", - "print(f\"U_max = {str(np.round(U_max_surface, 2))} m s-1\")\n", - "\n", - "# convert to degrees s-1 (at lat = 30 deg S, lon = 31 deg E)\n", - "U_max_surface_deg = parcels.GeographicPolar().to_target(U_max_surface, 0, -30, 31)\n", - "print(f\" == {str(np.round(U_max_surface_deg * 1e5, 2))}e-5 degrees s-1\")" - ] - }, - { - "cell_type": "markdown", - "id": "6", - "metadata": {}, - "source": [ - "```{admonition} 🖥️ Spherical grids and unit converters\n", - ":class: seealso\n", - "Our displacement occurs in units of longitude and latitde, but our velocity field is in m/s. Read the [UnitConversion guide](./tutorial_unitconverters.ipynb) to see how Parcels uses `parcels.GeographicPolar()` under the hood to convert from m s-1 to degrees s-1.\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "7", - "metadata": {}, - "source": [ - "$$\n", - "\\begin{aligned}\n", - "\\text{d}t < \\frac{1}{12 U_{max}}\n", - "\\end{aligned}\n", - "$$\n", - "\n", - "Using `U_max_surface_deg`, we find a second estimated limit of an appropriate `dt`:\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "\\text{d}t < \\frac{1}{12 * 1.71e-5} = 4.9e3 \\text{ seconds}\n", - "\\end{aligned}\n", - "$$\n", - "\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "\\text{d}t < 2 \\text{ hours}\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "8", - "metadata": {}, - "source": [ - "### Test timesteps\n", - "\n", - "Now that we have an estimate of an appropriate choice for `dt`, we can test a range of values in Parcels to understand the impact on the accuracy of our trajectories. To do so, we can run simulations using `parcels.kernels.AdvectionRK2` for a limited number of particles and quantify the differences." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9", - "metadata": {}, - "outputs": [], - "source": [ - "# Particle locations and initial time\n", - "npart = 5 # number of particles to be released\n", - "initial_release_lons = 32 * np.ones(npart)\n", - "initial_release_lats = np.linspace(-32.5, -32, npart, dtype=np.float32)\n", - "initial_release_times = np.repeat(\n", - " ds_fields.time.values[0], npart\n", - ") # release all particles at the start time of the fieldset\n", - "initial_release_zs = np.repeat(ds_fields.depth.values[0], npart)" - ] - }, - { - "cell_type": "markdown", - "id": "10", - "metadata": {}, - "source": [ - "We simulate particles using a range of timesteps that differ by a factor 2-10, starting at (dt < 24 hours). 
We also keep track of the time it takes to run each simulation:\n", - "\n", - "```{warning}\n", - "`dt` must be chosen such that an integer number of timesteps fits exactly into the FieldSet dt (24 hours); that is, $k \\, \\text{d}t = 24 \\text{ hours}$ for some integer $k$. \n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "11", - "metadata": {}, - "outputs": [], - "source": [ - "import time\n", - "\n", - "runtime = np.timedelta64(7, \"D\")  # Total simulation time\n", - "\n", - "dt_choices = [\n", - "    np.timedelta64(12, \"h\"),\n", - "    np.timedelta64(6, \"h\"),\n", - "    np.timedelta64(1, \"h\"),\n", - "    np.timedelta64(20, \"m\"),\n", - "    np.timedelta64(5, \"m\"),\n", - "]\n", - "\n", - "sim_duration = np.zeros(len(dt_choices))\n", - "for i, dt in enumerate(dt_choices):\n", - "    pset = parcels.ParticleSet(\n", - "        fieldset=fieldset,\n", - "        pclass=parcels.Particle,\n", - "        time=initial_release_times,\n", - "        z=initial_release_zs,\n", - "        lat=initial_release_lats,\n", - "        lon=initial_release_lons,\n", - "    )\n", - "    outputdt = dt\n", - "    chunks = int(\n", - "        runtime / outputdt / 2\n", - "    )  # Because we will store a lot of positions, we chunk the output datafile to speed up the simulation\n", - "\n", - "    pfile = parcels.ParticleFile(\n", - "        store=f\"output/AdvectionRK2_dt_{int(dt / np.timedelta64(1, 's'))}s.zarr\",\n", - "        outputdt=outputdt,\n", - "        chunks=(len(pset), chunks),\n", - "    )\n", - "\n", - "    print(f\"Begin simulation for dt = {int(dt / np.timedelta64(1, 's'))} s\")\n", - "    # time and run simulation\n", - "    start_time = time.time()\n", - "    pset.execute(\n", - "        parcels.kernels.AdvectionRK2,\n", - "        runtime=runtime,\n", - "        dt=dt,\n", - "        output_file=pfile,\n", - "        verbose_progress=False,\n", - "    )\n", - "    sim_duration_i = time.time() - start_time\n", - "    sim_duration[i] = sim_duration_i\n", - "    print(\n", - "        f\"Simulation duration (dt = {int(dt / np.timedelta64(1, 's'))} s) = {np.round(sim_duration_i, 2)} seconds\"\n", - "    )" - ] - }, - { - "cell_type": "markdown", - "id": "12", - "metadata": {}, - "source": [ - "Let's start by comparing the trajectories of the different simulations:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "13", - "metadata": {}, - "outputs": [], - "source": [ - "# To compare the results visually, we will need distinct colour maps for each timestep\n", - "dt_colours = np.linspace(0, 1, len(dt_choices), endpoint=True)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "14", - "metadata": {}, - "outputs": [], - "source": [ - "fig = plt.figure()\n", - "ax = plt.axes()\n", - "temperature = ds_fields.isel(time=0, depth=0).thetao.plot(cmap=\"Greys\")\n", - "for j, dt in enumerate(dt_choices):\n", - "    ds = xr.open_zarr(\n", - "        f\"output/AdvectionRK2_dt_{int(dt / np.timedelta64(1, 's'))}s.zarr\"\n", - "    )\n", - "    labels = [f\"dt = {str(dt)}\"] + [None] * (ds.lon.shape[0] - 1)\n", - "    ax.plot(\n", - "        ds.lon.T,\n", - "        ds.lat.T,\n", - "        alpha=0.75,\n", - "        color=plt.cm.viridis(dt_colours[j]),\n", - "        label=labels,\n", - "    )\n", - "ax.scatter(ds.lon[:, 0], ds.lat[:, 0], c=\"r\", marker=\"s\", label=\"starting locations\")\n", - "ax.legend()\n", - "ax.set_ylim(-32.7, -31.3)\n", - "ax.set_xlim(31, 32.4)\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "15", - "metadata": {}, - "source": [ - "We see that as the timestep becomes smaller, the trajectories start to converge. As expected from our horizontal grid scaling, the two simulations with `dt` above our estimated ~1.4-hour bound (12 h and 6 h) produce trajectories that end up in different grid cells. 
The other three simulations (`dt` = (1 h, 20 min, 5 min)) are more difficult to separate visually. \n", - "\n", - "To quantify the precision of these simulations, we can compute the separation distance between the trajectories with the same starting location. Since we have no exact solution to these trajectories, we use our highest resolution simulation as a benchmark. The separation distance is a common metric to analyse particle trajectories, as described in [the Lagrangian Diagnostics project](https://lagrangian-diags.readthedocs.io/en/latest/tutorials/absolute_distance_method01.html)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "16", - "metadata": {}, - "outputs": [], - "source": [ - "def dist_km(lona, lonb, lata, latb):\n", - " \"\"\"\n", - " Function to calculate the distance between 2 points in km\n", - " Haversine formula used, which assumes the Earth is a sphere.\n", - " source: https://stackoverflow.com/questions/19412462/getting-distance-between-two-points-based-on-latitude-longitude\n", - " \"\"\"\n", - "\n", - " R = 6371.0 # approximate radius of earth in km\n", - "\n", - " lat1 = np.radians(lata)\n", - " lon1 = np.radians(lona)\n", - " lat2 = np.radians(latb)\n", - " lon2 = np.radians(lonb)\n", - "\n", - " dlon = lon2 - lon1\n", - " dlat = lat2 - lat1\n", - "\n", - " a = np.square(np.sin(dlat / 2)) + np.cos(lat1) * np.cos(lat2) * np.square(\n", - " np.sin(dlon / 2)\n", - " )\n", - " c = 2 * np.atan2(np.sqrt(a), np.sqrt(1 - a))\n", - "\n", - " distance = R * c\n", - "\n", - " return distance" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "17", - "metadata": {}, - "outputs": [], - "source": [ - "dist_end = np.zeros((len(dt_choices) - 1, npart))\n", - "\n", - "fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(10, 5))\n", - "axs[0].set_title(\"Separation distance from dt = 5-min trajectory\")\n", - "axs[0].set_xlabel(\"Time\")\n", - "axs[0].tick_params(\"x\", rotation=45)\n", - "axs[0].set_yscale(\"log\")\n", - "axs[0].set_ylim(1e-2, 1e2)\n", - "\n", - "axs[1].set_title(\"Linear y-scale\")\n", - "axs[1].set_xlabel(\"Time\")\n", - "axs[1].tick_params(\"x\", rotation=45)\n", - "axs[1].set_ylim(0, 50)\n", - "\n", - "# set 5 minute dt as benchmark\n", - "ds_5min = xr.open_zarr(f\"output/AdvectionRK2_dt_300s.zarr\")\n", - "for i, dt in enumerate(dt_choices[:-1]):\n", - " ds = xr.open_zarr(\n", - " f\"output/AdvectionRK2_dt_{int(dt / np.timedelta64(1, 's'))}s.zarr\"\n", - " )\n", - " labels = [f\"dt = {str(dt)}\"] + [None] * (ds.lon.shape[0] - 1)\n", - "\n", - " # subset 5 minute data to match dt\n", - " lon_5min_sub = ds_5min.lon.where(\n", - " ds_5min.time.isin(ds.time.values).compute(), drop=True\n", - " ).values\n", - " lat_5min_sub = ds_5min.lat.where(\n", - " ds_5min.time.isin(ds.time.values).compute(), drop=True\n", - " ).values\n", - "\n", - " # remove nans\n", - " lon_valid = ds.lon.where(~np.isnan(ds.lon).compute(), drop=True).values\n", - " lat_valid = ds.lat.where(~np.isnan(ds.lat).compute(), drop=True).values\n", - "\n", - " # compute separation distance between each particle in km\n", - " dist = dist_km(lon_valid, lon_5min_sub, lat_valid, lat_5min_sub)\n", - "\n", - " # plot\n", - " time_valid = ds.time.where(~np.isnan(ds.time).compute(), drop=True)\n", - " axs[0].plot(\n", - " time_valid.T,\n", - " dist.T,\n", - " alpha=0.75,\n", - " color=plt.cm.viridis(dt_colours[i]),\n", - " label=labels,\n", - " )\n", - " axs[1].plot(\n", - " time_valid.T,\n", - " dist.T,\n", - " alpha=0.75,\n", - " 
color=plt.cm.viridis(dt_colours[i]),\n", - "        label=labels,\n", - "    )\n", - "    dist_end[i] = dist[:, -1]\n", - "axs[0].legend()\n", - "axs[0].set_ylabel(\"Distance (km)\")\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "18", - "metadata": {}, - "source": [ - "```{warning}\n", - "As you can see, the separation distance starts to increase rapidly with time. Small errors can grow, and the accuracy of the integration therefore depends strongly on the runtime and flow conditions (see [the discussion on flow conditions](#flow-conditions) at the end of this guide).\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "19", - "metadata": {}, - "outputs": [], - "source": [ - "fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12, 5))\n", - "for i, dt in enumerate(dt_choices[:-1]):\n", - "    ax[0].scatter(\n", - "        (dt / np.timedelta64(1, \"m\")).astype(int),\n", - "        np.mean(dist_end[i]),\n", - "        color=plt.cm.viridis(dt_colours[i]),\n", - "        label=f\"dt = {str(dt)}\",\n", - "    )\n", - "ax[0].plot(\n", - "    (dt_choices[:-1] / np.timedelta64(1, \"m\")).astype(int), np.mean(dist_end, axis=1)\n", - ")\n", - "\n", - "ax[0].set_ylabel(\"Mean separation distance (km)\")\n", - "ax[0].set_xlabel(\"dt (minutes)\")\n", - "ax[0].grid()\n", - "ax[1].plot(dt_choices, sim_duration)\n", - "for i, dt in enumerate(dt_choices):\n", - "    ax[1].scatter(\n", - "        (dt / np.timedelta64(1, \"m\")).astype(int),\n", - "        sim_duration[i],\n", - "        color=plt.cm.viridis(dt_colours[i]),\n", - "        label=f\"dt = {str(dt)}\",\n", - "    )\n", - "ax[1].set_ylabel(\"Simulation Duration (s)\")\n", - "ax[1].set_xlabel(\"dt (minutes)\")\n", - "ax[1].grid()\n", - "ax[1].legend()\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "20", - "metadata": {}, - "source": [ - "We can see that in our simulation, advecting particles for 7 days, the effect of `dt` on the precision is approximately linear. The precision of a simulation with a timestep of 20 minutes is of the order of ~100 m. The effect on the time it takes to run a simulation, however, is not linear in our case; it increases sharply as we decrease the timestep. This may be optimized using more efficient chunking." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "21", - "metadata": {}, - "outputs": [], - "source": [ - "df = pd.DataFrame(\n", - "    {\n", - "        \"dt\": dt_choices,\n", - "        \"Mean separation distance (km)\": np.append(\n", - "            np.round(np.mean(dist_end, axis=1), 2), [None]\n", - "        ),\n", - "        \"Simulation duration (s)\": np.round(sim_duration, 2),\n", - "    }\n", - ")\n", - "df" - ] - }, - { - "cell_type": "markdown", - "id": "22", - "metadata": {}, - "source": [ - "## Integration schemes\n", - "\n", - "In addition to our choice of timestep, the method we use to integrate the process of interest can have an effect on the accuracy of our calculations. Parcels comes with a number of built-in advection Kernels. These kernels use different integration methods to compute the change in position from one timestep to the next. The higher-order [Runge-Kutta (RK) methods](https://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods) are generally more accurate than the simple Explicit Euler (EE)."
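To make the difference between the schemes concrete before inspecting the Parcels kernels, here is a small self-contained sketch on a toy solid-body rotation (this is not Parcels code, and the RK2 shown is the generic midpoint variant, which is not necessarily the exact formulation Parcels uses):

```python
import numpy as np


def velocity(x, t):
    """Toy velocity field: solid-body rotation, so the exact radius never changes."""
    return np.array([-x[1], x[0]])


def euler_step(x, t, dt):
    return x + dt * velocity(x, t)


def rk2_step(x, t, dt):
    # midpoint variant: sample the velocity halfway along a trial Euler step
    x_mid = x + 0.5 * dt * velocity(x, t)
    return x + dt * velocity(x_mid, t + 0.5 * dt)


x_ee = np.array([1.0, 0.0])
x_rk2 = x_ee.copy()
dt, nsteps = 0.1, 63  # roughly one rotation period (2 pi)
for n in range(nsteps):
    x_ee = euler_step(x_ee, n * dt, dt)
    x_rk2 = rk2_step(x_rk2, n * dt, dt)

# the exact trajectory stays on the unit circle; Euler drifts outward much faster
print(np.linalg.norm(x_ee), np.linalg.norm(x_rk2))  # ~1.37 vs ~1.00
```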
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "23", - "metadata": {}, - "outputs": [], - "source": [ - "import inspect\n", - "\n", - "# To get a sense of what the built-in Euler-forward scheme looks like, let's look at the source code!\n", - "print(inspect.getsource(parcels.kernels.AdvectionEE))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "24", - "metadata": {}, - "outputs": [], - "source": [ - "# And the second-order Runge-Kutta (RK2) method\n", - "print(inspect.getsource(parcels.kernels.AdvectionRK2))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "25", - "metadata": {}, - "outputs": [], - "source": [ - "# And RK4:\n", - "print(inspect.getsource(parcels.kernels.AdvectionRK4))" - ] - }, - { - "cell_type": "markdown", - "id": "26", - "metadata": {}, - "source": [ - "The higher-order methods use weighted intermediate steps in time and space to obtain a more accurate estimate of `dlat` and `dlon` for a given timestep.\n", - "\n", - "Now let's see how the different advection schemes affect our simulation. As before, there is no exact solution to the simulated trajectories, so we will use the highest order method as the benchmark." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "27", - "metadata": {}, - "outputs": [], - "source": [ - "advection_schemes = [\n", - " parcels.kernels.AdvectionEE,\n", - " parcels.kernels.AdvectionRK2,\n", - " parcels.kernels.AdvectionRK4,\n", - "]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "28", - "metadata": { - "tags": [ - "hide-output" - ] - }, - "outputs": [], - "source": [ - "sim_duration = np.zeros((len(advection_schemes), len(dt_choices)))\n", - "\n", - "for i, advection_scheme in enumerate(advection_schemes):\n", - " for j, dt in enumerate(dt_choices):\n", - " pset = parcels.ParticleSet(\n", - " fieldset=fieldset,\n", - " pclass=parcels.Particle,\n", - " time=initial_release_times,\n", - " z=initial_release_zs,\n", - " lat=initial_release_lats,\n", - " lon=initial_release_lons,\n", - " )\n", - " outputdt = dt\n", - " chunks = int(\n", - " runtime / outputdt / 2\n", - " ) # Because we will store a lot of positions, to speed up our simulation we need to chunk the output datafile\n", - "\n", - " pfile = parcels.ParticleFile(\n", - " store=f\"output/{advection_scheme.__name__}_dt_{int(dt / np.timedelta64(1, 's'))}s.zarr\",\n", - " outputdt=outputdt,\n", - " chunks=(len(pset), chunks),\n", - " )\n", - "\n", - " print(\n", - " f\"Begin simulation for (scheme: {advection_scheme.__name__}, dt={int(dt / np.timedelta64(1, 's'))} s)\"\n", - " )\n", - " start_time = time.time()\n", - " pset.execute(\n", - " advection_scheme,\n", - " runtime=runtime,\n", - " dt=dt,\n", - " output_file=pfile,\n", - " verbose_progress=False,\n", - " )\n", - " sim_duration_ij = time.time() - start_time\n", - " sim_duration[i, j] = sim_duration_ij\n", - " print(\n", - " f\"Simulation duration (dt = {int(dt / np.timedelta64(1, 's'))} s) = {np.round(sim_duration_ij, 2)} seconds\"\n", - " )" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "29", - "metadata": {}, - "outputs": [], - "source": [ - "scheme_colours = np.linspace(0, 1, len(advection_schemes), endpoint=True)\n", - "# Now let's compare different advection schemes with the same timestep\n", - "fig, axs = plt.subplots(nrows=2, ncols=3, figsize=(15, 10))\n", - "for i, dt in enumerate(dt_choices):\n", - " m = i // 3\n", - " n = i % 3\n", - " axs[m, n].set_title(f\"dt = {str(dt)}\")\n", - 
" axs[m, n].set_xlabel(\"Longitude\")\n", - " for j, advection_scheme in enumerate(advection_schemes):\n", - " ds = xr.open_zarr(\n", - " f\"output/{advection_scheme.__name__}_dt_{int(dt / np.timedelta64(1, 's'))}s.zarr\"\n", - " )\n", - " labels = [f\"{advection_scheme.__name__}\"] + [None] * (ds.lon.shape[0] - 1)\n", - " axs[m, n].plot(\n", - " ds.lon.T,\n", - " ds.lat.T,\n", - " alpha=0.75,\n", - " color=plt.cm.viridis(scheme_colours[j]),\n", - " label=labels,\n", - " )\n", - " axs[m, n].scatter(\n", - " ds.lon[:, 0], ds.lat[:, 0], c=\"r\", marker=\"s\", label=\"starting locations\"\n", - " )\n", - " axs[m, n].grid()\n", - "axs[-1, -1].axis(\"off\")\n", - "axs[0, 0].legend()\n", - "axs[0, 0].set_ylabel(\"Latitude\")\n", - "axs[1, 0].set_ylabel(\"Latitude\")\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "30", - "metadata": {}, - "source": [ - "Clearly, for longer timesteps, the RK2 and RK4 schemes perform better. However, if the timestep is appropriate, as we have determined in the previous section, then the Explicit Euler scheme does not perform notably different." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "31", - "metadata": {}, - "outputs": [], - "source": [ - "dist_end = np.zeros((len(advection_schemes) - 1, len(dt_choices), npart))\n", - "\n", - "fig, axs = plt.subplots(nrows=2, ncols=3, figsize=(15, 10))\n", - "\n", - "for i, dt in enumerate(dt_choices):\n", - " m = i // 3\n", - " n = i % 3\n", - " axs[m, n].set_title(f\"dt = {str(dt)}\")\n", - " axs[m, n].set_xlabel(\"Time\")\n", - " axs[m, n].tick_params(\"x\", rotation=45)\n", - " axs[m, n].set_yscale(\"log\")\n", - " axs[m, n].set_ylim(1e-4, 1e1)\n", - " ds_RK4 = xr.open_zarr(\n", - " f\"output/AdvectionRK4_dt_{int(dt / np.timedelta64(1, 's'))}s.zarr\"\n", - " )\n", - " for j, advection_scheme in enumerate(advection_schemes[:-1]):\n", - " ds = xr.open_zarr(\n", - " f\"output/{advection_scheme.__name__}_dt_{int(dt / np.timedelta64(1, 's'))}s.zarr\"\n", - " )\n", - " labels = [f\"|{advection_scheme.__name__} - AdvectionRK4|\"] + [None] * (\n", - " ds.lon.shape[0] - 1\n", - " )\n", - "\n", - " # remove nans\n", - " lon_valid_RK4 = ds_RK4.lon.where(\n", - " ~np.isnan(ds_RK4.lon).compute(), drop=True\n", - " ).values\n", - " lat_valid_RK4 = ds_RK4.lat.where(\n", - " ~np.isnan(ds_RK4.lat).compute(), drop=True\n", - " ).values\n", - " lon_valid = ds.lon.where(~np.isnan(ds.lon).compute(), drop=True).values\n", - " lat_valid = ds.lat.where(~np.isnan(ds.lat).compute(), drop=True).values\n", - " dist = dist_km(lon_valid, lon_valid_RK4, lat_valid, lat_valid_RK4)\n", - " time_valid = ds.time.where(~np.isnan(ds.time).compute(), drop=True).values\n", - " axs[m, n].plot(\n", - " time_valid.T,\n", - " dist.T,\n", - " alpha=0.75,\n", - " color=plt.cm.viridis(scheme_colours[j]),\n", - " label=labels,\n", - " )\n", - " dist_end[j, i] = dist[:, -1]\n", - " axs[m, n].grid()\n", - "axs[-1, -1].axis(\"off\")\n", - "axs[0, 0].legend()\n", - "axs[0, 0].set_ylabel(\"Latitude\")\n", - "axs[1, 0].set_ylabel(\"Latitude\")\n", - "plt.tight_layout()\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "32", - "metadata": {}, - "source": [ - "By quantifying the precision of the integration methods, we can see that for a given timestep the Runge-Kutta methods perform orders of magnitude better than the Explicit Euler method. In this example, the error associated with the selected integration methods is smaller than that of the range of timesteps." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "33", - "metadata": {}, - "outputs": [], - "source": [ - "fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(12, 5))\n", - "axs[0].plot(\n", - "    (dt_choices / np.timedelta64(1, \"m\")).astype(int),\n", - "    np.nanmean(dist_end[0], axis=1),\n", - "    \"-o\",\n", - "    color=plt.cm.viridis(scheme_colours[0]),\n", - "    label=\"AdvectionEE - AdvectionRK4\",\n", - ")\n", - "axs[0].plot(\n", - "    (dt_choices / np.timedelta64(1, \"m\")).astype(int),\n", - "    np.nanmean(dist_end[1], axis=1),\n", - "    \"-o\",\n", - "    color=plt.cm.viridis(scheme_colours[1]),\n", - "    label=\"AdvectionRK2 - AdvectionRK4\",\n", - ")\n", - "axs[0].set_ylabel(\"Mean separation distance (km)\")\n", - "axs[0].set_yscale(\"log\")\n", - "axs[0].set_xlabel(\"dt (minutes)\")\n", - "axs[0].legend()\n", - "axs[0].grid()\n", - "for i, advection_scheme in enumerate(advection_schemes):\n", - "    axs[1].plot(\n", - "        (dt_choices / np.timedelta64(1, \"m\")).astype(int),\n", - "        sim_duration[i],\n", - "        \"-o\",\n", - "        color=plt.cm.viridis(scheme_colours[i]),\n", - "        label=advection_scheme.__name__,\n", - "    )\n", - "axs[1].set_ylabel(\"Simulation Duration (s)\")\n", - "axs[1].set_yscale(\"log\")\n", - "axs[1].set_xlabel(\"dt (minutes)\")\n", - "axs[1].grid()\n", - "axs[1].legend()\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "34", - "metadata": {}, - "source": [ - "In this last figure, we see that the improvement of RK2 by orders of magnitude with respect to EE comes at a small computational cost. Since the RK2 and RK4 methods are practically indistinguishable, using the RK2 method with a timestep of 1 hour or 20 minutes, depending on the application, would be an appropriate choice for this simulation." - ] - }, - { - "cell_type": "markdown", - "id": "35", - "metadata": {}, - "source": [ - "### Flow conditions\n", - "The limited numerical accuracy of the integration introduces a positional error of a certain order of magnitude ($\\delta$), as seen in the figures above. After this initial deviation, the time evolution of the separation distance is a feature of the specific flow field. As with the [Lorenz equations](https://en.wikipedia.org/wiki/Lorenz_system), two initially close particles often separate over time. The rate of separation from a finite perturbation ($\\delta$) over a given time period can be characterized by the _Finite Size Lyapunov Exponent_.\n", - "\n", - "```{admonition} Finite Size Lyapunov Exponent references \n", - ":class: tip\n", - "- [Cencini and Vulpiani, 2013, _J. Phys. A: Math. Theor._](https://doi.org/10.1088/1751-8113/46/25/254019)\n", - "- [Meunier and LaCasce, 2021, _Fluids_](https://doi.org/10.3390/fluids6100348)\n", - "- [Lagrangian Diagnostics example](https://lagrangian-diags.readthedocs.io/en/latest/tutorials/FTLE_method.html) to calculate the related Finite *Time* Lyapunov Exponent\n", - "```\n", - "\n", - "We can see that the separation rate in the more precise simulations does not depend mainly on the initial perturbation, but follows a rather specific time evolution, as the particles enter different flow conditions. 
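For readers who want to compute this themselves, below is a minimal sketch of an FSLE estimate for a single particle pair (this is not Parcels code; see the references above for the method), using $\lambda(\delta) = \ln(r)/\tau(\delta)$, where $\tau(\delta)$ is the time for the separation to grow from $\delta$ to $r\delta$:

```python
import numpy as np


def fsle(times, separations, delta0, r=np.sqrt(2), nscales=8):
    """Finite Size Lyapunov Exponent for one particle pair.

    times, separations : 1D arrays of output times (s) and pair distances
    delta0             : smallest separation scale to consider
    r                  : growth factor between scales (sqrt(2) is a common choice)
    """
    thresholds = delta0 * r ** np.arange(nscales + 1)
    lambdas = []
    for d_lo, d_hi in zip(thresholds[:-1], thresholds[1:]):
        crossed_lo = np.flatnonzero(separations >= d_lo)
        crossed_hi = np.flatnonzero(separations >= d_hi)
        if crossed_lo.size and crossed_hi.size:
            tau = times[crossed_hi[0]] - times[crossed_lo[0]]  # first-crossing times
            if tau > 0:
                lambdas.append(np.log(r) / tau)
    return np.array(lambdas)
```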
The final accuracy of your simulation may therefore strongly depend on your flow conditions and runtime.\n", - "\n", - "```{admonition} 🖥️ Lorenz equations in Parcels kernels\n", - ":class: seealso\n", - "Check out [this notebook](https://github.com/Parcels-code/10year-anniversary-session2/blob/main/solutions/lorenz_and_lotka_volterra_solutions.ipynb) to learn how to model the Lorenz equations with Parcels!\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "36", - "metadata": {}, - "source": [ - "## Summary\n", - "If you want to test the accuracy and efficiency of your own simulation, here is a brief list of things to consider:\n", - "- **Use the temporal and spatial resolution of the input to estimate the timescales you need to resolve.** An integer number of timesteps `dt` must fit exactly into the FieldSet dt.\n", - "- **Run the simulation for the full runtime.** Many errors grow over time, so to quantify them we must take into account the full length of the simulation.\n", - "- **Consider which step introduces the largest error.** If we compute advection and diffusion, increasing the accuracy of one process much more than the other will not improve the results." - ] - }, - { - "cell_type": "markdown", - "id": "37", - "metadata": {}, - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "test-notebooks", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.13.9" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/docs/user_guide/examples/tutorial_numerical_accuracy.ipynb b/docs/user_guide/examples/tutorial_numerical_accuracy.ipynb new file mode 100644 index 000000000..3e801bdc1 --- /dev/null +++ b/docs/user_guide/examples/tutorial_numerical_accuracy.ipynb @@ -0,0 +1,23 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "0", + "metadata": {}, + "source": [ + "# 🖥️ Choosing an appropriate timestep and integrator \n", + "\n", + "```{note}\n", + "TODO: write notebook describing how to use `dt` and integration methods in Kernels to control numerical accuracy. Use [Michael's tutorial](https://github.com/Parcels-code/10year-anniversary-session2/blob/main/advection_and_windage.ipynb) as a starting point.\n", + "```" + ] + } + ], + "metadata": { + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/user_guide/index.md b/docs/user_guide/index.md index 8bbb41651..a46289455 100644 --- a/docs/user_guide/index.md +++ b/docs/user_guide/index.md @@ -1,6 +1,6 @@ # User guide -The core of our user guide is a series of Jupyter notebooks which document how to implement specific Lagrangian simulations with the flexibility of **Parcels**. Before diving into these advanced _how-to_ guides (🖥️), we suggest users get started by reading the explanation (📖) of the core concepts and trying the tutorials (🎓). For a description of the specific classes and functions, check out the [API reference](../reference/parcels/index.md). To discover other community resources, check out our [Community](../community/index.md) page. +The core of our user guide is a series of Jupyter notebooks which document how to implement specific Lagrangian simulations with the flexibility of **Parcels**. 
Before diving into these advanced _how-to_ guides (🖥️), we suggest users get started by reading the explanation (📖) of the core concepts and trying the tutorials (🎓). For a description of the specific classes and functions, check out the [API reference](../reference.md). To discover other community resources, check out our [Community](../community/index.md) page. ```{note} The tutorials written for Parcels v3 are currently being updated for Parcels v4. Shown below are only the notebooks which have been updated. @@ -72,12 +72,10 @@ examples/tutorial_interpolation.ipynb ```{toctree} :caption: Run a simulation :name: tutorial-execute -:titlesonly: -examples/tutorial_dt_integrators.ipynb +examples/tutorial_numerical_accuracy.ipynb ``` - @@ -87,6 +85,7 @@ examples/tutorial_dt_integrators.ipynb ``` --> + diff --git a/src/parcels/_core/converters.py b/src/parcels/_core/converters.py index dc28c4a0f..a3ecd965e 100644 --- a/src/parcels/_core/converters.py +++ b/src/parcels/_core/converters.py @@ -46,10 +46,10 @@ class Unity(UnitConverter): source_unit: None target_unit: None - def to_target(self, value, z=None, y=None, x=None): + def to_target(self, value, z, y, x): return value - def to_source(self, value, z=None, y=None, x=None): + def to_source(self, value, z, y, x): return value @@ -59,10 +59,10 @@ class Geographic(UnitConverter): source_unit = "m" target_unit = "degree" - def to_target(self, value, z=None, y=None, x=None): + def to_target(self, value, z, y, x): return value / 1000.0 / 1.852 / 60.0 - def to_source(self, value, z=None, y=None, x=None): + def to_source(self, value, z, y, x): return value * 1000.0 * 1.852 * 60.0 @@ -87,10 +87,10 @@ class GeographicSquare(UnitConverter): source_unit = "m2" target_unit = "degree2" - def to_target(self, value, z=None, y=None, x=None): + def to_target(self, value, z, y, x): return value / pow(1000.0 * 1.852 * 60.0, 2) - def to_source(self, value, z=None, y=None, x=None): + def to_source(self, value, z, y, x): return value * pow(1000.0 * 1.852 * 60.0, 2) diff --git a/src/parcels/_core/field.py b/src/parcels/_core/field.py index bbeff928b..b8c22ac0a 100644 --- a/src/parcels/_core/field.py +++ b/src/parcels/_core/field.py @@ -54,25 +54,22 @@ class Field: """The Field class that holds scalar field data. The `Field` object is a wrapper around a xarray.DataArray or uxarray.UxDataArray object. Additionally, it holds a dynamic Callable procedure that is used to interpolate the field data. - During initialization, the user is required to supply a custom interpolation method that is used - to interpolate the field data, so long as the interpolation method has the correct signature. + During initialization, the user is required to supply a custom interpolation method that is used to interpolate the field data, + so long as the interpolation method has the correct signature. Notes ----- The xarray.DataArray or uxarray.UxDataArray object contains the field data and metadata. + * dims: (time, [nz1 | nz], [face_lat | node_lat | edge_lat], [face_lon | node_lon | edge_lon]) + * attrs: (location, mesh) - * dims: (time, [nz1 | nz], [face_lat | node_lat | edge_lat], [face_lon | node_lon | edge_lon]) - * attrs: (location, mesh, mesh) - - When using a xarray.DataArray object: - + When using a xarray.DataArray object, * The xarray.DataArray object must have the "location" and "mesh" attributes set. 
- * The "location" attribute must be set to one of the following to define which pairing of points a field is associated with: - * "node" - * "face" - * "x_edge" - * "y_edge" - + * The "location" attribute must be set to one of the following to define which pairing of points a field is associated with. + * "node" + * "face" + * "x_edge" + * "y_edge" * For an A-Grid, the "location" attribute must be set to / is assumed to be "node" (node_lat,node_lon). * For a C-Grid, the "location" setting for a field has the following interpretation: * "node" ~> the field is associated with the vorticity points (node_lat, node_lon) @@ -80,8 +77,7 @@ class Field: * "x_edge" ~> the field is associated with the u-velocity points (face_lat, node_lon) * "y_edge" ~> the field is associated with the v-velocity points (node_lat, face_lon) - When using a uxarray.UxDataArray object: - + When using a uxarray.UxDataArray object, * The uxarray.UxDataArray.UxGrid object must have the "Conventions" attribute set to "UGRID-1.0" and the uxarray.UxDataArray object must comply with the UGRID conventions. See https://ugrid-conventions.github.io/ugrid-conventions/ for more information.