Adding both an xarray.dataset and a dict-of-numpy-arrays for the particle-data #2097
I think a difficulty with this is that once a new array is created (e.g., from the deletion of particles), there will be drift between these two data structures. I don't think this is currently working as intended. Here's a test:

import numpy as np
from parcels import ParticleSet

def test_pset_remove_indices(fieldset):  # fieldset is a pytest fixture
    npart = 10
    lon_start = np.linspace(0, 1, npart)
    lat_start = np.linspace(1, 0, npart)
    pset = ParticleSet(fieldset, lon=lon_start, lat=lat_start)
    # Both containers should start with the same number of particles
    assert len(pset._ds.lon) == len(pset._data["lon"]) == npart
    pset.remove_indices([0])
    # After a removal, both containers should shrink together
    assert len(pset._ds.lon) == len(pset._data["lon"]) == npart - 1
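The same drift can be reproduced without Parcels. The sketch below is only an illustration: `ds` and `data` are hypothetical stand-ins for `ParticleSet._ds` and `ParticleSet._data`, and the dimension name `trajectory` is an assumption, not necessarily what Parcels uses.

```python
import numpy as np
import xarray as xr

npart = 10

# Hypothetical stand-in for ParticleSet._ds (dimension name is an assumption)
ds = xr.Dataset(
    {
        "lon": ("trajectory", np.linspace(0, 1, npart)),
        "lat": ("trajectory", np.linspace(1, 0, npart)),
    }
)

# Hypothetical stand-in for ParticleSet._data: the arrays taken from the dataset
data = {name: ds[name].values for name in ds.data_vars}

# For numpy-backed variables the dict entries normally alias the dataset's memory
shares_before = np.shares_memory(data["lon"], ds["lon"].values)

# Dropping a particle returns a NEW Dataset; the dict still holds the old arrays
ds = ds.drop_isel(trajectory=[0])

len_ds = ds.sizes["trajectory"]
len_dict = len(data["lon"])
print(shares_before, len_ds, len_dict)
```

After the drop, the dataset has one particle fewer while the dict still has the original length, which is the mismatch the test above is checking for.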
Ah, good catch! I hadn't realised that deleting a particle would create a new dataset. And so would adding two ParticleSets. We could change the code to always update the dict-of-numpy-arrays whenever we change the dataset; or is that too prone to errors?
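A minimal sketch of what "always update the dict" could look like, assuming every replacement of the dataset goes through a single setter. `ParticleStore`, its attributes, and the `trajectory` dimension are hypothetical names for illustration and are not part of Parcels.

```python
import numpy as np
import xarray as xr


class ParticleStore:
    """Hypothetical container that keeps a dict of arrays in step with a Dataset."""

    def __init__(self, ds):
        self.ds = ds  # routed through the setter below

    @property
    def ds(self):
        return self._ds

    @ds.setter
    def ds(self, new_ds):
        # The only place where the dict is rebuilt, so both are replaced together
        self._ds = new_ds
        self._data = {name: new_ds[name].values for name in new_ds.data_vars}

    def remove_indices(self, indices):
        # drop_isel returns a new Dataset; assigning it refreshes the dict too
        self.ds = self._ds.drop_isel(trajectory=indices)


store = ParticleStore(xr.Dataset({"lon": ("trajectory", np.linspace(0, 1, 10))}))
store.remove_indices([0])
n_ds = store.ds.sizes["trajectory"]
n_dict = len(store._data["lon"])
print(n_ds, n_dict)
```

This only protects code that reaches the arrays through the container. Anything that kept its own reference to an old array would still go stale, which is the kind of error-proneness the question above is about.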
Adding @VeckoTheGecko's failing test showing that the dict-of-numpys does not track the xarray dataset anymore after a deletion
I just added the test above to the unit test suite, so that CI fails and we don't accidentally merge this PR
Closing this PR, as the combination of a dict-of-numpy-arrays and an xarray Dataset does not seem robust enough (see #2097 (comment))
This PR builds on #2094, following @VeckoTheGecko's suggestion at Parcels-code/parcels-benchmarks#1 (comment) to keep the ParticleData in both an xarray.Dataset structure (ParticleSet._ds) and a dict-of-numpy-arrays (ParticleSet._data).
This new branch has the same performance as #2094 (see Parcels-code/parcels-benchmarks#1 (comment)), but the advantage is that users can also access the data as an xarray.Dataset.
This will be useful e.g. in the __repr__ (to be implemented) and the new implementation of ParticleFile particledata_as_dict)