Adding overwrite option to ParticleFile API by erikvansebille · Pull Request #2655 · Parcels-code/Parcels

erikvansebille · 2026-06-04T11:55:43Z

Description

This PR adds an option to ParticleFile.__init__ that controls what happens when the output file already exists: either raise an error (default) or overwrite the existing file.

Checklist

Closes #2583 (Base implementation of Parquet file writing)
Tests added
This PR targets the correct branch (main for normal development, v3-support for v3 support)

AI Disclosure

This PR contains AI-generated content.
- I have tested any AI-generated content in my PR.
- I take responsibility for any AI-generated content in my PR.
- Describe how you used it (e.g., by pasting your prompt):
  I used Claude Code to help with the implementation of the append option in particlefile.write()

VeckoTheGecko · 2026-06-05T09:46:09Z

+    if_exists : {"error", "overwrite", "append"}, optional
+        Behavior when the output file already exists.
+        - "error" (default): raise a ValueError.
+        - "overwrite": remove the existing file before writing.
+        - "append": preserve existing rows and append new rows.


I think we should stick closer to convention here

Suggested change

-    if_exists : {"error", "overwrite", "append"}, optional
-        Behavior when the output file already exists.
-        - "error" (default): raise a ValueError.
-        - "overwrite": remove the existing file before writing.
-        - "append": preserve existing rows and append new rows.
+    mode : {"w", "a", None}, optional
+        Writing behaviour.
+        - None (default): Write dataset, and raise an error if it already exists.
+        - "w": Write dataset, overwriting it.
+        - "a": Append to dataset.

Also rename ._if_exists to ._mode.

Good call, addressed in aafde6a
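For readers following the thread, here is a minimal sketch of the None / "w" semantics proposed above, using only the standard library. The helper name `resolve_existing_output` is hypothetical and is not part of Parcels, and the choice of `FileExistsError` is an assumption (the suggested docstring only says "raise an error"). The "a" option is left out because it was later dropped from this PR.

```python
import tempfile
from pathlib import Path


def resolve_existing_output(path, mode=None):
    """Hypothetical helper (not the Parcels API): handle a pre-existing output file.

    mode=None : raise if the file already exists (default).
    mode="w"  : remove the existing file so writing starts from scratch.
    """
    if mode not in (None, "w"):
        raise ValueError(f"mode must be None or 'w'. Got {mode=!r}")

    path = Path(path)
    if path.exists():
        if mode is None:
            # Default: refuse to clobber existing output.
            # The exception type is an assumption made for this sketch.
            raise FileExistsError(f"{path} already exists; pass mode='w' to overwrite it")
        # mode == "w": delete the old file before any writer is opened.
        path.unlink()
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        out = Path(tmpdir) / "particles.parquet"
        out.write_bytes(b"old data")
        resolve_existing_output(out, mode="w")
        print(out.exists())  # False: the old file was removed
```

Validating `mode` before touching the filesystem means a typo such as `mode="x"` fails early, whether or not the file exists.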

VeckoTheGecko · 2026-06-08T08:01:23Z

+            self._tmp_path = self.path.with_name(f"{self.path.stem}.append_tmp{self.path.suffix}")
+            if self._tmp_path.exists():
+                self._tmp_path.unlink()
+
+            self._writer = pq.ParquetWriter(self._tmp_path, existing_schema, compression=self._compression)
+
+            # Parquet can't directly append, so we need to rewrite the existing data along with the new data.
+            for batch in existing_file.iter_batches():
+                self._writer.write_table(pa.Table.from_batches([batch], schema=existing_schema))
+        else:
+            assert not self.path.exists(), "If the file exists, the writer should already be set"
+            self._writer = pq.ParquetWriter(
+                self.path,
+                schema,
+                compression=self._compression,
+            )


Just taking a step back here - why do we need an append mode for the ParticleFile? Could users easily just create multiple particlefiles and join them into one after the fact?

The fact that the Parquet file format doesn't support this, and that calling "append" would require rewriting the data we already have, is maybe an indication that we either shouldn't have append or should consider something else.

You're right; it doesn't make sense to add an "append" option if we are copying the data internally anyway. I have now removed it in aafde6a

The only thing users will need to be aware of is that the last record of the first file is the same as the first record of the second file when concatenating, so they will probably need to filter these duplicate records out. Is that something we should add to the tutorial?
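As a rough illustration of the filtering step mentioned above, the sketch below concatenates the records of two runs and drops rows whose key repeats. It uses plain Python dicts rather than Parquet files so that it runs without extra dependencies; the column names `particle_id`, `time` and `lon` are hypothetical and may not match the actual ParticleFile schema.

```python
def concat_drop_duplicates(first, second, key=("particle_id", "time")):
    """Concatenate two lists of records, keeping the first row seen for each key."""
    seen = set()
    merged = []
    for row in [*first, *second]:
        row_key = tuple(row[name] for name in key)
        if row_key in seen:
            # Same particle at the same time was already written by the earlier run.
            continue
        seen.add(row_key)
        merged.append(row)
    return merged


# The last record of run1 is repeated as the first record of run2.
run1 = [
    {"particle_id": 0, "time": 0, "lon": 1.0},
    {"particle_id": 0, "time": 60, "lon": 1.5},
]
run2 = [
    {"particle_id": 0, "time": 60, "lon": 1.5},
    {"particle_id": 0, "time": 120, "lon": 2.0},
]

merged = concat_drop_duplicates(run1, run2)
print([row["time"] for row in merged])  # [0, 60, 120]
```

With the files loaded into pandas, the same effect could presumably be had with `pd.concat` followed by `DataFrame.drop_duplicates` on the key columns.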

erikvansebille · 2026-06-09T06:13:21Z

            raise ValueError(f"outputdt must be positive/non-zero. Got {outputdt=!r}")

        self._outputdt = outputdt
+        self._mode = mode


Now that we don't use mode outside of this function anymore, do we need to make it an object attribute? What is best practice in such cases?

Fixed in 27cc62f

erikvansebille added 2 commits June 4, 2026 09:41

Adding append and overwrite options to ParticleFile API

5ad453b

Do not write first timestep if _if_exist=append

6ec5a9b

github-project-automation Bot added this to Parcels development Jun 4, 2026

github-project-automation Bot moved this to Backlog in Parcels development Jun 4, 2026

erikvansebille requested a review from VeckoTheGecko June 4, 2026 11:58

Separate initialisation of writer to separate function

07efdb0

VeckoTheGecko reviewed Jun 8, 2026

Incorporating review feedback

aafde6a

erikvansebille commented Jun 9, 2026

erikvansebille added 2 commits June 9, 2026 08:14

Reverting changes in PR to writer setup

b701aa1

Removing non-relevant code from PR

40f497e

erikvansebille requested a review from VeckoTheGecko June 9, 2026 06:16

erikvansebille changed the title ~~Adding append and overwrite options to ParticleFile API~~ Adding overwrite option to ParticleFile API Jun 9, 2026

Remove _mode as an attribute

27cc62f

erikvansebille wants to merge 7 commits into Parcels-code:main from erikvansebille:particlefile_append_overwrite
