Commit 3c052ef

timsaucer and claude committed
test: drop xfail on timestamp[s] parquet roundtrip
pyarrow.parquet promotes timestamp[s] to timestamp[ms] on write (apache/arrow#41382), so the read array never matched the input. Cast the expected array to timestamp[ms] in test_simple_select to assert that DataFusion reads what Arrow actually stored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 5387f30 commit 3c052ef

1 file changed

Lines changed: 10 additions & 4 deletions

python/tests/test_sql.py

@@ -450,13 +450,9 @@ def test_udf(
             pa.array([b"1111", b"2222", b"3333"], pa.binary(4), _null_mask),
             id="binary4",
         ),
-        # `timestamp[s]` does not roundtrip for pyarrow.parquet: https://github.com/apache/arrow/issues/41382
         pytest.param(
             helpers.data_datetime("s"),
             id="datetime_s",
-            marks=pytest.mark.xfail(
-                reason="pyarrow.parquet does not support timestamp[s] roundtrips"
-            ),
         ),
         pytest.param(
             helpers.data_datetime("ms"),
@@ -484,6 +480,16 @@ def test_simple_select(ctx, tmp_path, arr):
     batches = ctx.sql("SELECT a AS tt FROM t").collect()
     result = batches[0].column(0)
 
+    # pyarrow.parquet promotes timestamp[s] to timestamp[ms] on write
+    # (https://github.com/apache/arrow/issues/41382). Compensate so the
+    # comparison checks DataFusion reads what Arrow actually stored.
+    if (
+        isinstance(arr, pa.Array)
+        and pa.types.is_timestamp(arr.type)
+        and arr.type.unit == "s"
+    ):
+        arr = arr.cast(pa.timestamp("ms"))
+
     # In DF 43.0.0 we now default to having BinaryView and StringView
     # so the array that is saved to the parquet is slightly different
     # than the array read. Convert to values for comparison.
