[FAQ Bot] NEW: Why does Spark write multiple parquet files after repartitioning a DataF by github-actions[bot] · Pull Request #238 · DataTalksClub/faq

github-actions · 2026-03-07T00:00:03Z

✨ FAQ NEW

Course: data-engineering-zoomcamp
Section: module-6 (Directly explains why repartitioning leads to multiple output files when writing parquet, fitting Spark-related questions in module-6.)
Related Issue: #237

Question

Why does Spark write multiple parquet files after repartitioning a DataFrame?

Decision Rationale

The proposal explains a Spark behavior (one output file per partition when writing) not explicitly covered by existing FAQs in module-6. It adds a clear explanation and example addressing why multiple parquet files appear after repartitioning.
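The behavior described above can be reproduced with a small local experiment. The following is a minimal sketch, not the FAQ entry's own example: the helper `count_part_files`, the function `demo`, the toy `spark.range(1000)` data, and the output directory names are all hypothetical and chosen only for illustration. It assumes a local PySpark installation and relies on Spark's usual `part-*` naming of output files.

```python
import glob
import os
import tempfile


def count_part_files(output_dir: str) -> int:
    """Count the part-*.parquet data files in a Spark output directory.

    Marker files such as _SUCCESS and hidden .crc checksum files are ignored.
    """
    return len(glob.glob(os.path.join(output_dir, "part-*.parquet")))


def demo() -> None:
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[2]")
        .appName("partition-demo")  # hypothetical app name
        .getOrCreate()
    )
    df = spark.range(1000)  # toy data, not the course dataset

    with tempfile.TemporaryDirectory() as tmp:
        # Each partition is written by its own task, so 4 partitions
        # normally produce 4 part files.
        four = os.path.join(tmp, "four")
        df.repartition(4).write.mode("overwrite").parquet(four)
        print("repartition(4):", count_part_files(four))

        # Reducing to a single partition before writing normally yields
        # a single part file.
        one = os.path.join(tmp, "one")
        df.coalesce(1).write.mode("overwrite").parquet(one)
        print("coalesce(1):", count_part_files(one))

    spark.stop()
```

Calling `demo()` on a machine with PySpark should print a part-file count that matches the number of non-empty partitions in each case. Collapsing to one partition is convenient for small outputs, but it also funnels the whole write through a single task, so it is usually a poor fit for large datasets.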

Placement Details

Section ID: module-6
Sort Order: 60
Filename Slug: spark-write-multiple-parquet-per-partition

🤖 Generated by FAQ Bot

Closes #237

NEW: Why does Spark write multiple parquet files after repartitioning…

9e01aff

github-actions bot mentioned this pull request Mar 7, 2026

[FAQ] Why does Spark write multiple parquet files after repartitioning a dataset? #237

Open

3 tasks

github-actions[bot] wants to merge 1 commit into main from faq-bot/issue-237
