Closed as not planned
Labels
pending: The issue will be closed if no feedback is provided
stdlib: Standard Library Python modules in the Lib/ directory
Description
Bug report
Bug description:
The csv.Sniffer._guess_delimiter method currently uses data.split('\n') to split lines.
This prevents it from correctly processing CSV data that uses Classic Mac (\r) line endings.
The Issue
Because split('\n') does not treat \r as a line separator, valid multi-line data that uses CR is interpreted as a single line.
This causes Sniffer to fail (raising csv.Error with the message "Could not determine delimiter") or, when line endings are mixed (e.g., concatenated streams), to silently detect the wrong delimiter.
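To make the line-merging effect concrete, here is a minimal sketch that applies split('\n') directly to the two samples used in this report and counts candidate delimiter characters per resulting line. It only illustrates the splitting step and does not call Sniffer. The variable names are illustrative and not taken from the csv module.

```python
# Samples from this report.
sample_cr = "Name,Age\rAlice,30"
sample_mixed = "User,ID\rAlice,001\nBob,002"

# Split only on '\n', as the report says _guess_delimiter does.
lines_cr = sample_cr.split('\n')
lines_mixed = sample_mixed.split('\n')

print(len(lines_cr))   # 1: the CR never separates the two rows
print(lines_mixed)     # ['User,ID\rAlice,001', 'Bob,002']

# Per-line character counts for the mixed sample.
comma_counts = [line.count(',') for line in lines_mixed]
zero_counts = [line.count('0') for line in lines_mixed]

print(comma_counts)    # [2, 1]: ',' no longer has a consistent frequency
print(zero_counts)     # [2, 2]: '0' happens to look perfectly consistent
```

With the first two rows merged, ',' appears a different number of times on each line while '0' appears exactly twice on both, which matches the '0' detection described in Scenario 2 below.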
Reproduction
import csv
# Scenario 1: Pure CR data (Classic Mac / Legacy Systems)
# Sniffer sees 1 line instead of 2. Fails to determine delimiter.
sample_cr = "Name,Age\rAlice,30"
try:
    csv.Sniffer().sniff(sample_cr)
except csv.Error as e:
    print(f"CR Failure: {e}")
# Scenario 2: Mixed endings (e.g. concatenated strings)
# Sniffer merges lines incorrectly, leading to SILENT DATA CORRUPTION.
# It detects '0' instead of ',' because ',' frequency becomes inconsistent.
sample_mixed = "User,ID\rAlice,001\nBob,002"
dialect = csv.Sniffer().sniff(sample_mixed)
print(f"Mixed Failure (Detected): {dialect.delimiter!r}")
Proposed Fix
Replace data.split('\n') with data.splitlines(). This aligns Sniffer's behavior with csv.reader, which correctly handles universal newlines (\r, \n, \r\n).
I have implemented the fix and added regression tests locally. Submitting a PR shortly.
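As a rough illustration of what the proposed change would alter, the sketch below compares split('\n') and splitlines() on plain strings. It is not the patched Sniffer code, and the sample names are made up for this example. One caveat worth weighing in review: str.splitlines() also breaks on several other boundaries (for example form feed, \x0c), so it is broader than just \r, \n, and \r\n.

```python
# Illustrative samples; the "cr" and "mixed" entries come from this report.
samples = {
    "cr": "Name,Age\rAlice,30",
    "lf": "Name,Age\nAlice,30",
    "crlf": "Name,Age\r\nAlice,30",
    "mixed": "User,ID\rAlice,001\nBob,002",
}

# For each sample, keep (result of split('\n'), result of splitlines()).
split_results = {
    name: (data.split('\n'), data.splitlines())
    for name, data in samples.items()
}

for name, (old, new) in split_results.items():
    print(f"{name:6} split: {old!r}  splitlines: {new!r}")

# splitlines() also treats form feed as a line boundary.
form_feed_lines = "a,b\x0cc,d".splitlines()
print(form_feed_lines)  # ['a,b', 'c,d']
```

Note that split('\n') on CRLF data leaves a trailing \r on each line, whereas splitlines() removes the whole terminator.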
CPython versions tested on:
CPython main branch, 3.11
Operating systems tested on:
Windows