Bug report
Bug description:
I know that there were discussions about rewriting CSV Sniffer, and they seem to be on hold, to avoid breaking any code relying on its behavior.
I believe it's possible, though, to still make it significantly faster.
The current code iterates over 127 ASCII characters and counts their occurrences on each line, even if they're not present:
cpython/Lib/csv.py
Line 382 in b36d23f
This is highly inefficient. We can instead count only the characters that are actually present on each line, and backfill the zero counts afterwards.
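To illustrate the idea, here is a minimal sketch. It is not the code from `csv.py` or from the linked PR. The function names `frequency_tables_naive` and `frequency_tables_sparse` are hypothetical, and the naive version only approximates the current approach as described above: calling `line.count()` for every ASCII character on every line.

```python
from collections import Counter, defaultdict

ASCII = [chr(c) for c in range(127)]  # 7-bit ASCII, as described in the report


def frequency_tables_naive(lines):
    """Map each ASCII char to {per-line count: number of lines with that count}.

    Hypothetical stand-in for the approach described above: every one of the
    127 characters is counted on every line, even when it is absent.
    """
    tables = {}
    for line in lines:
        for char in ASCII:
            meta = tables.setdefault(char, {})
            freq = line.count(char)
            meta[freq] = meta.get(freq, 0) + 1
    return tables


def frequency_tables_sparse(lines):
    """Same tables, but only characters that actually occur are counted.

    Characters that never occur in any line are omitted entirely; a caller
    can treat a missing key as {0: number_of_lines}.
    """
    tables = defaultdict(dict)
    num_lines = 0
    for line in lines:
        num_lines += 1
        # One pass over the line counts every character that is present.
        for char, freq in Counter(line).items():
            if ord(char) < 127:  # stay within the same 7-bit range
                meta = tables[char]
                meta[freq] = meta.get(freq, 0) + 1
    # Backfill zeros: lines where a character did not occur at all.
    for meta in tables.values():
        missing = num_lines - sum(meta.values())
        if missing:
            meta[0] = missing
    return dict(tables)
```

For characters that occur at least once, both functions produce identical tables. The sparse version simply skips the work for characters that never appear in the sample.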
CPython versions tested on:
3.15
Operating systems tested on:
macOS
Linked PRs
csv.Sniffer.sniff() delimiter detection 1.6x faster #137628