What's Changed
- Move _normalize_hive_syntax to DefaultTypeConverter and fix pass stubs by @laughingman7743 in #695
- Always use C engine for CSV parsing and simplify engine selection logic by @laughingman7743 in #697
Bug Fixes
- Fix CSV parsing performance regression: PR #594 incorrectly forced the Python CSV engine for files over 50MB, based on fabricated claims about int32 limitations in the pandas C parser. The C engine is now always the default (matching pandas' own default), which restores up to 28% faster CSV parsing. See #696 for details.
Internal
- Refactored type converter: moved _normalize_hive_syntax to DefaultTypeConverter and cleaned up pass stubs.
- Simplified CSV engine selection from 4 methods to 2 by removing dead code and inlining pyarrow compatibility checks.
Affected Versions
All versions from v3.17.0 through v3.30.0 (released 2025-08-09 to 2026-02-28) are affected by this performance regression. The last unaffected version is v3.16.0.
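To check whether a pinned version falls inside the affected range, a small helper like the following may be useful. This is only a sketch: the names `parse_version` and `is_affected` are hypothetical and not part of the library, and it assumes plain `X.Y.Z` version strings (with an optional leading `v`), so pre-release or local suffixes are not handled.

```python
# Bounds of the affected range, taken from the release notes above.
FIRST_AFFECTED = (3, 17, 0)
LAST_AFFECTED = (3, 30, 0)


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'v3.30.1' or '3.30.1' into a tuple of ints (hypothetical helper)."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def is_affected(version: str) -> bool:
    """Return True if the version lies in the inclusive affected range."""
    return FIRST_AFFECTED <= parse_version(version) <= LAST_AFFECTED
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where "3.9.0" would sort after "3.17.0".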
Workaround for older versions
If you cannot upgrade to v3.30.1, explicitly specify the C engine to bypass the incorrect engine selection:
    cursor = connection.cursor(PandasCursor, engine="c")

Full Changelog: v3.30.0...v3.30.1