Real-world benchmarks on Chicago Crime dataset (21.9MB, 146,574 rows):
| Metric | TurboXL | OpenPyXL | Improvement |
|---|---|---|---|
| Time | 2.4s | 63.1s | 26.7x faster |
| Memory | 33.5MB | 66.9MB | 2.0x less |
| Throughput | 62,040 rows/sec | 2,321 rows/sec | 26.7x faster |
Dataset: Chicago Crimes 2025
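
The throughput and improvement columns follow from row counts and wall-clock time. The sketch below shows one way such figures could be derived. The helper names (`time_call`, `throughput`, `speedup`) are illustrative and not part of TurboXL, and this is not necessarily how the table above was measured.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def time_call(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def throughput(rows: int, seconds: float) -> float:
    """Rows processed per second."""
    return rows / seconds


def speedup(fast_rate: float, slow_rate: float) -> float:
    """How many times higher fast_rate is than slow_rate."""
    return fast_rate / slow_rate


# Ratio of the two throughput figures reported in the table.
ratio = round(speedup(62_040, 2_321), 1)
```

To time a real conversion, wrap the call in a lambda, for example `time_call(lambda: turboxl.read_sheet_to_csv("data.xlsx"))`, and pass the row count and elapsed seconds to `throughput`.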
Recent optimizations implemented:
- zlib-ng integration - Up to 2.5x faster ZIP decompression
- Release build optimizations - `-O3 -march=native -flto` for GCC/Clang, `/O2 /GL /arch:AVX2` for MSVC
- Arena-based shared strings - Memory-efficient string storage
- Chunked ZIP reading - 512 KiB buffer optimization
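
An XLSX file is a ZIP container, so chunked reading means pulling decompressed worksheet XML through a fixed-size buffer instead of loading a whole member at once. The following is a minimal Python sketch of that idea using the standard `zipfile` module. It is not the C++ core's implementation, and `iter_member_chunks` and `build_demo_archive` are hypothetical helper names.

```python
import io
import zipfile
from typing import Iterator

CHUNK_SIZE = 512 * 1024  # 512 KiB, the buffer size mentioned in the list above


def iter_member_chunks(
    zf: zipfile.ZipFile, member: str, chunk_size: int = CHUNK_SIZE
) -> Iterator[bytes]:
    """Yield the decompressed bytes of one ZIP member in bounded chunks."""
    with zf.open(member) as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # empty read signals end of member
                break
            yield chunk


def build_demo_archive(member: str, payload: bytes) -> bytes:
    """Create an in-memory ZIP so the example needs no file on disk."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(member, payload)
    return buf.getvalue()


member = "xl/worksheets/sheet1.xml"
payload = b"<row/>" * 300_000  # about 1.8 MB of stand-in worksheet XML
archive = build_demo_archive(member, payload)

with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    sizes = [len(chunk) for chunk in iter_member_chunks(zf, member)]
```

Peak memory for the read stays bounded by the chunk size rather than by the size of the worksheet, which is the point of the technique.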

Supported:
- Read XLSX files and convert to CSV
- Handle shared strings, numbers, dates, booleans
- Process multiple worksheets
- Memory-efficient streaming (33.5MB for 146k rows)
- Cross-platform (Linux, macOS, Windows)

Not supported:
- Write or modify XLSX files
- Formula evaluation (uses cached values)
- Charts, images, pivot tables
- Password-protected files
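
Dates in XLSX are stored as serial day numbers, which is why date handling and the `date_mode` option matter for CSV output. The sketch below shows the standard conversion from a 1900-date-system serial to an ISO 8601 string. It is an illustration only: the exact string TurboXL emits for `date_mode="iso"` is an assumption here, the function name `excel_serial_to_iso` is hypothetical, and workbooks using the 1904 date system are not covered.

```python
from datetime import datetime, timedelta

# Day zero for the 1900 date system. Using 1899-12-30 absorbs Excel's
# fictitious 1900-02-29, so results are correct for serials >= 61
# (1900-03-01 onward).
EXCEL_EPOCH_1900 = datetime(1899, 12, 30)


def excel_serial_to_iso(serial: float) -> str:
    """Convert an Excel serial number to an ISO 8601 date or datetime string."""
    moment = EXCEL_EPOCH_1900 + timedelta(days=serial)
    if float(serial).is_integer():
        return moment.date().isoformat()  # whole days: date only
    return moment.isoformat(timespec="seconds")  # fractional days: include time


print(excel_serial_to_iso(45292))    # 2024-01-01
print(excel_serial_to_iso(45292.5))  # 2024-01-01T12:00:00
```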

Python usage:

    import turboxl

    # Convert first sheet
    csv_data = turboxl.read_sheet_to_csv("data.xlsx")

    # Convert specific sheet
    csv_data = turboxl.read_sheet_to_csv("data.xlsx", sheet="Sheet2")

    # Custom options
    csv_data = turboxl.read_sheet_to_csv(
        "data.xlsx",
        sheet=0,
        delimiter=";",
        date_mode="iso"
    )

    # Save to file (newline="" keeps the CSV's own line endings unchanged)
    with open("output.csv", "w", encoding="utf-8", newline="") as f:
        f.write(csv_data)

C++ usage:

    #include <xlsxcsv.hpp>
    #include <exception>
    #include <iostream>
    #include <string>

    int main() {
        try {
            // Convert the first sheet with default options
            std::string csv = xlsxcsv::readSheetToCsv("data.xlsx");
            std::cout << csv << std::endl;
        } catch (const std::exception& e) {
            std::cerr << "Error: " << e.what() << std::endl;
            return 1;  // report failure to the caller
        }
        return 0;
    }

Install system dependencies (used via pkg-config/CMake):

    # macOS (Recommended for best performance)
    brew install libxml2 minizip-ng zlib-ng cmake pybind11 pkg-config

    # Ubuntu/Debian (Recommended for best performance)
    sudo apt-get install -y libxml2-dev libminizip-dev cmake build-essential pkg-config
    # For zlib-ng on Ubuntu/Debian, build from source:
    # git clone https://github.com/zlib-ng/zlib-ng.git
    # cd zlib-ng && cmake -B build && cmake --build build -j && sudo cmake --install build

    # Windows (vcpkg)
    vcpkg install libxml2 minizip-ng zlib-ng

Performance Note: Installing zlib-ng provides significant performance improvements (up to 2.5x faster decompression). The build system automatically detects and uses zlib-ng if available, falling back to standard zlib otherwise.

Build the C++ core without Python bindings (no Python/pybind11 required):

    # From repo root
    cmake -S . -B build \
      -DCMAKE_BUILD_TYPE=Release \
      -DBUILD_TESTS=OFF \
      -DBUILD_PYTHON=OFF \
      -DBUILD_CLI=OFF
    cmake --build build -j4

Artifacts:
- Static library: `build/libturboxl_core.a`

Build Modes:
- Release (Recommended): Enables `-O3 -march=native -flto` optimizations
- Debug: Enables debugging symbols and assertions

CMake options:
- `BUILD_TESTS=ON/OFF` - Build test suite (default: ON)
- `BUILD_PYTHON=ON/OFF` - Build Python bindings (default: ON)
- `BUILD_CLI=ON/OFF` - Build command-line tool (default: OFF)

TurboXL ships a PEP 517/518 build powered by scikit-build-core. The wheel builds the C++ core and Python extension in Release mode using CMake.

    python3 -m pip install -U pip build scikit-build-core pybind11

System dependencies listed above (libxml2, minizip-ng, zlib-ng, cmake, compiler) must be installed and discoverable by CMake/pkg-config.

    # From repo root
    python3 -m build -w

Outputs go to `dist/`, for example:

    dist/turboxl-0.1.0-<python>-<abi>-<platform>.whl

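
The `<python>-<abi>-<platform>` placeholders follow the standard wheel file name convention (`{distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl`). When a wheel does not install on a given machine, checking those tags is a quick first step. The sketch below splits a wheel file name into its parts. The helper `parse_wheel_filename` and the example file name are hypothetical, chosen only to illustrate the layout.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its name, version, and compatibility tags."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name layout: {filename!r}")
    return WheelTags(*parts)


# Hypothetical file name; the real tags depend on the interpreter and platform.
tags = parse_wheel_filename("turboxl-0.1.0-cp311-cp311-macosx_11_0_arm64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```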
Install the built wheel locally:

    pip install python/dist/turboxl-*.whl

Tips:
- Parallel CMake build: `CMAKE_BUILD_PARALLEL_LEVEL=4 python3 -m build -w`
- macOS arch (defaults to arm64 via `pyproject.toml`): to override, pass `--config-setting=cmake.define.CMAKE_OSX_ARCHITECTURES="arm64;x86_64"` to `python -m build`.

Requirements:
- C++: C++20 compiler (GCC 10+, Clang 12+, MSVC 2019+)
- Build: CMake 3.20+
- Python: 3.8-3.12 (for Python bindings)

Python API:

    turboxl.read_sheet_to_csv(
        xlsx_path: str,
        sheet: Union[str, int] = None,  # First sheet if None
        delimiter: str = ",",
        newline: Literal["LF", "CRLF"] = "LF",
        include_bom: bool = False,
        date_mode: Literal["iso", "rawNumber"] = "iso"
    ) -> str

C++ API:

    struct CsvOptions {
        std::string sheetByName;
        int sheetByIndex = -1;
        char delimiter = ',';
        bool includeBom = false;
        // ... more options
    };

    std::string readSheetToCsv(
        const std::string& xlsxPath,
        const CsvOptions& opts = {}
    );

MIT License - see LICENSE file for details.