PandRS 0.3.1 Released — Pure Rust Excel I/O Backed by OxiARC, Zero C Compression Libs

The DataFrame layer of the COOLJAPAN scientific stack just got a lot more Pure Rust.

Today we released PandRS 0.3.1 — a patch release that rips the hidden C-dependency chain out of DataFrame Excel and compression I/O and replaces it with an in-tree, OxiARC-backed xlsx engine.

No C. No Cython. No bundled zlib/xz/zstd C libs. No Python GIL.
Just a pandas-class DataFrame API — SIMD, parallel, and distributed — that compiles to a single static binary (or WASM) and runs everywhere, from laptops to edge devices to cloud clusters.

Why PandRS 0.3.1 matters

DataFrame Excel and compression stacks have a quiet C-dependency creep problem. Read an .xlsx file and you usually drag in zip, which drags in flate2, which drags in miniz_oxide or, worse, a system zlib. Read a Parquet file and you can quietly pull in zstd-sys, lz4-sys, and liblzma-sys (C libxz) through DataFusion and Arrow defaults. None of that shows up in your code. It shows up in your build, your supply chain, and your cross-compilation pain.

PandRS 0.3.1 closes those holes:

Pure Rust Excel. The excel feature no longer pulls zip, flate2, or miniz_oxide. xlsx reading and writing now run on an in-tree engine built on oxiarc-archive (Pure Rust ZIP) + quick-xml.
The dirs-sys crate is gone. Config-directory resolution is now an inline std::env-based implementation, so both dirs and dirs-sys are removed.
liblzma-sys, zstd-sys, and lz4-sys are kept out of the default build and most feature builds by setting default-features = false on datafusion, parquet, and arrow.

The public xlsx API is fully preserved — ExcelCell, ExcelCellFormat, NamedRange, and friends are all kept. This is a swap of the engine, not the interface.

Technical Deep Dive: A Pure Rust xlsx engine behind a preserved facade

1. The in-tree src/io/xlsx/ engine (OxiARC + quick-xml).
.xlsx is a ZIP container of XML parts. The old path used calamine (read) and simple_excel_writer (write), both of which lean on the zip/flate2/miniz_oxide C-flavored compression chain. We replaced both with a from-scratch reader/writer split across a new module — reader.rs, writer.rs, cell.rs, schema.rs, error.rs, mod.rs — built on oxiarc-archive for the Pure Rust ZIP layer and quick-xml for the XML. src/io/excel.rs is now a thin facade that forwards to crate::io::xlsx, so the public surface is unchanged. Advanced features behave as before; formula and named-range tracking is deferred to a follow-up. A new tests/excel_roundtrip_test.rs guards the write-then-read cycle. The net dependency change: oxiarc-archive 0.2.6 + quick-xml 0.39.2 added under excel; calamine and simple_excel_writer removed.
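To make "a ZIP container of XML parts" concrete, here is a minimal sketch of the kind of worksheet part (for example xl/worksheets/sheet1.xml) that an xlsx writer has to emit before the ZIP layer packs it. This is not the PandRS writer: the Cell enum and the column_letters, escape_xml, and sheet_xml helpers are hypothetical names for illustration, it uses only the standard library, and it leaves out the ZIP packaging and the other required parts such as [Content_Types].xml and xl/workbook.xml.

```rust
/// Cell kinds for this sketch only (hypothetical, not PandRS's ExcelCell).
enum Cell {
    Text(String),
    Number(f64),
}

/// Convert a 0-based column index to spreadsheet letters (0 -> A, 26 -> AA).
fn column_letters(mut index: usize) -> String {
    let mut letters = Vec::new();
    loop {
        letters.push(b'A' + (index % 26) as u8);
        if index < 26 {
            break;
        }
        index = index / 26 - 1;
    }
    letters.reverse();
    String::from_utf8(letters).expect("only ASCII letters were pushed")
}

/// Escape the characters that are not allowed verbatim in XML text nodes.
fn escape_xml(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(ch),
        }
    }
    out
}

/// Render rows as a SpreadsheetML worksheet part using inline strings.
fn sheet_xml(rows: &[Vec<Cell>]) -> String {
    let mut xml = String::from(
        "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\"><sheetData>",
    );
    for (r, row) in rows.iter().enumerate() {
        xml.push_str(&format!("<row r=\"{}\">", r + 1));
        for (c, cell) in row.iter().enumerate() {
            // Cell references are column letters followed by a 1-based row number.
            let reference = format!("{}{}", column_letters(c), r + 1);
            match cell {
                Cell::Text(s) => xml.push_str(&format!(
                    "<c r=\"{}\" t=\"inlineStr\"><is><t>{}</t></is></c>",
                    reference,
                    escape_xml(s)
                )),
                Cell::Number(n) => {
                    xml.push_str(&format!("<c r=\"{}\"><v>{}</v></c>", reference, n))
                }
            }
        }
        xml.push_str("</row>");
    }
    xml.push_str("</sheetData></worksheet>");
    xml
}
```

Calling sheet_xml with one row holding a text cell and a number cell yields a single row element with cells A1 and B1. A real engine would also handle shared strings, styles, and dates, which this sketch deliberately skips.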

2. The -sys purge.
The dirs crate pulled dirs-sys for config-directory lookups. We replaced it with an inline user_config_dir() in src/config/loader.rs honoring XDG / macOS / Windows conventions, returning Option<PathBuf> with identical semantics. After this, cargo build --no-default-features has zero -sys crates outside the unavoidable OS-API set — the only survivor is core-foundation-sys, pulled by iana-time-zone/chrono for macOS timezone resolution, which is genuine OS FFI.
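As an illustration of what an inline, std::env-based lookup can look like, here is a minimal sketch. It is not the code in src/config/loader.rs: the Platform enum and the config_dir_for helper are hypothetical, and the per-platform rules (XDG_CONFIG_HOME with a $HOME/.config fallback, $HOME/Library/Application Support on macOS, %APPDATA% on Windows) are the usual conventions, assumed here rather than taken from the PandRS source.

```rust
use std::env;
use std::path::PathBuf;

/// Platform family, passed explicitly so the lookup can be tested on any host.
/// (Hypothetical helper type for this sketch.)
#[derive(Clone, Copy, Debug, PartialEq)]
enum Platform {
    Linux,
    MacOs,
    Windows,
}

/// Resolve the per-user config directory from environment variables only.
/// `get` abstracts the environment so tests can inject values.
fn config_dir_for<F>(platform: Platform, get: F) -> Option<PathBuf>
where
    F: Fn(&str) -> Option<String>,
{
    // Treat unset and empty variables the same way.
    let non_empty = |key: &str| get(key).filter(|v| !v.is_empty()).map(PathBuf::from);
    match platform {
        Platform::Windows => non_empty("APPDATA"),
        Platform::MacOs => {
            non_empty("HOME").map(|home| home.join("Library").join("Application Support"))
        }
        // XDG: use XDG_CONFIG_HOME only if it is an absolute path,
        // otherwise fall back to $HOME/.config.
        Platform::Linux => non_empty("XDG_CONFIG_HOME")
            .filter(|path| path.is_absolute())
            .or_else(|| non_empty("HOME").map(|home| home.join(".config"))),
    }
}

/// Entry point with the same shape as the function described above:
/// no -sys crates, just std::env and compile-time target checks.
fn user_config_dir() -> Option<PathBuf> {
    let platform = if cfg!(target_os = "windows") {
        Platform::Windows
    } else if cfg!(target_os = "macos") {
        Platform::MacOs
    } else {
        Platform::Linux
    };
    config_dir_for(platform, |key| env::var(key).ok())
}
```

Splitting the environment lookup from the platform rules keeps the function free of FFI and makes each convention checkable without touching the real process environment.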

3. Pinning datafusion / parquet / arrow to drop C libs.

datafusion 53.1.0 with default-features = false drops the compression feature, eliminating liblzma-sys (C libxz) from distributed/flight/serving/all-features, plus the bzip2 / async-compression chain.
parquet 58.1.0 with default-features = false and [arrow, snap, brotli, flate2-zlib-rs, lz4, base64, simdutf8] — flate2-zlib-rs selects the Pure Rust zlib-rs, not miniz_oxide — eliminates zstd-sys from --features stable.
arrow 58.1.0 with default-features = false and [csv, ipc, json] guards against default drift re-introducing ipc_compression → zstd-sys/lz4-sys.
User-visible DataFusion, Parquet, and Arrow APIs are unchanged.

4. Honest tech debt and intentional regressions.
We are not pretending this is free. Two regressions are deliberate under the Pure Rust policy:
zstd-compressed Parquet is no longer readable on --features stable/parquet (Snappy, which is the pandas default, plus gzip, brotli, and lz4 still work).
DataFusion’s built-in xz/bz2/zstd auto-decompression for CSV/JSON readers is disabled on distributed/flight/serving (plain and gzip still work).
Some feature-gated debt also remains upstream:
parquet/distributed/flight still transitively pull flate2/lz4_flex/snap/brotli/miniz_oxide via Arrow/Parquet/DataFusion.
--features distributed/flight still pull zstd-sys/miniz_oxide because DataFusion 53.1.0’s own Cargo.toml hardcodes default-features = true on parquet. Cargo features are additive, so a downstream crate can’t switch off what an upstream crate enables; this needs an upstream fix.
cloud-storage pulls ring (C+asm) via object_store 0.13.2.
The default build pulls none of these.
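The "Cargo features are additive" point is easy to miss, so here is a toy model of it. This is not Cargo's resolver: the Request struct and the unify function are hypothetical, and the default feature list used in the usage note is illustrative rather than copied from the parquet crate. The sketch only shows the rule that matters: the enabled set is the union of what every dependent asks for, so a single dependent with default features on brings the defaults back for everyone.

```rust
use std::collections::BTreeSet;

/// What one dependent asks of a shared crate (hypothetical model type).
#[derive(Clone, Copy)]
struct Request<'a> {
    default_features: bool,
    features: &'a [&'a str],
}

/// Union all requests into the final feature set, the way additive
/// feature unification behaves: nothing a dependent enables can be
/// disabled by another dependent.
fn unify<'a>(crate_defaults: &[&'a str], requests: &[Request<'a>]) -> BTreeSet<&'a str> {
    let mut enabled = BTreeSet::new();
    for request in requests {
        if request.default_features {
            // One dependent leaving defaults on is enough to enable them all.
            enabled.extend(crate_defaults.iter().copied());
        }
        enabled.extend(request.features.iter().copied());
    }
    enabled
}
```

With illustrative defaults of ["snap", "zstd"], a lone request with default_features: false and features ["snap", "lz4"] resolves to {lz4, snap}. Add a second request with default_features: true, standing in for DataFusion's dependency on parquet, and zstd is back in the set no matter what the first request says.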

Getting Started

cargo add pandrs --features excel

use pandrs::{DataFrame, Series};

fn main() -> pandrs::error::Result<()> {
    let mut df = DataFrame::new();
    df.add_column(
        "quarter".to_string(),
        Series::from_vec(vec!["Q1", "Q2", "Q3", "Q4"], Some("quarter")),
    )?;
    df.add_column(
        "revenue".to_string(),
        Series::from_vec(vec![120.5, 138.2, 151.0, 169.8], Some("revenue")),
    )?;

    // xlsx I/O is now Pure Rust — backed by OxiARC, no zip/flate2/miniz_oxide
    df.to_excel("report.xlsx", None)?;
    let reloaded = DataFrame::from_excel("report.xlsx", None)?;
    println!("round-tripped {} rows", reloaded.shape().0);
    Ok(())
}

What’s New in 0.3.1

Pure Rust Excel/xlsx (in-tree, OxiARC-backed): replaced simple_excel_writer + calamine with an in-tree xlsx reader/writer on oxiarc-archive + quick-xml; the excel feature no longer pulls zip, flate2, or miniz_oxide. Public xlsx API fully preserved. New src/io/xlsx/ module; src/io/excel.rs is now a thin facade. New round-trip test tests/excel_roundtrip_test.rs. Added oxiarc-archive 0.2.6 + quick-xml 0.39.2 under excel; removed calamine + simple_excel_writer.
-sys crate cleanup: replaced dirs with an inline std::env-based user_config_dir() in src/config/loader.rs (identical semantics) — removes dirs + dirs-sys. cargo build --no-default-features now has zero -sys crates outside the unavoidable OS-FFI set (core-foundation-sys for macOS timezone).
Pinned datafusion 53.1.0 default-features = false — drops compression, eliminating liblzma-sys (C libxz) and the bzip2/async-compression chain from distributed/flight/serving/all-features.
Pinned parquet 58.1.0 default-features = false with [arrow, snap, brotli, flate2-zlib-rs, lz4, base64, simdutf8] (Pure Rust zlib-rs) — eliminates zstd-sys from --features stable.
Pinned arrow 58.1.0 default-features = false with [csv, ipc, json] to guard against default drift re-introducing ipc_compression.
Dependency bumps: scirs2-core/stats/linalg 0.4.0 → 0.4.2; datafusion 53.0.0 → 53.1.0; tokio 1.50 → 1.52; rayon 1.11.0 → 1.12.0; rand 0.10.0 → 0.10.1; cranelift* 0.130.0 → 0.130.1; uuid 1.23.0 → 1.23.1; lru 0.16.3 → 0.17.0; toml 1.1.0 → 1.1.2; wasm-bindgen 0.2.114 → 0.2.118; js-sys/web-sys 0.3.91 → 0.3.95.
Fixed: pinned sha2 = "0.10" so Cargo resolves a digest 0.10.x shared with pbkdf2/aes-gcm (fixes a build error from sha2 0.10.9’s digest-contract shift); intra-doc link fix in src/io/excel.rs (rustdoc builds cleanly with -D warnings); refactored Excel I/O error handling and formatting (no behaviour change).
Intentional regressions (Pure Rust policy): zstd-compressed Parquet no longer readable on --features stable/parquet (Snappy/gzip/brotli/lz4 still work); DataFusion’s built-in xz/bz2/zstd auto-decompression for CSV/JSON readers disabled on distributed/flight/serving (plain + gzip still work).
Testing: 1809 tests passing (nextest, --all-features) and 117 doc tests passing. Zero clippy warnings with -D warnings. Rustdoc builds cleanly with -D warnings.
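Because of the CSV/JSON auto-decompression regression listed above, it can help to detect an unsupported compressed input before handing it to a reader. The following is a minimal sketch using only the standard library: the Compression enum and the sniff and readable_without_c_codecs functions are hypothetical names, not PandRS or DataFusion APIs. The magic numbers are the standard file signatures for gzip, bzip2, xz, and zstd.

```rust
/// Container formats this sketch can recognise from the first bytes of a file.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Compression {
    Plain,
    Gzip,
    Bzip2,
    Xz,
    Zstd,
}

/// Classify a file by its leading magic bytes; anything unrecognised
/// is treated as plain (uncompressed) data.
fn sniff(header: &[u8]) -> Compression {
    const GZIP: &[u8] = &[0x1f, 0x8b];
    const BZIP2: &[u8] = b"BZh";
    const XZ: &[u8] = &[0xfd, 0x37, 0x7a, 0x58, 0x5a, 0x00];
    const ZSTD: &[u8] = &[0x28, 0xb5, 0x2f, 0xfd];

    if header.starts_with(GZIP) {
        Compression::Gzip
    } else if header.starts_with(BZIP2) {
        Compression::Bzip2
    } else if header.starts_with(XZ) {
        Compression::Xz
    } else if header.starts_with(ZSTD) {
        Compression::Zstd
    } else {
        Compression::Plain
    }
}

/// Mirrors the release note: plain and gzip inputs still work on
/// distributed/flight/serving, while xz, bz2, and zstd do not.
fn readable_without_c_codecs(compression: Compression) -> bool {
    matches!(compression, Compression::Plain | Compression::Gzip)
}
```

Reading the first six bytes of an input and passing them to sniff is enough to produce a clear error message up front instead of a parse failure deeper in the reader.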

Tips

Turn on excel without guilt. Now that the path is Pure Rust, cargo add pandrs --features excel no longer drags in zip/flate2/miniz_oxide. The public xlsx API (ExcelCell, ExcelCellFormat, NamedRange, …) is unchanged, so existing code keeps compiling.
The default build is -sys-clean. Use cargo build --no-default-features for the leanest, most portable, easiest-to-cross-compile PandRS — zero -sys crates outside the unavoidable macOS timezone FFI.
If you need zstd Parquet, plan around it. It’s intentionally dropped on --features stable for Pure Rust purity. Use Snappy (the pandas default), gzip, brotli, or lz4, or re-encode your .parquet files with one of those codecs before loading, since Parquet compression is applied inside the file rather than around it.
cloud-storage still pulls ring (C+asm) via object_store 0.13.2 — that’s an upstream blocker, not a default. Leave it off unless you actually need cloud object stores.
Enable scirs2 for the scientific stack — it now rides scirs2-core/stats/linalg 0.4.2, keeping PandRS aligned with NumRS2 and SciRS2.
Watch the feature surface. --features distributed/flight still transitively pull a few C/compression crates because DataFusion hardcodes default-features = true on parquet upstream. If supply-chain purity is critical, stick to the default + parquet/stable set, which is clean.

This is the foundation

PandRS is the DataFrame layer of the COOLJAPAN scientific stack. It pairs with NumRS2 for arrays and SciRS2 for the broader scientific/AI primitives (now on 0.4.2). With 0.3.1, the data-loading floor of that stack leans on OxiARC for Pure Rust archive and compression, the same OxiARC ZIP and codec work that backs the rest of the ecosystem. Around it sit sibling projects: OptiRS for optimization, SkleaRS for classical ML, TenfloweRS and TrustformeRS for deep learning, OxiMedia for media/CV, and the lower-level OxiFFT / OxiZ / OxiBLAS / OxiCode crates. The point is sovereignty: every layer compiles from source, with no C/Cython/bundled-codec baggage.

Repository: https://github.com/cool-japan/pandrs

Star the repo if you want a pandas-class DataFrame without the hidden C compression libs or the Python GIL.

The era of “pip install pandas” — dragging in zlib, xz, and zstd C libraries you never asked for — is ending.

Pure Rust DataFrames are here — fast, safe, and sovereign.

KitaSan at COOLJAPAN OÜ, April 19, 2026