PandRS 0.1.0 Released — A Pure Rust DataFrame Library with Pandas-Class APIs, First Stable Release

PandRS 0.1.0, the first stable release — a 100% Pure Rust DataFrame library with 70+ pandas-compatible methods, SIMD vectorization, Rayon parallelism, and up to 89% memory reduction. 1334+ tests, 175,000+ lines of Rust. No C, no Cython, no GIL.

release pandrs dataframe pandas data-analytics simd parallel-computing pure-rust

A pandas-class DataFrame for Rust — without the C extensions, the Cython, or the Python GIL that have shadowed data analysis in Python for over a decade.

Today we released PandRS 0.1.0 — the first stable release of a high-performance, Pure Rust DataFrame library with a pandas-class API, SIMD optimization, parallel processing, and an on-ramp to distributed computing.

No C. No Cython. No pandas/NumPy C-extensions. No Python GIL. The DataFrame world has been built for years on pandas and its native lineage: Cython-compiled hot loops, NumPy’s C core, and a global interpreter lock that quietly caps how far parallelism can go. That stack is powerful, but it is also why “just install pandas” can turn into wheel-and-compiler trouble, and why scaling out so often means working around the GIL.

PandRS takes a different path: it is plain Rust, it compiles to a single static binary (or WASM), and cargo add pandrs needs no Python, no system libraries, and no build toolchain beyond the Rust compiler. This first stable release follows a long alpha and beta period that began in spring 2025; it marks the point where we were ready to call the library solid.

Why PandRS 0.1.0 matters

If you have ever shipped a pandas pipeline, you know the friction: the GIL that keeps real parallelism just out of reach, C-extension wheels that fail to build on the one machine that matters, memory bloat on wide string-heavy frames, and no honest story for WASM or embedded targets. This first stable release matters because it removes that whole category of problem while keeping the API familiar — and the benchmarks are real.

Measured against pandas (Python) on an AMD Ryzen 9 5950X with 64GB RAM and NVMe storage:

These are the real numbers from this release. We would rather under-promise and let the benchmarks and the test suite speak: 1334+ tests pass with --all-targets --all-features, with zero clippy warnings under -D warnings.

Technical Deep Dive: how PandRS is built

PandRS is 175,000+ lines of Rust organized into a few honest layers.

1. Columnar core. The data model is built from Series, DataFrame, MultiIndex, and Categorical, backed by columnar storage. The columnar layout is what makes vectorized operations and string pooling possible, and it is the reason the 70+ pandas-compatible methods can target “100% Pandas API compatibility” for the core surface while running on a Rust foundation rather than a Python one.
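To make the link between columnar storage and string pooling concrete, here is a minimal, standard-library-only sketch of a dictionary-encoded string column. It is an illustration of the general technique, not PandRS's actual internals: the type PooledColumn and all of its methods are hypothetical names invented for this example.

```rust
use std::collections::HashMap;

/// Hypothetical dictionary-encoded string column (not a PandRS type).
/// Each distinct string is stored once in `pool`; every row holds only
/// a small integer code that indexes into the pool.
#[derive(Debug, Default)]
struct PooledColumn {
    pool: Vec<String>,            // distinct values, in first-seen order
    lookup: HashMap<String, u32>, // value -> code, used while building
    codes: Vec<u32>,              // one code per row
}

impl PooledColumn {
    fn from_values(values: &[&str]) -> Self {
        let mut col = PooledColumn::default();
        for &v in values {
            col.push(v);
        }
        col
    }

    /// Append one row, reusing the existing code if the string was seen before.
    fn push(&mut self, value: &str) {
        let code = match self.lookup.get(value) {
            Some(&code) => code,
            None => {
                let code = self.pool.len() as u32;
                self.pool.push(value.to_string());
                self.lookup.insert(value.to_string(), code);
                code
            }
        };
        self.codes.push(code);
    }

    /// Read a row back as a string slice, or None if the row is out of range.
    fn get(&self, row: usize) -> Option<&str> {
        self.codes.get(row).map(|&c| self.pool[c as usize].as_str())
    }

    fn len(&self) -> usize {
        self.codes.len()
    }

    fn distinct(&self) -> usize {
        self.pool.len()
    }
}

fn main() {
    let city = PooledColumn::from_values(&["Tokyo", "Osaka", "Tokyo", "Tokyo", "Osaka"]);
    println!("{} rows, {} distinct strings stored", city.len(), city.distinct());
    println!("row 2 = {:?}", city.get(2));
}
```

The saving comes from repeated values: a column with many rows but few distinct strings stores each string once plus one integer per row, which is the same idea a Categorical column relies on.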

2. Performance internals — SIMD + Rayon. Numeric kernels use automatic SIMD vectorization, and data-parallel work fans out across cores with Rayon. Columnar storage plus string pooling keeps memory tight, and lazy evaluation lets chains of operations fuse instead of materializing every intermediate frame. There is no GIL to step around, so parallelism is the default rather than the exception.
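The combination described above (a vectorizable kernel per chunk, with chunks fanned out across cores) can be sketched with nothing but the standard library. This is a hedged illustration, not PandRS code: std::thread::scope stands in for Rayon here so the example has no dependencies, and the function names scale_add_chunk and parallel_scale_add are made up for this sketch.

```rust
use std::thread;

/// Element-wise kernel over one chunk: out[i] = input[i] * scale + offset.
/// A branch-free loop over equal-length slices like this is the kind of code
/// the compiler can typically auto-vectorize in optimized builds.
fn scale_add_chunk(input: &[f64], out: &mut [f64], scale: f64, offset: f64) {
    for (o, &x) in out.iter_mut().zip(input) {
        *o = x * scale + offset;
    }
}

/// Split the column into roughly `workers` contiguous chunks and run the
/// kernel on each chunk in its own scoped thread. (Assumption: plain std
/// threads are used as a stand-in for a Rayon thread pool.)
fn parallel_scale_add(values: &[f64], scale: f64, offset: f64, workers: usize) -> Vec<f64> {
    let mut out = vec![0.0; values.len()];
    if values.is_empty() {
        return out;
    }
    let workers = workers.max(1);
    // Ceiling division so every element lands in some chunk.
    let chunk_len = (values.len() + workers - 1) / workers;
    thread::scope(|scope| {
        // Input and output are chunked identically, so each thread owns a
        // disjoint slice of the output and no locking is needed.
        for (inp, dst) in values.chunks(chunk_len).zip(out.chunks_mut(chunk_len)) {
            scope.spawn(move || scale_add_chunk(inp, dst, scale, offset));
        }
    });
    out
}

fn main() {
    let values: Vec<f64> = (0..8).map(|i| i as f64).collect();
    let scaled = parallel_scale_add(&values, 2.0, 1.0, 4);
    println!("{:?}", scaled);
}
```

Because each worker writes to its own slice of a contiguous column, the borrow checker can verify at compile time that the threads never touch the same memory. A work-stealing pool such as Rayon would additionally balance uneven chunks, which this sketch does not attempt.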

3. Modular helper layout. The method families live in focused helper modules so the codebase stays under control and easy to navigate: helpers/window_ops.rs (rolling and expanding windows, ewm), helpers/string_ops.rs (the str_* family), helpers/math_ops.rs, helpers/aggregations.rs (groupby aggregations), and helpers/comparison_ops.rs. Concretely, that means window functions (rolling_mean/sum/var/median, expanding_*, ewm), groupby (groupby/agg/transform/groupby_apply), rich statistics (describe, corr/cov, geometric_mean, trimmed_mean), string ops (str_contains/str_replace/str_split and friends), and missing-data handling (fillna/ffill/bfill/dropna/isna).
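For readers who have not used these method families, here is what two of them compute, written as free-standing Rust functions over plain slices. This is only a sketch of the semantics under stated assumptions: the signatures below are invented for illustration and are not the PandRS method signatures, and the choice to emit None until the window is full follows pandas' default behaviour for fixed windows.

```rust
/// Rolling mean over a fixed window. Positions with fewer than `window`
/// observations yield None (assumption: mirrors pandas' default, where
/// min_periods equals the window size).
fn rolling_mean(values: &[f64], window: usize) -> Vec<Option<f64>> {
    if window == 0 {
        return vec![None; values.len()];
    }
    let mut out = Vec::with_capacity(values.len());
    let mut running = 0.0;
    for (i, &v) in values.iter().enumerate() {
        running += v;
        if i >= window {
            // Drop the value that just slid out of the window.
            running -= values[i - window];
        }
        if i + 1 >= window {
            out.push(Some(running / window as f64));
        } else {
            out.push(None);
        }
    }
    out
}

/// Forward fill: replace each missing value with the last value seen.
/// Leading missing values stay None because there is nothing to carry forward.
fn ffill(values: &[Option<f64>]) -> Vec<Option<f64>> {
    let mut last = None;
    values
        .iter()
        .map(|v| {
            if v.is_some() {
                last = *v;
            }
            last
        })
        .collect()
}

fn main() {
    println!("{:?}", rolling_mean(&[1.0, 2.0, 3.0, 4.0], 2));
    println!("{:?}", ffill(&[None, Some(1.0), None, Some(4.0), None]));
}
```

The running-sum form keeps the rolling mean at one pass over the data regardless of window size. It accumulates floating-point error on very long series, so a production kernel may recompute or compensate periodically.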

4. I/O and optional feature surface. Out of the box PandRS reads and writes CSV (parallel reader/writer), Parquet (with compression), JSON (records and columnar), and Excel (XLSX/XLS), with SQL (PostgreSQL/MySQL/SQLite) behind the sql feature and zero-copy Arrow interop. Beyond that, optional features cover distributed (DataFusion), GPU (CUDA), JIT (Cranelift), visualization (text-based plus plotters), streaming, model serving, and WASM — each one opt-in, so the default build stays lean and Pure Rust.
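The difference between the two JSON orientations mentioned above is easiest to see side by side. The sketch below builds both by hand for a tiny two-column frame. It is an assumption-laden illustration, not the PandRS writer: the function names are hypothetical, the exact layout PandRS emits may differ, and string values are assumed to contain no characters that need JSON escaping.

```rust
/// Row-oriented ("records") JSON: one object per row.
/// Assumption: names contain no quotes or backslashes, so no escaping is done.
fn to_json_records(names: &[&str], ages: &[i64]) -> String {
    let rows: Vec<String> = names
        .iter()
        .zip(ages)
        .map(|(n, a)| format!("{{\"name\":\"{}\",\"age\":{}}}", n, a))
        .collect();
    format!("[{}]", rows.join(","))
}

/// Column-oriented JSON: one array per column, keyed by column name.
fn to_json_columnar(names: &[&str], ages: &[i64]) -> String {
    let name_items: Vec<String> = names.iter().map(|n| format!("\"{}\"", n)).collect();
    let age_items: Vec<String> = ages.iter().map(|a| a.to_string()).collect();
    format!(
        "{{\"name\":[{}],\"age\":[{}]}}",
        name_items.join(","),
        age_items.join(",")
    )
}

fn main() {
    let names = ["Alice", "Bob"];
    let ages = [30, 25];
    println!("{}", to_json_records(&names, &ages));
    println!("{}", to_json_columnar(&names, &ages));
}
```

Records repeat every column name in every row, which is convenient for row-at-a-time consumers; the columnar form states each name once and maps directly onto column-major storage.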

Getting Started

cargo add pandrs

use pandrs::{DataFrame, Series};

fn main() -> pandrs::error::Result<()> {
    let mut df = DataFrame::new();
    df.add_column(
        "name".to_string(),
        Series::from_vec(vec!["Alice", "Bob", "Carol"], Some("name")),
    )?;
    df.add_column(
        "age".to_string(),
        Series::from_vec(vec![30, 25, 35], Some("age")),
    )?;
    df.add_column(
        "salary".to_string(),
        Series::from_vec(vec![75000.0, 65000.0, 85000.0], Some("salary")),
    )?;

    // Filter rows with a string predicate
    let adults = df.filter("age > 25")?;
    // Aggregate a column; the mean is taken over all rows of `df`, not just `adults`
    let mean_age = df.column("age")?.mean()?;
    println!("{} rows with age > 25, overall mean age {:.1}", adults.shape().0, mean_age);
    Ok(())
}

What’s inside

Tips

This is the foundation

PandRS 0.1.0 ships as part of a brand-new Pure Rust scientific stack. It lands the same day as NumRS2 (the NumPy layer) and OptiRS (the optimizer layer), one day after SciRS2 (0.1.0, 2025-12-29) laid down the foundation. That makes three sovereign layers (DataFrame, NumPy, and optimizer) arriving together alongside the SciRS2 base. PandRS is the DataFrame core of that stack: deeper integration across these projects is on the roadmap, but from day one it stands on its own as a fast, native DataFrame engine.

Repository: https://github.com/cool-japan/pandrs

Star the repo if a Pure Rust, pandas-class DataFrame — without the C, the Cython, or the GIL — is something you have been waiting for.

Pure Rust DataFrames are here — fast, safe, and sovereign.

KitaSan at COOLJAPAN OÜ, December 30, 2025
