
OxiBLAS 0.1.0 Released — Pure Rust BLAS/LAPACK, the Foundation for SciRS2

The first public release of OxiBLAS — a pure Rust implementation of BLAS and LAPACK. Full Level 1/2/3 BLAS, LU/Cholesky/QR/SVD/EVD, 9 sparse formats, f16/f128 precision, and DGEMM already matching OpenBLAS on large matrices. No C, no Fortran, no MKL.

Tags: release, oxiblas, blas, lapack, pure-rust, simd, scientific-computing, linear-algebra

The pure Rust linear algebra foundation has arrived.

Today we released OxiBLAS 0.1.0 — the first public release of a complete, pure Rust implementation of BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra PACKage).

No C. No Fortran. No external shared libraries. No FFI overhead. No build hell. Just clean, memory-safe linear algebra that compiles to a single static binary and runs everywhere — including WASM and no_std targets.

Why 0.1.0 matters

For decades, high-performance numerical computing in any language has meant linking against battle-tested but heavy C/Fortran libraries: OpenBLAS, Intel MKL, Apple Accelerate, Reference LAPACK. They are fast, but they bring real costs: C and Fortran toolchains, external shared libraries, FFI overhead, and painful builds.

OxiBLAS exists to give the Rust ecosystem, and the imminent SciRS2 scientific computing stack, a sovereign mathematical foundation that needs none of that. It is the linear algebra backend SciRS2 is being built on, released first so that the rest of the ecosystem can stand on safe, portable, all-Rust numerics.

And as a first release, it is already competitive: on large matrices, OxiBLAS DGEMM matches OpenBLAS.

For a 0.1.0 written entirely in Rust on top of core::arch intrinsics, that is a strong start.

Technical Deep Dive: How BLAS/LAPACK is rebuilt in pure Rust

OxiBLAS ships as a Cargo workspace of focused crates, re-exported through the unified oxiblas crate:

  1. Core (oxiblas-core): A custom SIMD abstraction over core::arch intrinsics with runtime feature detection (AVX-512F → AVX2/FMA → SSE4.2 on x86_64, NEON on AArch64, scalar fallback everywhere else). It also provides extended-precision scalars (f16, f128 quad precision), complex types (Complex32/Complex64), arena allocation, 64-byte aligned vectors, and optional rayon parallelism.

  2. Matrix (oxiblas-matrix): The Mat / MatRef / MatMut types with column-major storage for BLAS/LAPACK compatibility, plus views, diagonals, and a lazy expression layer for operation fusion.

  3. BLAS (oxiblas-blas): Complete Level 1 (11 ops: dot, axpy, nrm2, scal, swap, copy, rot, iamax, asum…), Level 2 (15 ops: gemv, ger, symv, trsv, banded and packed variants…), and Level 3 (gemm, syrk, trsm, symm, hemm, herk…). GEMM uses BLIS-style MC×KC×NC blocking with SIMD micro-kernels and prefetching. Tensor extras include Einstein summation across 24 patterns.

  4. LAPACK (oxiblas-lapack): LU (with partial and full pivoting), Cholesky, LDL^T, QR (with column pivoting), SVD, symmetric/general eigenvalue decomposition, Schur and Hessenberg, triangular/general/tridiagonal solvers, least squares, condition estimation, and matrix inversion.

  5. Sparse (oxiblas-sparse): 9 sparse formats (CSR, CSC, COO, ELL, DIA, BSR, BSC, HYB, SELL-C-σ), 10 iterative solvers (CG/PCG, BiCGStab, GMRES, MINRES, IDR(s), TFQMR, QMR, Block-CG, Block-GMRES), Lanczos/Arnoldi/IRAM eigensolvers, and a deep preconditioner suite (Jacobi, ILU0/ILUT/ILUTP, IC0/ICT, AMG, SPAI, AINV, Schwarz).
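
To make the runtime feature detection in the Core crate more concrete, here is a minimal sketch of tiered dispatch using only the Rust standard library. It is not OxiBLAS source code: SimdTier, detect_tier, dot_scalar, dot_avx2_fma, and dot are hypothetical names chosen for illustration, the sketch relies on std's detection macros rather than whatever OxiBLAS does on no_std targets, and only the AVX2/FMA tier gets a hand-written kernel. Every other tier falls back to the scalar path.

```rust
// Hypothetical sketch of tiered runtime SIMD dispatch; not OxiBLAS source code.

/// SIMD tiers in the order the post lists them, widest first.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[allow(dead_code)] // some variants are never constructed on a given architecture
enum SimdTier {
    Avx512,
    Avx2Fma,
    Sse42,
    Neon,
    Scalar,
}

/// Probe the running CPU and pick the widest tier it supports.
fn detect_tier() -> SimdTier {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("avx512f") {
            return SimdTier::Avx512;
        }
        if std::is_x86_feature_detected!("avx2") && std::is_x86_feature_detected!("fma") {
            return SimdTier::Avx2Fma;
        }
        if std::is_x86_feature_detected!("sse4.2") {
            return SimdTier::Sse42;
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return SimdTier::Neon;
        }
    }
    SimdTier::Scalar
}

/// Portable fallback: works on every target.
fn dot_scalar(x: &[f64], y: &[f64]) -> f64 {
    x.iter().zip(y).map(|(a, b)| a * b).sum()
}

/// AVX2 + FMA kernel: four f64 lanes per iteration, scalar loop for the tail.
/// Safety: the caller must have verified that the CPU supports AVX2 and FMA.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2,fma")]
unsafe fn dot_avx2_fma(x: &[f64], y: &[f64]) -> f64 {
    use std::arch::x86_64::*;
    let n = x.len().min(y.len());
    let chunks = n / 4;
    let mut sum = unsafe {
        let mut acc = _mm256_setzero_pd();
        for i in 0..chunks {
            let a = _mm256_loadu_pd(x.as_ptr().add(4 * i));
            let b = _mm256_loadu_pd(y.as_ptr().add(4 * i));
            acc = _mm256_fmadd_pd(a, b, acc); // acc = a * b + acc
        }
        let mut lanes = [0.0f64; 4];
        _mm256_storeu_pd(lanes.as_mut_ptr(), acc);
        lanes[0] + lanes[1] + lanes[2] + lanes[3]
    };
    for i in 4 * chunks..n {
        sum += x[i] * y[i];
    }
    sum
}

/// Dispatch to the best kernel this sketch implements for the detected tier.
fn dot(x: &[f64], y: &[f64]) -> f64 {
    match detect_tier() {
        #[cfg(target_arch = "x86_64")]
        SimdTier::Avx2Fma => unsafe { dot_avx2_fma(x, y) },
        // A real library would ship one kernel per tier; here the rest are scalar.
        _ => dot_scalar(x, y),
    }
}
```

Calling dot(&x, &y) on two equal-length slices returns the same value as dot_scalar for inputs where the products are exactly representable. A production library would typically cache the detected tier instead of probing on every call.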

Optional oxiblas-ndarray provides ndarray interop, and oxiblas-ffi exposes a C-ABI drop-in for existing BLAS/LAPACK call sites.
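
The MC×KC×NC blocking and column-major storage mentioned in the crate list can be sketched in plain Rust. This is an illustration under stated assumptions, not the OxiBLAS implementation: Blocking, DEFAULT_BLOCKING, and gemm_blocked are hypothetical names, the block sizes are placeholders rather than tuned values, scalar loops stand in for the SIMD micro-kernel, and packing and prefetching are left out entirely.

```rust
// Hypothetical sketch of BLIS-style cache blocking; not OxiBLAS source code.

/// Block sizes along the m, k, and n dimensions.
#[derive(Debug, Clone, Copy)]
struct Blocking {
    mc: usize,
    kc: usize,
    nc: usize,
}

/// Placeholder sizes; real values are tuned to the cache hierarchy.
#[allow(dead_code)]
const DEFAULT_BLOCKING: Blocking = Blocking { mc: 64, kc: 128, nc: 256 };

/// C (m×n) += A (m×k) * B (k×n).
/// All matrices are column-major with no padding: element (i, j) of an
/// r-row matrix lives at index i + j * r.
fn gemm_blocked(
    m: usize,
    n: usize,
    k: usize,
    a: &[f64],
    b: &[f64],
    c: &mut [f64],
    blk: Blocking,
) {
    assert!(blk.mc > 0 && blk.kc > 0 && blk.nc > 0);
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);

    // Loop order: NC-wide panel of B and C, then KC-deep slab, then MC-tall block of A.
    for jc in (0..n).step_by(blk.nc) {
        let nb = blk.nc.min(n - jc);
        for pc in (0..k).step_by(blk.kc) {
            let kb = blk.kc.min(k - pc);
            for ic in (0..m).step_by(blk.mc) {
                let mb = blk.mc.min(m - ic);
                // Scalar stand-in for the micro-kernel: update one mb×nb tile of C.
                for j in jc..jc + nb {
                    for p in pc..pc + kb {
                        let b_pj = b[p + j * k];
                        // Innermost loop walks down a column, which is contiguous.
                        for i in ic..ic + mb {
                            c[i + j * m] += a[i + p * m] * b_pj;
                        }
                    }
                }
            }
        }
    }
}
```

With the 2×3 and 3×2 matrices from the Getting Started example stored column-major as [1, 4, 2, 5, 3, 6] and [7, 9, 11, 8, 10, 12], the sketch fills a zeroed C with [58, 139, 64, 154], which is [[58, 64], [139, 154]] read column by column. The point of the blocking is that each tile of A and B is reused many times while it is still in cache; the arithmetic is unchanged.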

Getting Started

cargo add oxiblas

A first matrix multiply, straight from the prelude:

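use oxiblas::prelude::*;

fn main() {
    // A is 2×3 and B is 3×2, so C = A * B is 2×2.
    let a = Mat::from_rows(&[
        &[1.0, 2.0, 3.0],
        &[4.0, 5.0, 6.0],
    ]);
    let b = Mat::from_rows(&[
        &[7.0,  8.0],
        &[9.0, 10.0],
        &[11.0, 12.0],
    ]);
    let mut c = Mat::zeros(2, 2);

    // Following the usual BLAS convention, the scalars are alpha and beta:
    // C = alpha * A * B + beta * C, here with alpha = 1.0 and beta = 0.0.
    gemm(1.0, a.as_ref(), b.as_ref(), 0.0, c.as_mut());

    // Expected result: [[58, 64], [139, 154]]
    assert!((c[(0, 0)] - 58.0).abs() < 1e-10);
}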

Level 1 vector ops are just as direct:

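// This path names the oxiblas-blas crate directly, so that crate must be a dependency.
use oxiblas_blas::level1::{axpy, dot};

fn main() {
    let x = vec![1.0, 2.0, 3.0, 4.0];
    let y = vec![5.0, 6.0, 7.0, 8.0];
    assert_eq!(dot(&x, &y), 70.0); // 1*5 + 2*6 + 3*7 + 4*8

    let mut y = vec![1.0, 2.0, 3.0, 4.0];
    axpy(2.5, &[10.0, 20.0, 30.0, 40.0], &mut y); // y = 2.5*x + y
    // y is now [26.0, 52.0, 78.0, 104.0]
}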
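
For readers new to BLAS naming, the two Level 1 operations used above have simple definitions. The following plain-Rust reference versions are a sketch for checking results by hand; dot_ref and axpy_ref are hypothetical names and are not part of OxiBLAS.

```rust
// Hypothetical reference implementations; not part of the OxiBLAS API.

/// dot: the sum of x[i] * y[i] over all i.
fn dot_ref(x: &[f64], y: &[f64]) -> f64 {
    assert_eq!(x.len(), y.len());
    x.iter().zip(y).map(|(a, b)| a * b).sum()
}

/// axpy ("a times x plus y"): y[i] = alpha * x[i] + y[i], updated in place.
fn axpy_ref(alpha: f64, x: &[f64], y: &mut [f64]) {
    assert_eq!(x.len(), y.len());
    for (yi, xi) in y.iter_mut().zip(x) {
        *yi += alpha * *xi;
    }
}
```

With the inputs from the snippet above, dot_ref returns 70.0 and axpy_ref leaves y as [26.0, 52.0, 78.0, 104.0].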

What’s inside

This first release lands with 469+ passing library tests and roughly 154,600 lines of Rust across 314 files.

This is the foundation

OxiBLAS 0.1.0 is the linear algebra bedrock of the COOLJAPAN ecosystem. It is purpose-built as the BLAS/LAPACK engine for SciRS2, whose own first release is imminent; OxiBLAS ships first so that SciRS2 and everything above it can rest on pure Rust numerics from day one. It joins early COOLJAPAN siblings already in the wild (VoiRS for audio, TenRSo and TensorLogic for tensors, Spintronics, and Oxicode) as part of a sovereign, C/C++/Fortran-free stack.

Repository: https://github.com/cool-japan/oxiblas

Star the repo if you want high-performance scientific computing without the traditional toolchain headaches.

The era of “just link OpenBLAS” is starting to end.

Pure Rust numerical linear algebra is here — and it’s already fast, safe, and sovereign.

KitaSan at COOLJAPAN OÜ, December 28, 2025