Hot numeric kernels now specialize themselves at runtime — and your scientific computing can run in a browser tab.
Today we released SciRS2 0.4.1 — a patch release that hardens the just-in-time compilation layer in scirs2-core and ships a battle-tested WebGPU/WASM backend, on top of the 0.4.0 feature wave from ten days ago.
No C. No Fortran. No NumPy system dependencies. SciRS2 is a production-ready, pure-Rust replacement for the NumPy/SciPy/scikit-learn stack — and it compiles to a single static binary (or to WASM) that runs everywhere, from laptops to browsers to CUDA boxes. There is no pip install, no shared-library hunt, no glibc roulette. You ship one artifact and it runs.
Why SciRS2 0.4.1 matters
The Python numeric world has lived with the same compromise for a decade. Plain loops are slow, so you reach for Numba or NumExpr to bolt a JIT onto the runtime. GPU means CUDA, which means NVIDIA-only and a toolchain install. And “compute in the browser” has never really existed — you fall back to a server round-trip.
SciRS2 0.4.1 chips away at all three:
- Improved scirs2-core JIT compilation. The core’s just-in-time layer specializes hot numeric kernels at runtime — fusing element-wise operations, picking the right SIMD width for the host, and skipping interpreter and dispatch overhead. This release routes more operations through that path. Same public API, more throughput.
- 76-test WebGPU/WASM backend. Browser-side GPU compute is now a first-class, tested backend. Compile to wasm32, run real numeric workloads against the GPU inside a browser or Node.
- Distribution validation (78 tests, 15+ distributions). The statistical distributions are now verified to numerical accuracy against reference values, so the stats you build on are trustworthy.
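As a rough illustration of what "picking the right SIMD width for the host" can mean, the sketch below queries CPU features at runtime using only the Rust standard library. The function name `preferred_f64_lanes` and the lane-selection policy are assumptions made for this example; they are not taken from scirs2-core, whose actual dispatch logic may differ.

```rust
/// Hypothetical helper: how many f64 lanes fit in the widest SIMD
/// register the host CPU reports. Not part of the SciRS2 API.
fn preferred_f64_lanes() -> usize {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // 512-bit, 256-bit and 128-bit registers hold 8, 4 and 2 f64 values.
        if std::arch::is_x86_feature_detected!("avx512f") {
            return 8;
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return 4;
        }
        if std::arch::is_x86_feature_detected!("sse2") {
            return 2;
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        // NEON registers are 128 bits wide: 2 f64 lanes.
        if std::arch::is_aarch64_feature_detected!("neon") {
            return 2;
        }
    }
    // Scalar fallback for every other target (including wasm32).
    1
}

fn main() {
    println!("f64 lanes on this host: {}", preferred_f64_lanes());
}
```

The detection happens when the program runs, not when it is compiled, which is what lets a single binary adapt to the machine it lands on.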
All of this sits on top of the 0.4.0 release: Transformers and GANs in scirs2-neural, Bayesian MCMC/NUTS/HMC and survival analysis in scirs2-stats, NUFFT in scirs2-fft, constrained optimization, change-point detection, and more.
The scale is real, not aspirational. SciRS2 0.4.1 is 2.91M SLoC (2,908,818 lines across 7,640 files) spread across 32 workspace crates, with 25,863 tests passing (out of 36,475 total #[test] annotations). Tagline: Production-Ready Pure Rust Scientific Computing • No System Dependencies • 10-100x Performance Gains. The full passing suite runs on Apple M3 (ARM64), Linux x86_64, and Linux+CUDA. Windows builds succeed with some test failures. Zero warnings, by policy.
Technical Deep Dive
Core JIT (scirs2-core). Think of what NumExpr and Numba do for Python, but native to the library and on by default. When a numeric kernel runs hot, the JIT layer specializes it for the actual data and host: adjacent element-wise ops get fused into a single pass over memory (one load, one store instead of many), the SIMD width is chosen for the CPU you’re actually on, and the per-element dispatch overhead disappears. 0.4.1 widens the set of operations that take this fast path — you get the speedup without touching your code.
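To make the fusion idea concrete, here is a minimal hand-written sketch in plain Rust. It does not use the scirs2-core JIT and does not show its internals; the function names are hypothetical. It only contrasts an element-wise pipeline run as three separate passes (with temporaries) against the same arithmetic collapsed into one pass, which is the transformation the paragraph above describes.

```rust
/// Unfused: every step makes a full pass over memory and allocates a temporary.
fn square_affine_unfused(x: &[f64], a: f64, b: f64) -> Vec<f64> {
    let scaled: Vec<f64> = x.iter().map(|v| a * v).collect(); // pass 1: a * x
    let shifted: Vec<f64> = scaled.iter().map(|v| v + b).collect(); // pass 2: + b
    shifted.iter().map(|v| v * v).collect() // pass 3: square
}

/// Fused: one load and one store per element, no intermediate buffers.
fn square_affine_fused(x: &[f64], a: f64, b: f64) -> Vec<f64> {
    x.iter()
        .map(|v| {
            let t = a * v + b;
            t * t
        })
        .collect()
}

fn main() {
    let x: Vec<f64> = (0..8).map(|i| i as f64).collect();
    let unfused = square_affine_unfused(&x, 2.0, 1.0);
    let fused = square_affine_fused(&x, 2.0, 1.0);
    // Both versions compute (a * x + b)^2 element by element.
    assert_eq!(unfused, fused);
    println!("{:?}", fused);
}
```

The fused version touches each input element once and writes each output element once, so on large arrays it moves roughly a third of the memory traffic of the three-pass version.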
GPU & WASM. The WebGPU backend (76 tests) brings GPU compute to the browser and to Node via WASM — the same compute path, just targeting wasm32 and the GPU exposed through WebGPU. On Linux+CUDA, the native CUDA backend accelerates eigensolvers and kernels for server-class workloads.
Scientific. Linear algebra is backed by OxiBLAS (pure-Rust BLAS/LAPACK), so SVD, eigendecomposition, and the rest run with no OpenBLAS/LAPACK/MKL anywhere in the build. FFTs — including the NUFFT (type 1/2/3) added in 0.4.0 — run on OxiFFT. Statistical distributions (15+) are now validated to numerical accuracy.
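What "validated to numerical accuracy" can look like in practice is sketched below, assuming a simple table-driven check. This is not the SciRS2 test suite and it does not call the scirs2-stats API: the exponential CDF is written by hand, and the reference table, tolerance, and function names are all choices made for this example.

```rust
use std::f64::consts::LN_2;

/// Hand-written exponential CDF, F(x) = 1 - exp(-lambda * x) for x > 0.
/// `exp_m1` keeps precision when lambda * x is small.
fn exponential_cdf(x: f64, lambda: f64) -> f64 {
    if x <= 0.0 {
        0.0
    } else {
        -(-lambda * x).exp_m1()
    }
}

/// Illustrative reference table: (x, lambda, expected CDF value).
const REFERENCE: [(f64, f64, f64); 4] = [
    (0.0, 1.0, 0.0),
    (LN_2, 1.0, 0.5),               // the median of Exp(1) is ln 2
    (1.0, 1.0, 0.6321205588285577), // 1 - e^-1
    (1.0, 2.0, 0.8646647167633873), // 1 - e^-2
];

/// Largest absolute deviation from the reference values.
fn max_abs_error(table: &[(f64, f64, f64)]) -> f64 {
    table
        .iter()
        .map(|&(x, lambda, expected)| (exponential_cdf(x, lambda) - expected).abs())
        .fold(0.0, f64::max)
}

fn main() {
    let err = max_abs_error(&REFERENCE);
    assert!(err < 1e-12, "CDF deviates from reference by {err:e}");
    println!("max abs error: {err:e}");
}
```

Checking against fixed reference values, rather than against samples, keeps the test deterministic and makes the accuracy claim explicit in the tolerance.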
AI / ML. scirs2-neural carries the 0.4.0 deep-learning layer: Transformers (multi-head attention, positional encoding, encoder/decoder, layernorm) and a GAN framework (WGAN-GP, spectral norm, conditional GAN) — all pure Rust, all on the same numeric core.
Getting Started
Add it to your project:
cargo add scirs2
A first taste — pure-Rust SVD and a validated distribution:
use scirs2::prelude::*;
use ndarray::Array2;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let a = Array2::from_shape_vec((3, 3), vec![
        1.0, 2.0, 3.0,
        4.0, 5.0, 6.0,
        7.0, 8.0, 9.0,
    ])?;

    // OxiBLAS-backed SVD, pure Rust
    let (_u, s, _vt) = scirs2::linalg::decomposition::svd(&a)?;
    println!("Singular values: {:.4?}", s);

    // Sample from a validated distribution
    let normal = scirs2::stats::distributions::normal::Normal::new(0.0, 1.0)?;
    let samples = normal.random_sample(5, None)?;
    println!("Samples: {:.4?}", samples);

    Ok(())
}
What’s New in 0.4.1
This is an incremental release that polishes the 0.4.0 line, with no API breakage.
- Version bump 0.4.0 → 0.4.1.
- JIT compilation improvements in scirs2-core: enhanced just-in-time compilation infrastructure.
- WebGPU/WASM backend with 76 tests: browser-side GPU compute.
- Distribution validation: 78 tests, 15+ distributions verified to numerical accuracy.
- 25,863 tests passing across 32 crates. Zero warnings.
Tips
- Feed the JIT contiguous, fused work. Heavy element-wise pipelines benefit most from the improved JIT, so keep operations chained and arrays contiguous so the kernel fuser has something to collapse into a single pass.
- Run compute in the browser. Build for wasm32 and reach for the WebGPU backend to do real GPU compute in a browser or Node, with no server round-trip.
- Trust the validated distributions. For stats work, lean on the now-validated 15+ distributions; their numerical accuracy is verified by the 78-test suite.
- Drop-in upgrade from 0.4.0. Pin scirs2 = "0.4.1" in Cargo.toml; there’s no API breakage, so the upgrade is mechanical.

      [dependencies]
      scirs2 = "0.4.1"

- Turn on the GPU on CUDA boxes. On Linux+CUDA, enable the GPU features to accelerate eigensolvers and kernels for server-class workloads.
This is the foundation
As of March 28, 2026, the COOLJAPAN ecosystem has had a busy month — and SciRS2 is the numeric and scientific bedrock beneath it. The just-shipped SkleaRS (classical ML), TenfloweRS (deep learning), and TrustformeRS (transformers) all stand on this same pure-Rust core, alongside ToRSh, OxiONNX, and OxiWhisper. When those libraries do linear algebra, FFTs, or statistics, they do it through SciRS2 — which means through OxiBLAS, OxiFFT, OxiCode, and oxiarc-*, with no C or Fortran in sight.
Repository: https://github.com/cool-japan/scirs
Star the repo if you want a scientific computing and AI stack that ships as one binary and runs anywhere.
Pure Rust scientific computing and AI is here — fast, safe, and sovereign.
— KitaSan at COOLJAPAN OÜ March 28, 2026