The inside story of building the largest pure-Rust sovereignty stack
159 posts
Complete, type-safe, memory-safe rewrite of the entire NVIDIA CUDA Toolkit in pure Rust. cuBLAS/cuDNN/cuFFT/cuSPARSE/cuSOLVER/cuRAND and more, all in 253k SLoC across 28 crates. The only runtime dependency is the NVIDIA driver. PTX codegen + autotuner, 7 GPU backends (including Metal, Vulkan, WebGPU, ROCm, and LevelZero). ≥90–95% of native CUDA performance. The sovereign GPU computing layer for SciRS2 and the entire COOLJAPAN ecosystem (now 21M+ SLoC total).
Pure Rust implementation of the operator eml(x, y) = exp(x) − ln(y), which together with the constant 1 expresses all elementary mathematical functions as uniform binary trees. Includes a symbolic regression engine, lowering to an efficient IR, code generation, a CLI, SMT solving via OxiZ, and SIMD/parallel batch evaluation via oxiblas-core. 173 tests, zero FFI. The sovereign mathematical foundation for SciRS2 and the entire COOLJAPAN ecosystem (now 26M+ SLoC total).
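To make the eml idea concrete, here is a minimal sketch of how two familiar functions fall out of binary trees built only from eml and the constant 1. It uses plain f64 arithmetic from the Rust standard library; the function names `eml`, `exp_via_eml`, and `ln_via_eml` are illustrative assumptions and are not taken from the project's actual API.

```rust
/// The single binary operator from the summary: eml(x, y) = exp(x) - ln(y).
/// (Hypothetical helper name, not the crate's real API.)
fn eml(x: f64, y: f64) -> f64 {
    x.exp() - y.ln()
}

/// exp(x) as a one-node tree: eml(x, 1) = exp(x) - ln(1) = exp(x).
fn exp_via_eml(x: f64) -> f64 {
    eml(x, 1.0)
}

/// ln(x), for x > 0, as a three-node tree:
///   eml(1, x)          = e - ln(x)
///   eml(e - ln(x), 1)  = exp(e - ln(x)) = e^e / x
///   eml(1, e^e / x)    = e - (e - ln(x)) = ln(x)
fn ln_via_eml(x: f64) -> f64 {
    eml(1.0, eml(eml(1.0, x), 1.0))
}
```

Calling `exp_via_eml(2.5)` and `ln_via_eml(2.5)` should agree with `2.5_f64.exp()` and `2.5_f64.ln()` up to floating-point rounding. Every node in both trees is the same operator, which is presumably what "uniform" refers to in the summary.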
Complete GGUF loading + 25 quantized formats + OpenAI-compatible API server — all in pure Rust. 56.2k SLoC, 11 crates, no C/C++/Fortran, built on SciRS2/OxiBLAS/OxiFFT. ≥80% of llama.cpp throughput, WASM/GPU/Python bindings, LLaMA/Mistral/Gemma/Phi/LLaVA support. The sovereign LLM inference layer for SciRS2 and the entire COOLJAPAN ecosystem (now 21M+ SLoC total).
Patent-free, memory-safe multimedia and computer-vision framework in pure Rust. Full reconstruction of FFmpeg (codecs, containers, streaming) + OpenCV (detection, tracking, enhancement). 2.65M+ SLoC, 106 stable crates, GPU (wgpu), WASM, async pipelines, royalty-free codecs only. High-performance media processing without C dependencies. The sovereign media layer for SciRS2 and the entire COOLJAPAN ecosystem (now 21M+ SLoC total).
First patch on the pure-Rust NVIDIA CUDA Toolkit replacement: six new oxicuda-blas elementwise activations (HardSigmoid, HardSwish, Softplus, LeakyRelu, Ceil, Floor) plus substantial ROCm/Vulkan/WebGPU backend growth. ~248K lines across 28 crates.
High-performance IPFS-based decentralized content delivery network with dynamic rewards, bandwidth proofs, gamification, and anti-fraud. 198k+ SLoC pure Rust, Tauri desktop node, web creator portal, OxiARC compression. Node operators earn real rewards for bandwidth and storage. The sovereign content distribution layer for SciRS2 and the entire COOLJAPAN ecosystem (now 26M+ SLoC total).
An 8B-parameter language model at roughly 1 bit per weight, running from a single static Rust binary with no llama.cpp, no BLAS, no C/C++/Fortran. OxiBonsai 0.1.0 debuts sub-2-bit Pure Rust sovereign AI inference for the COOLJAPAN ecosystem — SIMD-accelerated, Rayon-parallel, and OpenAI-compatible out of the box.
OxiCUDA 0.1.0 is a pure-Rust, type-safe, memory-safe replacement for the entire NVIDIA CUDA Toolkit software stack — cuBLAS, cuDNN, cuFFT, cuSPARSE, cuSOLVER, cuRAND and more in ~239K lines across 28 crates. The only runtime dependency is the NVIDIA driver. PTX code generation plus a built-in autotuner, all from safe Rust.
SciRS2 0.4.2 — pure-Rust SciPy/NumPy/scikit-learn replacement, 2.94M SLoC, 27,632 tests, 29 crates. This release adds Neural Architecture Search, CMA-ES, Mamba SSM, async/unified GPU memory, H-matrix compression, streaming FFT, DLPack zero-copy, and Apache Iceberg/DataFusion IO. No C/Fortran.
The first public release of OxiPhysics: a unified, pure-Rust physics engine targeting Bullet (rigid body), OpenFOAM (CFD), LAMMPS (molecular dynamics), and CalculiX (FEM) in one workspace. 17 crates spanning collision, SPH/LBM fluids, FEM, MD, soft body, materials, and visualization — with zero todo!()/unimplemented!() stubs and no C or Fortran in default features.
OxiZ 0.2.0 ships an ergonomic EasySolver builder, no_std support for bare-metal/zkVM (RISC-V), a Pure-Rust ML heuristics crate (oxiz-ml), Skolemization, a modular WASM js_api, and 100% Z3 parity across 88 benchmarks. Zero C/C++.
OxiRS 0.2.4 ships a production-hardening pass for the Rust-native Semantic Web stack: a completed unwrap() audit, GPU feature-gating for Pure-Rust default builds, CLI flag fixes, and a fresh SciRS2 0.4 / OxiARC dependency refresh — 40,786 tests, zero warnings.
Production-ready, binary-level compatible drop-in replacement for Python Celery. 18 crates (100% complete), 4,075 tests, 10× throughput, type-safe macros, 5 brokers + 3 backends, Canvas workflows, DLQ, priority queues, task cancellation, and full observability. <50 MB memory, 10,000 tasks/sec per worker. The sovereign distributed task queue layer for SciRS2 and the entire COOLJAPAN ecosystem (now 26M+ SLoC total).
SciRS2 0.4.1 is a pure-Rust SciPy/NumPy/scikit-learn replacement: 2.91M SLoC, 25,863 tests, 32 crates. This release improves JIT compilation in scirs2-core, ships a 76-test WebGPU/WASM backend for browser GPU compute, and validates 15+ distributions to numerical accuracy. No C. No Fortran.
QuantRS2 0.1.3, the pure-Rust quantum framework — real KAK/ZYZ/holonomic decompositions, parameter-shift VQE gradients, ZX-calculus rewrites, SABRE routing, MPS tensor-network compression, ML circuit optimizers, PyO3 0.26, and a ~210-unwrap robustness pass.
High-performance ONNX runtime written entirely in pure Rust. Zero C/C++ dependencies, 147 operators fully supported, wgpu GPU acceleration, SIMD (AVX2/NEON), WASM + no_std ready, graph optimizer, async execution, model encryption. 30k+ SLoC, 590+ tests. The sovereign ONNX inference layer for SciRS2 and the entire COOLJAPAN ecosystem (now 21M+ SLoC total).
SciPy-compatible scientific computing and AI framework in 100% Pure Rust. 2.91M SLoC, 29 crates, 25,800+ tests. Flash Attention 2, LoRA/DoRA/GPTQ, ONNX export, GPU PDE/FFT/SpMV, Temporal GNNs, NeRF/instant-NGP, WebGPU backend, Delta Lake / Kafka I/O and more. 10–100× faster, zero system deps. The sovereign scientific computing and AI foundation for the entire COOLJAPAN ecosystem (now 21M+ SLoC total).
Production-grade pure Rust Text-to-Speech (TTS), Voice Recognition, and Sound framework. VITS + HiFi-GAN/DiffWave vocoders, real-time ≤0.05× RTF on GPU, streaming synthesis, SSML, 20+ languages, ONNX/Kokoro-82M support, SafeTensors checkpoints. Full integration with SciRS2/NumRS2. WASM, GPU (CUDA/Metal), Python/FFI bindings. The sovereign speech AI layer for the entire COOLJAPAN ecosystem (now 21M+ SLoC total).
Next-generation distributed mail server written entirely in Rust. Full SMTP/IMAP/JMAP/POP3 support, Mailet-based processing pipeline, AI integration (OxiFY), legal archiving (Legalis-RS), and AmateRS distributed storage. 16 crates, 1,942 tests, 10–50 MB memory footprint, >50,000 messages/second throughput. The sovereign mail infrastructure layer for SciRS2 and the entire COOLJAPAN ecosystem (now 21M+ SLoC total).
High-performance numerical computing library in pure Rust — the production-grade NumPy alternative. 222k+ SLoC, 4,704+ tests, 128+ SIMD-vectorized functions, N-dimensional arrays, advanced linalg via OxiBLAS, automatic differentiation, FFT, GPU (wgpu), Python bindings (PyO3), Arrow interop. 80–172% of OpenBLAS performance, zero C/Fortran deps. The sovereign numerical layer for SciRS2 and the entire COOLJAPAN ecosystem (now 21M+ SLoC total).
12 archive formats and 10 compression algorithms implemented from scratch in pure Rust. DEFLATE up to 400 MB/s, SIMD-accelerated CRC (3–4.5× faster), full async streaming, Brotli + Snappy support, LZH (Japanese legacy) full support. ~47k SLoC, 12 crates, 1,041 tests. The sovereign data packaging layer for SciRS2 and the entire COOLJAPAN ecosystem (now 21M+ SLoC total).
ToRSh is a PyTorch-compatible deep-learning framework in pure Rust with native tensor sharding. The 0.1.1 release hardens the 33-crate workspace onto consistent, published crates.io dependencies and adds the new torsh-convert model-converter CLI.
Production-grade machine learning optimization library with 22 optimizers (SGD→K-FAC), SIMD 2–4×, parallel 4–8×, GPU 10–50× acceleration. 251K+ SLoC, 7 crates, 1,220+ tests. Full extension of SciRS2-Core — no external deps allowed. The sovereign optimizer layer for SciRS2 and the entire COOLJAPAN ecosystem (now 21M SLoC total).
OxiArc 0.2.5 adds two brand-new compression crates — Brotli (RFC 7932, quality 0-11) and Snappy (block + framed with CRC32C) — bringing the workspace to 12 crates and 12 container formats. Plus DEFLATE/GZip/Zlib streaming with flush modes, LZW streaming, and an EntryBuilder fluent API. 1,038 tests, all passing. Pure Rust archive and compression — no C, no zlib, no libarchive.