10 posts
OxiCUDA 0.2.0, the pure-Rust replacement for the NVIDIA CUDA Toolkit, lands the 'Wave AAA+64' expansion: adaptive RK45 with Richardson extrapolation, Extended Persistence and Discrete Morse theory (oxicuda-tda), Parametric UMAP, and Fisher Information estimation — plus a workspace-wide zero-unwrap reliability pass and 32,320 passing tests. ~783K lines across 73 crates. No CUDA SDK, no nvcc.
Pure-Rust CUDA Toolkit replacement: a maintenance release with numerical-stability refinements in the HMC variational sampler, stream-ordered allocator tuning, and TriMap reduction polish — 23,535 passing tests. No CUDA SDK, no nvcc.
Pure-Rust replacement for the entire NVIDIA CUDA Toolkit. 0.1.7 adds a SYR2K Tensor Core kernel (fused A×Bᵀ + B×Aᵀ rank-2k update) to oxicuda-blas, cross-subsystem CUDA kernel enhancements, and Multi-Operation Scheduling improvements. No CUDA SDK, no nvcc, no C/C++ toolchain.
Pure-Rust replacement for the NVIDIA CUDA Toolkit. OxiCUDA 0.1.6 adds a Tensor Core fast path for SYRK in oxicuda-blas and sixteen new ML crates (adversarial, SSL, continual, multimodal, 3D geometry, PINN, ANN, anomaly, causal, meta, MoE, NeRF, quantum, recsys, RLHF, tabular). No CUDA SDK, no nvcc.
The pure-Rust NVIDIA CUDA Toolkit replacement adds nine new GPU deep-learning crates — generative diffusion, graph neural nets, Mamba SSMs, vision transformers, audio/speech, time-series, Bayesian DL, federated learning, and NAS — growing to ~320K lines across 37 crates with 9,568 passing tests. No CUDA SDK, no nvcc.
A small maintenance release for OxiCUDA, the pure-Rust replacement for the NVIDIA CUDA Toolkit. Workspace-wide documentation and quality improvements, with all 28 crates aligned to 0.1.4 so the stack ships in lockstep. The only runtime dependency is the NVIDIA driver.
A quality-and-docs maintenance release for the pure-Rust NVIDIA CUDA Toolkit replacement — workspace-wide polish, internal version alignment to 0.1.3, and continued growth to ~260K lines of safe Rust across 28 crates. The only runtime dependency is still the NVIDIA driver.
Complete, type-safe, memory-safe rewrite of the entire NVIDIA CUDA Toolkit in pure Rust. cuBLAS/cuDNN/cuFFT/cuSPARSE/cuSOLVER/cuRAND and more — all in 253k SLoC across 28 crates. Only runtime dependency is the NVIDIA driver. PTX codegen + autotuner, 7 GPU backends (Metal/Vulkan/WebGPU/ROCm/LevelZero). ≥90–95% of native CUDA performance. The sovereign GPU computing layer for SciRS2 and the entire COOLJAPAN ecosystem (now 21M+ SLoC total).
First patch on the pure-Rust NVIDIA CUDA Toolkit replacement: six new oxicuda-blas elementwise activations (HardSigmoid, HardSwish, Softplus, LeakyRelu, Ceil, Floor) plus substantial ROCm/Vulkan/WebGPU backend growth. ~248K lines across 28 crates.
OxiCUDA 0.1.0 is a pure-Rust, type-safe, memory-safe replacement for the entire NVIDIA CUDA Toolkit software stack — cuBLAS, cuDNN, cuFFT, cuSPARSE, cuSOLVER, cuRAND and more in ~239K lines across 28 crates. The only runtime dependency is the NVIDIA driver. PTX code generation plus a built-in autotuner, all from safe Rust.