
OxiFFT 0.3.2 Released — Multi-Rank 3D Pencil FFT + a Default-Off AVX-512 Gate

OxiFFT is a Pure Rust FFT library and a rustfft replacement. Version 0.3.2 completes multi-rank MPI execution for the 3D pencil decomposition, gates AVX-512 codelets behind a default-off feature so the build stays clean on rustc 1.95 stable, hardens ND-plan error handling, and refreshes the OxiCUDA GPU backend.

Tags: release, oxifft, fft, mpi, avx512, simd, rust, rustfft

Distributed 3D FFT now scales across many ranks — and the default build stays warning-free on the latest stable Rust.

Today we released OxiFFT 0.3.2 — a hardening release that finishes multi-rank 3D pencil FFT execution and gates AVX-512 codelets behind a default-off feature so OxiFFT keeps building cleanly on rustc 1.95 stable.

No C. No Fortran. No FFTW. No FFI. OxiFFT is Pure Rust to the core, with default features that are 100% Rust — it compiles to a single static binary or to WASM, and displaces FFTW3 and rustfft for in-process transforms (and FFTW-MPI for distributed ones). The spectral backbone for the SciRS2 signal and audio stack carries no native baggage.

Why 0.3.2 matters

This is a maintenance and hardening release, focused on finishing distributed 3D FFT at scale and keeping the default build pristine on the newest stable toolchain.

Technical Deep Dive

Multi-rank 3D pencil FFT

plan_3d_pencil now supports multi-rank MPI execution with full forward/inverse pencil decomposition. A slab decomposition splits the volume along a single axis, which caps the usable rank count at the length of that axis. A pencil decomposition splits along two axes, so each rank owns a “pencil” that runs the full length of the third axis, and 3D transforms scale to far higher rank counts before communication dominates.
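To make the scaling difference concrete, here is a small, self-contained sketch of the rank-count arithmetic. It does not call OxiFFT; the function names are hypothetical, and the choice of which axes are split (x for slabs, x and y for pencils) and the even-division requirement are assumptions made for illustration, not a description of how plan_3d_pencil lays out data.

```rust
/// Slab decomposition split along x only: at most one x-plane per rank.
fn max_ranks_slab(nx: usize) -> usize {
    nx
}

/// Pencil decomposition split along x and y: at most one (x, y) column per rank.
fn max_ranks_pencil(nx: usize, ny: usize) -> usize {
    nx * ny
}

/// Local block owned by each rank on a `p_rows x p_cols` process grid.
/// Returns `None` if the grid is empty or does not divide the volume evenly
/// (an assumption of this sketch; real libraries may allow uneven splits).
fn pencil_local_shape(
    dims: (usize, usize, usize),
    grid: (usize, usize),
) -> Option<(usize, usize, usize)> {
    let (nx, ny, nz) = dims;
    let (p_rows, p_cols) = grid;
    if p_rows == 0 || p_cols == 0 || nx % p_rows != 0 || ny % p_cols != 0 {
        return None;
    }
    // Each rank keeps the full z extent: that is the "pencil".
    Some((nx / p_rows, ny / p_cols, nz))
}

fn main() {
    let (nx, ny, nz) = (256, 256, 256);
    println!("slab:   up to {} ranks", max_ranks_slab(nx));
    println!("pencil: up to {} ranks", max_ranks_pencil(nx, ny));
    println!(
        "local block on a 16 x 16 grid: {:?}",
        pencil_local_shape((nx, ny, nz), (16, 16))
    );
}
```

For a 256³ volume, the slab bound is 256 ranks while the pencil bound is 65,536, which is why the pencil layout keeps scaling after slabs run out of planes to hand out.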

Alongside the executor work, plan_nd error handling for ND FFT plans was expanded, and the error handling for the row and column pools used in the pencil path was refactored into cleaner, more predictable paths. The relevant code lives in oxifft/src/mpi/plans/plan_3d_pencil.rs and oxifft/src/mpi/plans/plan_nd.rs.

AVX-512 feature gate

rustc 1.95 stable treats #[target_feature(enable = "avx512*")] as unstable (rust-lang/rust#44839). To keep the default build clean, AVX-512 codelets and their runtime dispatchers are now gated behind a new default-off avx512 feature.

Without --features avx512, builds fall through to the existing AVX2 / SSE / scalar dispatch paths, so the default build stays warning-free on stable with no loss of the fast path that most CPUs actually use. Enabling the feature changes neither the API nor the ABI; both oxifft and oxifft-codegen-impl were updated in concert. The gating touches oxifft/src/dft/codelets/simd/mod.rs and oxifft-codegen-impl/src/gen_simd/avx512.rs.
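The general shape of such a gate can be sketched in a few lines. This is an illustration of the pattern, not OxiFFT's dispatcher: the Kernel enum and select_kernel function are hypothetical names, and only the avx512 feature name comes from the release notes. The AVX-512 branch is compiled only when the Cargo feature is on; otherwise the function goes straight to the runtime checks for the narrower instruction sets.

```rust
/// Which codelet family a dispatcher would pick (hypothetical type).
#[allow(dead_code)] // some variants are unused when a cfg branch is compiled out
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kernel {
    Avx512,
    Avx2,
    Sse2,
    Scalar,
}

/// Pick the widest kernel that is both compiled in and supported by the CPU.
fn select_kernel() -> Kernel {
    // Compiled only with `--features avx512`; absent from the default build.
    #[cfg(all(feature = "avx512", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx512f") {
            return Kernel::Avx512;
        }
    }

    // Default paths: runtime detection, widest first.
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            return Kernel::Avx2;
        }
        if std::arch::is_x86_feature_detected!("sse2") {
            return Kernel::Sse2;
        }
    }

    // Portable fallback for every other target.
    Kernel::Scalar
}

fn main() {
    println!("selected kernel: {:?}", select_kernel());
}
```

Because the gated block disappears entirely in a default build, no AVX-512 attribute ever reaches the compiler, and the caller sees the same function signature whether or not the feature is enabled.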

GPU backend refresh

The optional OxiCUDA GPU backend was bumped 0.1.4 → 0.1.8 across four incremental updates (0.1.5, 0.1.6, 0.1.7, 0.1.8). This is a straightforward dependency refresh of the GPU stack — no changes to the OxiFFT API surface.

Getting Started

cargo add oxifft

A minimal forward FFT:

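use oxifft::{Complex, Direction, Flags, Plan};

fn main() {
    // Build a plan for a 1024-point forward DFT.
    let plan = Plan::dft_1d(1024, Direction::Forward, Flags::MEASURE)
        .expect("1024-pt plan");
    // Constant input: every sample is 1 + 0i.
    let input = vec![Complex::new(1.0_f64, 0.0); 1024];
    let mut output = vec![Complex::new(0.0_f64, 0.0); 1024];
    // Transform `input` into the separate `output` buffer.
    plan.execute(&input, &mut output);
}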
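As a sanity check on what a transform of this constant input should produce, the following dependency-free sketch computes the same spectrum with a naive O(n²) DFT. It does not use OxiFFT; naive_dft is a hypothetical helper, and the unnormalized forward convention X[k] = Σ x[j]·exp(−2πi·jk/n) is an assumption here, so check OxiFFT's documentation for its actual scaling.

```rust
use std::f64::consts::PI;

/// Naive, unnormalized forward DFT over (re, im) pairs:
/// X[k] = sum over j of x[j] * exp(-2*pi*i * j*k / n).
fn naive_dft(input: &[(f64, f64)]) -> Vec<(f64, f64)> {
    let n = input.len();
    (0..n)
        .map(|k| {
            let mut acc = (0.0, 0.0);
            for (j, &(re, im)) in input.iter().enumerate() {
                // Reduce j*k modulo n so the angle stays within one turn.
                let angle = -2.0 * PI * ((j * k) % n) as f64 / n as f64;
                let (s, c) = angle.sin_cos();
                // Complex multiply: (re + i*im) * (c + i*s).
                acc.0 += re * c - im * s;
                acc.1 += re * s + im * c;
            }
            acc
        })
        .collect()
}

fn main() {
    let n = 1024;
    // Same signal as the snippet above: every sample is 1 + 0i.
    let input = vec![(1.0_f64, 0.0_f64); n];
    let spectrum = naive_dft(&input);

    let max_other = spectrum[1..]
        .iter()
        .map(|&(re, im)| re.hypot(im))
        .fold(0.0_f64, f64::max);

    println!("bin 0 = {:?}", spectrum[0]);
    println!("largest non-DC magnitude = {max_other:e}");
}
```

Under that convention, all of the energy of a constant signal lands in bin 0 (about 1024 + 0i here) and every other bin is zero up to rounding, which gives a quick way to eyeball the output of any FFT on this input.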

AVX-512 is opt-in. On a toolchain where AVX-512 target features are allowed (nightly or otherwise unstable-friendly), enable it explicitly:

cargo add oxifft --features avx512

What’s New in 0.3.2

Tips

The foundation

OxiFFT is the spectral layer of the COOLJAPAN ecosystem. As of late May 2026 it sits beside mature siblings: SciRS2, NumRS2, OxiBLAS, OxiCUDA (its GPU backend), ToRSh, OxiWhisper, SkleaRS, TenfloweRS, TrustformeRS, and OxiPhysics. Across that stack, every transform stays Pure Rust from a single laptop to an MPI cluster. The OxiCUDA 0.1.8 dependency in this release is that integration in practice: an optional GPU path, outside the Pure Rust default features, that you bring in only when you want it.

Repository: https://github.com/cool-japan/oxifft

Star the repo if Pure Rust spectral computing belongs in your stack — and tell us how OxiFFT scales on your cluster.

Pure Rust spectral computing — fast, safe, and sovereign, from a laptop to an MPI cluster.

KitaSan at COOLJAPAN OÜ May 22, 2026
