A subtle dispatch bug, hunted down: OxiFFT’s SIMD fallback no longer recurses forever.
Today we released OxiFFT 0.1.3 — a correctness patch for the Pure Rust FFTW replacement that fixes an infinite-recursion bug on the SIMD fallback path.
No C. No Fortran. No FFI.
And now, no stack-overflowing recursion lurking in the fallback codelets.
OxiFFT remains a single Pure Rust crate that compiles to a static binary — or to WASM.
Why 0.1.3 matters
OxiFFT dispatches each transform size to a specialized SIMD codelet, with a scalar fallback for element types that aren’t f32 or f64. For the large radix sizes 512, 1024, and 4096, that fallback was wired wrong.
The fallbacks in notw_512_dispatch, notw_1024_dispatch, and notw_4096_dispatch called back into CooleyTukeySolver::execute — which re-entered the very same codelet dispatch they came from. For non-f32/f64 types the result was unbounded recursion and a stack overflow.
0.1.3 breaks the cycle: the fallback now calls CooleyTukeySolver::execute_dit_inplace directly, performing the iterative decimation-in-time transform without re-entering codelet dispatch. To make that possible, execute_dit_inplace was promoted to a public method on CooleyTukeySolver. The fix also drops an output buffer allocation that the fallback path no longer needs.
If your work uses extended-precision element types (for example, the f16-support / f128-support precisions) at sizes 512, 1024, or 4096, this is an important upgrade.
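The cycle described above can be modelled in a few lines. This is only a sketch under stated assumptions, not OxiFFT's code: the free functions, the Path enum, the Half type, the `fixed` flag, and the MAX_DEPTH guard are all hypothetical stand-ins, and the guard exists solely so the demo reports the re-entry instead of overflowing the stack.

```rust
use std::any::TypeId;

// Hypothetical model of the dispatch cycle. None of these items are OxiFFT's real API.

/// Which code path ended up handling the transform.
#[derive(Debug, PartialEq)]
enum Path {
    SimdCodelet,
    IterativeDit,
}

/// Stand-in for an element type that is neither f32 nor f64.
#[allow(dead_code)]
struct Half;

/// Demo-only guard: report the re-entry instead of overflowing the stack.
const MAX_DEPTH: usize = 8;

/// Stand-in for the solver entry point: the large radix sizes go to codelet dispatch.
fn execute<T: 'static>(n: usize, fixed: bool, depth: usize) -> Result<Path, String> {
    if depth > MAX_DEPTH {
        return Err(format!("dispatch re-entered {depth} times for n = {n}"));
    }
    match n {
        512 | 1024 | 4096 => codelet_dispatch::<T>(n, fixed, depth),
        _ => Ok(execute_dit_inplace(n)),
    }
}

/// Stand-in for a notw_*_dispatch function: SIMD for f32/f64, fallback otherwise.
fn codelet_dispatch<T: 'static>(n: usize, fixed: bool, depth: usize) -> Result<Path, String> {
    let id = TypeId::of::<T>();
    if id == TypeId::of::<f32>() || id == TypeId::of::<f64>() {
        return Ok(Path::SimdCodelet);
    }
    if fixed {
        // Shape of the fix: go straight to the iterative path.
        Ok(execute_dit_inplace(n))
    } else {
        // Shape of the bug: re-enter the entry point, which dispatches back here.
        execute::<T>(n, fixed, depth + 1)
    }
}

/// Stand-in for the iterative DIT path; it never calls back into dispatch.
fn execute_dit_inplace(_n: usize) -> Path {
    Path::IterativeDit
}

fn main() {
    println!("f64,  old fallback: {:?}", execute::<f64>(1024, false, 0));
    println!("Half, old fallback: {:?}", execute::<Half>(1024, false, 0));
    println!("Half, new fallback: {:?}", execute::<Half>(1024, true, 0));
}
```

In this model f32 and f64 never reach the fallback branch, which is why only other element types at the three large sizes hit the cycle.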
What’s New in 0.1.3
Fixed
- SIMD fallback infinite recursion: fixed unbounded recursion in the notw_512_dispatch, notw_1024_dispatch, and notw_4096_dispatch SIMD fallback paths.
  - The fallback for non-f32/f64 types previously called CooleyTukeySolver::execute, which dispatched back into the same codelet, causing infinite recursion.
  - It now calls CooleyTukeySolver::execute_dit_inplace directly to run the iterative DIT transform without re-entering codelet dispatch.
  - execute_dit_inplace was made public on CooleyTukeySolver to support this fix.
  - Removed an unnecessary output buffer allocation in the fallback paths.
Changed
- License consolidation: the dual LICENSE-APACHE + LICENSE-MIT files are consolidated into a single LICENSE file (Apache-2.0).
Getting Started
cargo add oxifft
The FFT API is unchanged — a forward then inverse 1D transform:
use oxifft::api::{fft, ifft};
use oxifft::Complex;

fn main() {
    // Real-valued test signal stored as complex samples.
    let input: Vec<Complex<f64>> = (0..16)
        .map(|k| Complex::new((k as f64).sin(), 0.0))
        .collect();

    let spectrum = fft(&input); // time -> frequency
    let recovered = ifft(&spectrum); // frequency -> time

    // Largest pointwise distance between the input and the roundtrip result.
    let max_error: f64 = input
        .iter()
        .zip(&recovered)
        .map(|(a, b)| (a.re - b.re).hypot(a.im - b.im))
        .fold(0.0, f64::max);

    println!("max roundtrip error: {max_error:.2e}");
}
Tips
- Upgrade if you transform at 512 / 1024 / 4096 with non-f32/f64 types. That’s the exact configuration the old fallback recursed on; 0.1.3 makes it terminate correctly.
- f32 and f64 were never affected. Those types hit the real SIMD codelets, not the fallback, but upgrading is still recommended for everyone.
- execute_dit_inplace is now public. If you reach into CooleyTukeySolver, you can call the iterative in-place DIT path directly for non-f32/f64 element types.
- Note the license change. OxiFFT is now Apache-2.0 under a single LICENSE file; update any license metadata or audit tooling that expected the old dual LICENSE-APACHE / LICENSE-MIT layout.
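For readers unfamiliar with the term, here is a rough sketch of what an iterative in-place decimation-in-time transform looks like. It is a textbook radix-2 version written for this post, not OxiFFT's execute_dit_inplace, whose signature is not shown here. The Cpx type and the dit_inplace function are hypothetical, and f64 is used only to keep the example short.

```rust
use std::f64::consts::PI;

// Hypothetical minimal complex type so the sketch has no dependencies.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Cpx {
    re: f64,
    im: f64,
}

fn mul(a: Cpx, b: Cpx) -> Cpx {
    Cpx {
        re: a.re * b.re - a.im * b.im,
        im: a.re * b.im + a.im * b.re,
    }
}

/// Forward, unnormalized radix-2 DIT FFT, done in place with loops only.
/// The length must be a power of two (lengths 0 and 1 are left untouched).
fn dit_inplace(data: &mut [Cpx]) {
    let n = data.len();
    if n <= 1 {
        return;
    }
    assert!(n.is_power_of_two(), "length must be a power of two");

    // Step 1: bit-reversal permutation of the input order.
    let bits = n.trailing_zeros();
    for i in 0..n {
        let j = i.reverse_bits() >> (usize::BITS - bits);
        if i < j {
            data.swap(i, j);
        }
    }

    // Step 2: butterfly passes over blocks of length 2, 4, ..., n.
    let mut len = 2;
    while len <= n {
        let angle = -2.0 * PI / len as f64;
        let w_len = Cpx { re: angle.cos(), im: angle.sin() };
        for start in (0..n).step_by(len) {
            let mut w = Cpx { re: 1.0, im: 0.0 };
            for k in 0..len / 2 {
                let u = data[start + k];
                let v = mul(data[start + k + len / 2], w);
                data[start + k] = Cpx { re: u.re + v.re, im: u.im + v.im };
                data[start + k + len / 2] = Cpx { re: u.re - v.re, im: u.im - v.im };
                w = mul(w, w_len);
            }
        }
        len *= 2;
    }
}

fn main() {
    // A unit impulse transforms to a flat spectrum.
    let mut signal = [Cpx { re: 0.0, im: 0.0 }; 8];
    signal[0].re = 1.0;
    dit_inplace(&mut signal);
    println!("{signal:?}");
}
```

Because the passes are plain loops over one buffer, there is no recursive call and no dispatch step to re-enter, which is the property the 0.1.3 fallback relies on.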
This is the foundation
OxiFFT is the Pure Rust spectral layer for the SciRS2 numerical stack and the mandated replacement for rustfft. It sits alongside SciRS2, NumRS2, OptiRS, PandRS, OxiBLAS, OxiCode, and same-day siblings OxiArc and OxiZ. Correctness patches like this one, chasing down a recursion bug on a rarely-trodden fallback path, are how a foundation earns trust.
Repository: https://github.com/cool-japan/oxifft
Star the repo if you want fast, safe FFTs without ever linking FFTW again.
Pure Rust spectral computing is here — fast, safe, and sovereign.
— KitaSan at COOLJAPAN OÜ February 12, 2026