OxiSound 0.2.0 — Pure-Rust Audio Device I/O for Playback and Capture

Getting sound out of a speaker — or into a microphone — should not require a C toolchain.

Today we’re releasing OxiSound 0.2.0, the COOLJAPAN Pure-Rust audio device I/O layer: a cross-platform API for opening output and input streams on whichever audio stack the operating system provides, with playback, capture, duplex, low-latency callbacks, MIDI, SMF, and OSC all in one place.

No PortAudio. No hand-rolled -sys glue against ALSA, CoreAudio, and WASAPI in three different #[cfg] arms. No build-time native dependency in your way. OxiSound speaks to the OS audio stack through a single, memory-safe Rust surface, and on every platform the right backend is selected for you.

Why OxiSound

Audio device I/O is the part of the audio stack that has resisted Rust the longest. Codecs and DSP can live happily in pure Rust, but the moment you want to actually hear something, you cross the operating-system boundary — and historically that meant linking against a native library: PortAudio, or the platform sound servers directly through ALSA on Linux, CoreAudio on macOS, and WASAPI on Windows. Each one has its own threading model, its own buffer semantics, its own ideas about sample formats, and its own way of going wrong.

That FFI boundary is exactly where memory-safety guarantees evaporate and where cross-compilation gets painful. OxiSound draws a clean line there. The OS-touching backends are isolated and auto-selected by target, the surface you program against is plain safe Rust, and the rest of your application never has to know which sound server is underneath.

OxiSound is the device I/O half of the COOLJAPAN audio story. Its sibling oxiaudio handles audio processing, DSP, and codecs; OxiSound handles getting the samples to and from the hardware. Keeping those concerns separate keeps both crates small and focused — you pull in only the layer you actually need.

What we built

OxiSound is a small workspace of focused crates, each one a clean layer:

oxisound-core — the foundational traits and types: AudioDevice, OutputStream, InputStream, StreamConfig, SampleFormat, and OxiSoundError. It is no_std-capable, so the core abstractions compile in constrained environments.
oxisound-cpal — the OS-backed implementation. It rides on cpal (itself pure Rust at the OS boundary) and auto-selects ALSA on Linux, CoreAudio on macOS, and WASAPI on Windows via cfg(target_os). There are no per-platform feature flags to set — the correct backend simply comes up.
oxisound-midi — MIDI input and output, backed by CoreMIDI, WinMM, and the ALSA sequencer depending on the platform.
oxisound-smf — a Standard MIDI File parser, writer, and playback iterator, with tempo-map handling so you can step through timed events.
oxisound-osc — Open Sound Control encode/decode plus a UDP transport for talking to synths and other audio tools over the network.
oxisound-session — iOS/macOS audio session management (AVAudioSession), opt-in and stubbed on non-Apple platforms.
oxisound-jack — the dedicated quarantine crate for the native JACK (libjack2) C-FFI, kept entirely out of the default facade.
oxisound — the public facade that ties it together, defaulting to the pure cpal backend.
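
To make the tempo-map idea from oxisound-smf concrete, here is a minimal, std-only sketch of turning an absolute tick position into wall-clock time. It is not the crate's code: `TempoChange` and `tick_to_micros` are hypothetical names, and the sketch only covers files with a metrical (ticks-per-quarter-note) division, not SMPTE timing.

```rust
/// Hypothetical tempo-map entry: from `tick` onward, one quarter note
/// lasts `us_per_quarter` microseconds (the value of an SMF Set Tempo event).
#[derive(Clone, Copy)]
struct TempoChange {
    tick: u64,
    us_per_quarter: u64,
}

/// Convert an absolute tick position to microseconds.
/// `ppq` is the file's ticks-per-quarter-note division; `map` must be sorted by tick.
/// Until the first change, the SMF default of 500_000 us per quarter (120 BPM) applies.
fn tick_to_micros(tick: u64, ppq: u64, map: &[TempoChange]) -> u64 {
    let mut weighted = 0u64; // running sum of (ticks * tempo), divided by ppq at the end
    let mut last_tick = 0u64;
    let mut tempo = 500_000u64;
    for change in map {
        if change.tick >= tick {
            break;
        }
        weighted += (change.tick - last_tick) * tempo;
        last_tick = change.tick;
        tempo = change.us_per_quarter;
    }
    weighted += (tick - last_tick) * tempo;
    weighted / ppq
}

fn main() {
    // 120 BPM for the first two beats, then 240 BPM.
    let map = [
        TempoChange { tick: 0, us_per_quarter: 500_000 },
        TempoChange { tick: 960, us_per_quarter: 250_000 },
    ];
    for tick in [480, 960, 1440] {
        println!("tick {tick} -> {} us", tick_to_micros(tick, 480, &map));
    }
}
```

Dividing once at the end, rather than per segment, keeps rounding error from accumulating across tempo changes.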

The OS-backend crates sit behind COOLJAPAN’s OS-boundary governance exemption: ALSA, CoreAudio, and WASAPI are reached through cpal and selected automatically, with no feature flags exposed for them. Native JACK and PipeWire are opt-in and isolated, so the default build stays clean and self-contained.
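
As a rough illustration of what "selected automatically by target" means, the sketch below picks a backend label at compile time with `cfg!`. It is not OxiSound's or cpal's actual code; `backend_name` and its string labels are hypothetical and only show that no feature flag needs to be involved.

```rust
/// Hypothetical helper: names the audio backend a build would target.
/// The choice is made from the compilation target, not from a feature flag.
fn backend_name() -> &'static str {
    if cfg!(target_os = "linux") {
        "ALSA"
    } else if cfg!(target_os = "macos") {
        "CoreAudio"
    } else if cfg!(target_os = "windows") {
        "WASAPI"
    } else {
        "unsupported"
    }
}

fn main() {
    println!("selected backend: {}", backend_name());
}
```

A real backend crate would more likely use `#[cfg(target_os = "...")]` on whole modules so that only one platform's code is compiled at all; the `cfg!` macro is used here just to keep the example in one function.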

Getting started

Add OxiSound to your project — the default feature set is the pure cpal backend, so you get cross-platform playback and capture out of the box:

cargo add oxisound

[dependencies]
oxisound = "0.2.0"

Open the default output device and play two seconds of silence through it:

// Play 2 seconds of silence through the default output device
let stream = oxisound::open_output(oxisound::StreamConfig::stereo_48k())?;
std::thread::sleep(std::time::Duration::from_secs(2));
stream.stop()?;

Want to actually feed samples in? The blocking write path is just as direct:

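// Ask the default output device for its preferred settings
let device = oxisound::default_output()?;
let config = oxisound::preferred_output_config(&device)?;
let mut stream = oxisound::open_output(config)?;
// `samples` is a buffer of audio samples you have already prepared
stream.write(&samples)?;
stream.stop()?;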
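
The snippet above leaves `samples` undefined. One way to produce such a buffer, sketched with the standard library only, is shown below. The helper `interleaved_sine` is hypothetical, and it assumes the stream wants interleaved f32 frames (L R L R ... for stereo), which the text here does not state.

```rust
use std::f32::consts::TAU;

/// Hypothetical helper: build `seconds` of a sine tone as interleaved f32
/// frames, with every channel carrying the same signal.
fn interleaved_sine(freq_hz: f32, sample_rate: u32, channels: usize, seconds: f32) -> Vec<f32> {
    let frames = (sample_rate as f32 * seconds) as usize;
    let mut samples = Vec::with_capacity(frames * channels);
    for n in 0..frames {
        let t = n as f32 / sample_rate as f32;
        let value = 0.2 * (TAU * freq_hz * t).sin(); // modest amplitude to protect ears
        for _ in 0..channels {
            samples.push(value);
        }
    }
    samples
}

fn main() {
    let samples = interleaved_sine(440.0, 48_000, 2, 0.5);
    println!("{} samples ({} frames)", samples.len(), samples.len() / 2);
}
```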

And if you need the lowest latency, the callback API hands you the buffer to fill in realtime:

// `config` is a stream configuration, e.g. one returned by preferred_output_config
oxisound::play_callback(config, |buf: &mut [f32]| {
    // fill buf in realtime; no allocations
})?;
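
What might go inside such a closure? Below is a minimal sketch of an allocation-free fill routine that could be called from a realtime callback. `SineOsc` is a hypothetical type, not part of OxiSound, and the sketch assumes the buffer holds interleaved frames with a known channel count.

```rust
use std::f32::consts::TAU;

/// Hypothetical oscillator whose phase carries over between callbacks,
/// so consecutive buffers join without a click.
struct SineOsc {
    phase: f32,     // current phase in radians, kept in [0, TAU)
    phase_inc: f32, // radians advanced per frame
}

impl SineOsc {
    fn new(freq_hz: f32, sample_rate: f32) -> Self {
        Self { phase: 0.0, phase_inc: TAU * freq_hz / sample_rate }
    }

    /// Fill an interleaved buffer in place: no allocation, no locking.
    /// `channels` must be non-zero.
    fn fill(&mut self, buf: &mut [f32], channels: usize) {
        for frame in buf.chunks_mut(channels) {
            let value = 0.2 * self.phase.sin();
            for sample in frame.iter_mut() {
                *sample = value;
            }
            self.phase += self.phase_inc;
            if self.phase >= TAU {
                self.phase -= TAU;
            }
        }
    }
}

fn main() {
    let mut osc = SineOsc::new(440.0, 48_000.0);
    let mut buf = [0.0f32; 512]; // stands in for the buffer a callback would receive
    osc.fill(&mut buf, 2);
    osc.fill(&mut buf, 2); // the second call continues where the first stopped
    println!("first frame of second buffer: {:?}", &buf[..2]);
}
```

All state lives in the struct, so a closure that owns a `SineOsc` can call `fill` on every callback without touching the heap.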

What’s inside

Cross-platform playback and capture — output and input streams over ALSA (Linux), CoreAudio (macOS), and WASAPI (Windows), with the backend chosen automatically by target OS.
Device enumeration — enumerate_all_devices, default_output, default_input, and device_by_index to discover what’s available; preferred_output_config to ask a device for sane stream settings.
Three I/O styles — simple blocking write, simultaneous duplex_stream for input and output together, and a zero-allocation play_callback / duplex_callback path for the lowest latency.
Built-in test tones — sine_test_tone, white_noise_test, chirp_test_tone, and silence for verifying a signal path in one line.
Stream health monitoring — monitor_stream returns a guard that reports Healthy / Degraded / Disconnected status on an interval.
MIDI, SMF, and OSC — full MIDI I/O, Standard MIDI File parse/write/play, and OSC encode/decode over UDP, each behind its own feature flag.
no_std core — oxisound-core types and validation work without the standard library.
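
For a feel of what OSC encoding involves, here is a small std-only sketch that builds an OSC 1.0 message with a single float argument. It follows the published OSC 1.0 wire format rather than oxisound-osc's API; `push_osc_string`, `encode_float_message`, and the `/synth/freq` address are hypothetical.

```rust
/// Append an OSC string: the bytes, a NUL terminator, then NUL padding
/// up to the next 4-byte boundary.
fn push_osc_string(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(s.as_bytes());
    out.push(0);
    while out.len() % 4 != 0 {
        out.push(0);
    }
}

/// Encode an OSC 1.0 message carrying one float32 argument:
/// address pattern, type tag string ",f", then the big-endian value.
fn encode_float_message(address: &str, value: f32) -> Vec<u8> {
    let mut out = Vec::new();
    push_osc_string(&mut out, address);
    push_osc_string(&mut out, ",f");
    out.extend_from_slice(&value.to_be_bytes());
    out
}

fn main() {
    let packet = encode_float_message("/synth/freq", 440.0);
    println!("{} bytes: {:02x?}", packet.len(), packet);
}
```

The resulting bytes can be sent as one UDP datagram, for example with `std::net::UdpSocket::send_to`, to any OSC-aware receiver.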

Tips

Leave the backend alone. ALSA, CoreAudio, and WASAPI are selected automatically by cfg(target_os) — there are deliberately no alsa/coreaudio/wasapi features. The default pure feature is all you need for native playback and capture.
Reach for the callback API when latency matters. play_callback and duplex_callback give you the realtime buffer directly; keep those closures allocation-free for the tightest, most predictable timing.
Validate a config before opening a stream. StreamConfig::stereo_48k() plus config.validate(...) (available even in the no_std core) catches bad settings before you touch a device. preferred_output_config is the easy way to start from device-friendly defaults.
Turn on only the protocols you use. midi, smf, and osc are separate feature flags, so a playback-only application stays lean. Add tokio when you want async streams and device hot-plug / auto-reconnect.
Need native JACK? It isn’t in the facade by design. Depend on oxisound-jack directly with its jack-backend feature — it’s the single quarantined C-FFI crate, kept out of the pure default build so the rest of your stack stays FFI-free.
Apple audio sessions live behind the session / macos-session features and are stubbed on non-Apple targets, so the same code compiles everywhere.
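
To show why validating before opening a stream is cheap, here is a sketch of a pure validation check that touches no device and allocates nothing, so it would also work without the standard library. `SketchConfig`, `ConfigError`, and the accepted ranges are hypothetical stand-ins; OxiSound's real `StreamConfig` and its `validate` arguments may look different.

```rust
/// Hypothetical stand-in for a stream configuration; not OxiSound's `StreamConfig`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SketchConfig {
    sample_rate: u32,
    channels: u16,
    buffer_frames: u32,
}

/// Hypothetical error type listing what can be wrong with a config.
#[derive(Debug, PartialEq)]
enum ConfigError {
    ZeroChannels,
    SampleRateOutOfRange(u32),
    ZeroBuffer,
}

impl SketchConfig {
    /// Pure check: no device access, no allocation.
    /// The accepted sample-rate range is an illustrative choice, not a documented limit.
    fn validate(&self) -> Result<(), ConfigError> {
        if self.channels == 0 {
            return Err(ConfigError::ZeroChannels);
        }
        if !(8_000..=384_000).contains(&self.sample_rate) {
            return Err(ConfigError::SampleRateOutOfRange(self.sample_rate));
        }
        if self.buffer_frames == 0 {
            return Err(ConfigError::ZeroBuffer);
        }
        Ok(())
    }
}

fn main() {
    let ok = SketchConfig { sample_rate: 48_000, channels: 2, buffer_frames: 256 };
    let bad = SketchConfig { sample_rate: 0, ..ok };
    println!("{:?} / {:?}", ok.validate(), bad.validate());
}
```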

Part of the COOLJAPAN ecosystem

OxiSound is part of NoFFI — the COOLJAPAN initiative to replace every C/C++/Fortran/-sys FFI dependency in the Rust ecosystem with a clean, memory-safe, 100% Pure Rust implementation. OxiSound retires PortAudio and the raw ALSA / CoreAudio / WASAPI bindings for device I/O; its sibling oxiaudio covers codecs and DSP. Together they let you build a complete audio application — from microphone to processing to speaker — without a single native audio library in the build graph. Default features are 100% Pure Rust at the OS boundary: a self-contained binary, no system audio libraries to install, nothing for a C toolchain to compile.

Repository: https://github.com/cool-japan/oxisound

Star the repo if you believe getting audio in and out of a machine should be safe, portable, and toolchain-free. 🔊

Pure Rust audio device I/O — sovereign, safe, and FFI-free.

— KitaSan at COOLJAPAN OÜ June 22, 2026