OxiMedia 0.1.5 Released — Sovereign ML Pipelines on Pure-Rust OxiONNX

Pure-Rust FFmpeg+OpenCV replacement: OxiMedia 0.1.5 adds the oximedia-ml crate — typed ML pipelines (SceneClassifier, ShotBoundaryDetector, and more) on the Pure-Rust OxiONNX runtime, with a Python oximedia.ml submodule and an opt-in, symbol-free-by-default design. Plus a codec-decoder honesty pass. 108 crates, ~2.68M SLoC, 81,383 tests.

Media decoding, computer vision, and now machine-learning inference — all in one pure-Rust framework, with the ML layer entirely opt-in and zero ONNX symbols in a default build.

Today we released OxiMedia 0.1.5, which brings sovereign ML pipelines to OxiMedia: a typed, Pure-Rust inference stack layered on the OxiONNX runtime, so you can classify scenes and detect shot boundaries without ever linking a C++ ML runtime.

No C. No C++. No FFmpeg binaries. No OpenCV. No ONNX Runtime C++. ML inference in OxiMedia is pure-Rust, powered by OxiONNX, and it is off by default — a default oximedia build links exactly zero ONNX symbols and stays C/Fortran-free. When you do opt in, CPU inference is fully pure-Rust; GPU backends are additive feature gates. The result still compiles to a single static binary (or to WASM), with zero unsafe in the ML layer.

Why OxiMedia 0.1.5 is a game changer

Bolting machine learning onto a media pipeline has historically meant dragging in ONNX Runtime — a heavy C++ dependency with its own toolchain, its own build headaches, and its own supply-chain surface. The moment you wanted to classify a frame or detect a cut, your “pure” project stopped being pure.

OxiMedia 0.1.5 removes that compromise. The new oximedia-ml crate is a typed, Pure-Rust ML layer built on the OxiONNX runtime (oxionnx 0.1.2). It classifies scenes and detects shot boundaries with no C++ runtime anywhere in the build. Because every piece is gated behind feature flags, the default build stays lean and C/Fortran-free — you only pay for inference when you ask for it. CPU inference is fully pure-Rust via OxiONNX, and GPU backends (CUDA, WebGPU, DirectML) are purely additive.

This release also ships a credibility win that has nothing to do with new features: a codec decoder honesty pass. Several decoders (AV1, VP9, VP8, Theora, Vorbis, AVIF) previously carried “Stable”/“Complete” labels even though they parse bitstreams without yet fully reconstructing pixel or sample data end-to-end. They are now accurately relabelled Bitstream-parsing. There is no source behaviour change — the decoders parse exactly as before — only honest documentation, backed by a new per-decoder status report in docs/codec_status.md.

Technical Deep Dive

1. The oximedia-ml core. At the heart of the new crate sits a small set of typed building blocks: OnnxModel (a thin wrapper over an OxiONNX Session), ModelCache (a concurrent Arc<Mutex<_>> map with optional LRU capacity), and the TypedPipeline trait (with Input/Output associated types and a process() method). Device selection runs through DeviceType, whose DeviceType::auto() runtime probe spans Cpu, Cuda, WebGpu, DirectMl, and CoreMl. An ImagePreprocessor handles ImageNet mean/std normalization, NCHW/NHWC layouts, and letterbox/resize-to-fit, while postprocess helpers cover softmax, sigmoid, argmax, and top_k. A ModelZoo registry scaffold rounds it out.
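To make the preprocessing step concrete, here is a minimal std-only sketch of ImageNet mean/std normalization into an NCHW layout. It is not the ImagePreprocessor implementation: the function name normalize_to_nchw is hypothetical, and the mean/std constants are the commonly used ImageNet RGB statistics, assumed here rather than taken from the crate.

```rust
/// Commonly used ImageNet RGB channel statistics (an assumption of this sketch).
const MEAN: [f32; 3] = [0.485, 0.456, 0.406];
const STD: [f32; 3] = [0.229, 0.224, 0.225];

/// Convert an interleaved (H, W, 3) `u8` RGB buffer into a planar NCHW `f32`
/// tensor with batch size 1, scaling to [0, 1] and applying mean/std
/// normalization. Returns `None` if the buffer length is not `height * width * 3`.
/// Hypothetical helper, not part of oximedia-ml.
fn normalize_to_nchw(rgb: &[u8], height: usize, width: usize) -> Option<Vec<f32>> {
    let plane = height * width;
    if rgb.len() != plane * 3 {
        return None;
    }
    let mut out = vec![0.0f32; plane * 3];
    for (pixel_idx, pixel) in rgb.chunks_exact(3).enumerate() {
        for c in 0..3 {
            let scaled = pixel[c] as f32 / 255.0;
            // Channel `c` occupies its own contiguous plane in NCHW order.
            out[c * plane + pixel_idx] = (scaled - MEAN[c]) / STD[c];
        }
    }
    Some(out)
}

fn main() {
    // A 2x2 image of pure white pixels.
    let rgb = vec![255u8; 2 * 2 * 3];
    let tensor = normalize_to_nchw(&rgb, 2, 2).expect("buffer size matches");
    // Layout is [R plane, G plane, B plane], four values each.
    println!("len = {}, first R value = {:.3}", tensor.len(), tensor[0]);
}
```

An NHWC variant would keep the interleaved order and only apply the per-channel arithmetic, which is why the layout is usually a preprocessor option rather than a separate code path.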

2. The shipped pipelines. SceneClassifier is a Places365/ImageNet-style typed pipeline on OxiONNX: ImageNet-normalized 224x224 NCHW input, a configurable top_k, and softmax → top-K postprocessing, with from_model, from_path, and with_top_k constructors. ShotBoundaryDetector is TransNetV2-compatible: a rolling window of 48x27 NCHW frames feeds the model, which emits a many-hot output marking hard and soft cuts, and the detector returns a Vec<ShotBoundary { frame_index, confidence, kind: Hard | SoftCut }> with configurable window length and threshold. Behind the all-pipelines facade you also get AestheticScorer (NIMA, 224x224 → AestheticScore), ObjectDetector (YOLOv8, 640x640 → Vec<Detection> with NMS, 80 COCO classes), and FaceEmbedder (ArcFace, 112x112 face → 512-dim FaceEmbedding).
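The softmax → top-K step that SceneClassifier applies to raw model outputs can be sketched in plain Rust as follows. This is an illustration of the technique only: the free functions softmax and top_k below are hypothetical stand-ins written for this example and may differ from the crate's postprocess helpers in name, signature, and return type.

```rust
/// Numerically stable softmax: subtract the largest logit before
/// exponentiating so that `exp` cannot overflow.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Return up to `k` `(class_index, score)` pairs, highest score first.
fn top_k(scores: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = scores.iter().copied().enumerate().collect();
    // `total_cmp` gives a total order on f32, so the sort never panics on NaN.
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1));
    indexed.truncate(k);
    indexed
}

fn main() {
    // Three made-up class logits; class 1 has the largest value.
    let logits = [1.0f32, 3.0, 2.0];
    let probabilities = softmax(&logits);
    for (class_index, score) in top_k(&probabilities, 2) {
        println!("class {} -> {:.3}", class_index, score);
    }
}
```

Sorting the full score vector is the simplest approach; for very large class counts a partial selection would avoid the full sort, at the cost of a little more code.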

3. Facade feature gating. The oximedia facade exposes a new oximedia::ml module re-exporting oximedia-ml behind features = ["ml"], with sub-features ml-scene-classifier, ml-shot-boundary, and ml-onnx for selective inclusion. The oximedia-ml crate itself gates on onnx, cuda, webgpu, directml, scene-classifier, shot-boundary, and all-pipelines. The default build remains symbol-free; the full feature now also picks up ml + ml-scene-classifier + ml-shot-boundary.
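For readers less familiar with Cargo feature gates, the mechanism behind a symbol-free default build looks roughly like the sketch below. It shows a hypothetical facade crate, not the actual oximedia source: the module body and the ml_enabled helper are invented for illustration, and only the feature name ml comes from the release.

```rust
// lib.rs of a hypothetical facade crate (illustrative only).

/// Compiled only when the `ml` feature is enabled. In a default build this
/// module, and everything it would re-export, does not exist in the binary.
#[cfg(feature = "ml")]
pub mod ml {
    /// Stand-in for the re-exported pipeline types.
    pub fn describe() -> &'static str {
        "ml layer compiled in"
    }
}

/// Reports whether the `ml` feature was enabled at compile time.
pub fn ml_enabled() -> bool {
    cfg!(feature = "ml")
}

fn main() {
    // This statement is removed entirely when the feature is off.
    #[cfg(feature = "ml")]
    println!("{}", ml::describe());

    println!("ml feature enabled: {}", ml_enabled());
}
```

Because the gate is resolved at compile time, code that refers to the gated module fails to build without the feature instead of failing at runtime, which is what keeps the default dependency graph free of the inference stack.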

4. Python and decoder transparency. A new oximedia.ml PyO3 submodule (gated on oximedia-py/ml) exposes the full typed pipeline stack to Python with numpy I/O — (H,W,3) uint8 for image pipelines and (N,H,W,3) uint8 for the shot-boundary window — including MlDeviceType, OnnxModel, MlModelZoo, SceneClassifier, ShotBoundaryDetector, AestheticScorer, ObjectDetector, and FaceEmbedder. Separately, decoders now sort into a four-tier taxonomy — Verified / Functional / Bitstream-parsing / Experimental — documented in the top-level README, oximedia-codec/README.md, and docs/codec_status.md.
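One way to picture the four-tier taxonomy is as a small enum. The sketch below is hypothetical and not taken from oximedia-codec: the type and function names are invented, and the lookup covers only the six decoders this post says were relabelled to Bitstream-parsing.

```rust
/// The four documentation tiers named in this release (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DecoderStatus {
    Verified,
    Functional,
    BitstreamParsing,
    Experimental,
}

impl DecoderStatus {
    /// Human-readable label as it appears in the documentation tiers.
    fn label(self) -> &'static str {
        match self {
            DecoderStatus::Verified => "Verified",
            DecoderStatus::Functional => "Functional",
            DecoderStatus::BitstreamParsing => "Bitstream-parsing",
            DecoderStatus::Experimental => "Experimental",
        }
    }
}

/// Status of the decoders relabelled in 0.1.5. Any other name returns `None`
/// because this sketch does not know its tier; docs/codec_status.md is the
/// authoritative source.
fn relabelled_status(codec: &str) -> Option<DecoderStatus> {
    match codec {
        "AV1" | "VP9" | "VP8" | "Theora" | "Vorbis" | "AVIF" => {
            Some(DecoderStatus::BitstreamParsing)
        }
        _ => None,
    }
}

fn main() {
    let tiers = [
        DecoderStatus::Verified,
        DecoderStatus::Functional,
        DecoderStatus::BitstreamParsing,
        DecoderStatus::Experimental,
    ];
    for tier in tiers {
        println!("tier: {}", tier.label());
    }
    if let Some(status) = relabelled_status("AV1") {
        println!("AV1 -> {}", status.label());
    }
}
```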

Getting Started

Base install:

cargo add oximedia

Enable the ML layer with the features you need:

[dependencies]
oximedia = { version = "0.1.5", features = ["ml", "ml-scene-classifier", "ml-onnx"] }

Then run a typed scene-classification pipeline on OxiONNX:

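use oximedia::ml::pipelines::{SceneClassifier, SceneImage};
use oximedia::ml::{DeviceType, TypedPipeline};

// Load a Places365-style ONNX model; DeviceType::auto() picks a backend at runtime.
let classifier = SceneClassifier::from_path("places365.onnx", DeviceType::auto())?;
// rgb_bytes: tightly packed RGB pixels for a 224x224 frame, supplied by the caller.
let image = SceneImage::new(rgb_bytes, 224, 224)?;
// process() comes from the TypedPipeline trait, so the trait must be in scope.
for pred in classifier.process(image)? {
    println!("class {} -> {:.3}", pred.class_index, pred.score);
}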

DeviceType::auto() probes for the best available backend at runtime, so the same code path runs on CPU (fully pure-Rust) or on a GPU backend you opted into.
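The general shape of such a runtime probe is a preference list with a CPU fallback. The sketch below is an assumption-laden illustration, not the DeviceType::auto() implementation: the Device enum, the auto_select function, the preference order, and the availability closure are all hypothetical.

```rust
/// Hypothetical mirror of the device variants named in this post.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Device {
    Cpu,
    Cuda,
    WebGpu,
    DirectMl,
    CoreMl,
}

/// Return the first device in `preference` that the probe reports as usable,
/// falling back to the CPU when none is available.
fn auto_select(preference: &[Device], is_available: impl Fn(Device) -> bool) -> Device {
    preference
        .iter()
        .copied()
        .find(|&device| is_available(device))
        .unwrap_or(Device::Cpu)
}

fn main() {
    // The order below is an assumption made for this example.
    let preference = [Device::Cuda, Device::CoreMl, Device::DirectMl, Device::WebGpu];

    // Pretend only WebGPU is present on this machine.
    let chosen = auto_select(&preference, |device| device == Device::WebGpu);
    println!("selected: {:?}", chosen);

    // With no accelerator available, the CPU path is used.
    let fallback = auto_select(&preference, |_| false);
    println!("fallback: {:?}", fallback);
}
```

Passing the probe in as a closure keeps the selection logic testable without any GPU present, which matters for a default build that links no accelerator backends.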

What’s New in 0.1.5

This release is 108 crates, roughly 2,677,000 SLoC of Rust, with 81,383 tests passing (0 failures, 245 skipped) via cargo nextest run --workspace --all-features, zero clippy warnings, Apache-2.0, MSRV Rust 1.85+. All 108 crates are Stable.

Part of the COOLJAPAN ecosystem

OxiMedia 0.1.5 stands on Pure-Rust COOLJAPAN foundations: OxiONNX (oxionnx) for sovereign ONNX inference, SciRS2 (scirs2-core) for tensor and signal math, OxiFFT (oxifft) for transforms, and OxiARC (oxiarc-archive) for compression. Every one of these is a real dependency — no C, no C++, no Fortran in the default build.

Repository: https://github.com/cool-japan/oximedia

Star the repo if a single pure-Rust framework for media, computer vision, and ML — with inference you can switch off entirely — is something you want to see thrive.

Pure Rust media, computer vision, and ML is here — fast, safe, and sovereign.

KitaSan at COOLJAPAN OÜ April 21, 2026
