OxiGAF 0.1.1 Released — From a Gaussian-Splatting Core to a Full Pure-Rust Avatar Studio

The Gaussian avatar core just grew an entire studio around it.

Today we released OxiGAF 0.1.1 — a sweeping feature release that turns last winter’s monocular-video Gaussian-avatar reconstructor into a full pure-Rust avatar studio, with rigging, expressions, a production render pipeline, an extended diffusion sampler suite, and a dramatically larger CLI.

No C. No C++. No Fortran. No PyTorch, no CUDA-only kernels you can’t read, no Python you have to ship a virtualenv for. OxiGAF reconstructs an animatable 3D Gaussian avatar from a single casual video and compiles to a single static binary — the GPU paths run through wgpu on Metal, Vulkan, and DirectX alike, and the whole thing stays inside the COOLJAPAN ecosystem (oxiarc-archive instead of zip, ToRSh tensors via the bridge, no openblas).

Why OxiGAF 0.1.1 is a game changer

The 0.1.0 release proved the hard part: a fully differentiable Gaussian-Splatting rasterizer, multi-view diffusion, and FLAME binding that trains end-to-end. But a research core is not a studio. If you wanted to actually use an avatar — rig it, drive it with speech, light it, denoise it, export it to glTF — you were on your own.

0.1.1 closes that gap on every front:

It rigs and animates. A new avatar rig, gaze controller, and head tracker sit on top of a full expression system: FACS action-unit coefficients, expression transfer and clustering, emotion recognition, and phoneme-driven animation. The face is no longer a static reconstruction — it performs.
It renders like a real engine. A complete post-processing stack lands: SSAO, bloom, depth-of-field, motion blur, HDR tone mapping, color grading, TAA, film grain, chromatic aberration, and lens distortion — plus volumetric rendering, a render graph, stereo (side-by-side / top-bottom) output, and BVH/LOD spatial acceleration.
It samples better. The diffusion side gains a real sampler suite — DDPM, adaptive sampling, consistency models, flow matching, guidance rescaling — alongside LoRA and ControlNet adapters, KV-cache and fused attention, and SDEdit-style in-context image editing.
It trains seriously. Curriculum learning, progressive training, few-shot adaptation, MAML meta-learning, and continual learning join gradient surgery, OHEM, EMA, SWA, and mixed precision.
It ships a toolbox. The CLI explodes from six subcommands into a full kit: PLY/glTF/mesh/point-cloud/video/animation export, a scene analyser and model inspector, scene merging and streaming, a live dashboard, and HTML experiment reports.

All of it under the same discipline as 0.1.0: 12,537 tests passing, zero unwrap(), every file under 2,000 lines, and platform-specific GPU dependencies feature-gated so the default build stays 100% Pure Rust.

Technical Deep Dive: the four layers that grew

OxiGAF’s workspace is a clean stack of crates, and 0.1.1 deepened each one.

oxigaf-flame — the head model became a face engine. Beyond the original Linear Blend Skinning and safetensors I/O, it now carries a mesh-processing suite (repair, smoothing, Loop and Catmull-Clark subdivision, morphing), geometry tools (geodesic distance, spectral analysis, statistical shape models, symmetry detection), a UV & texture pipeline (parameterisation, texture baking, face atlas, albedo, SH lighting), and the rig/expression system that drives all of it.
oxigaf-render — from rasterizer to render pipeline. The differentiable 3D Gaussian Splatting rasterizer (still with its full backward pass) is now wrapped in a render graph: post-processing, volumetric ray-marching, scene/image compositing, silhouette extraction, background synthesis, MIP splatting, interactive Gaussian picking, and GPU tooling (profiler, debug readback, render metrics).
oxigaf-diffusion — controllable multi-view generation. The multi-view U-Net with cross-view attention gains the sampler suite, identity/avatar conditioning, classifier guidance, ControlNet and LoRA adapters, latent-space tooling (blending, interpolation, latent walks, DDIM inversion), and CLIP scoring for evaluation.
oxigaf-trainer + oxigaf-cli — the operational layer. The trainer adds advanced loss functions (adaptive, contrastive, multi-resolution), the learning regimes above, and full session management (checkpoint manager, profiler integration, callbacks). The CLI turns that into a usable workflow with export, analysis, visualisation, and reporting. A new unified pipeline module in the oxigaf meta-crate orchestrates training → rendering → export end to end.
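
For readers new to the Linear Blend Skinning that oxigaf-flame builds on, the idea fits in a few lines. The sketch below is a generic, std-only illustration and not the crate's actual code: the BoneTransform type and the skin_vertex function are hypothetical names introduced here, and the real implementation works on full FLAME meshes and tensors rather than single vertices.

```rust
/// Hypothetical rigid bone transform: 3x3 rotation (row-major) plus translation.
/// Illustration only; this is not the oxigaf-flame API.
#[derive(Clone, Copy)]
struct BoneTransform {
    rotation: [[f32; 3]; 3],
    translation: [f32; 3],
}

impl BoneTransform {
    /// Returns R * p + t.
    fn apply(&self, p: [f32; 3]) -> [f32; 3] {
        let mut out = self.translation;
        for r in 0..3 {
            for c in 0..3 {
                out[r] += self.rotation[r][c] * p[c];
            }
        }
        out
    }
}

/// Linear Blend Skinning for one vertex: v' = sum_j w_j * (R_j v + t_j).
/// `weights` holds one weight per bone and is expected to sum to 1.
fn skin_vertex(v: [f32; 3], bones: &[BoneTransform], weights: &[f32]) -> [f32; 3] {
    let mut out = [0.0f32; 3];
    for (bone, &w) in bones.iter().zip(weights) {
        let p = bone.apply(v);
        for k in 0..3 {
            out[k] += w * p[k];
        }
    }
    out
}

fn main() {
    let identity = BoneTransform {
        rotation: [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
        translation: [0.0, 0.0, 0.0],
    };
    // Same rotation, but shifted two units along +Y.
    let lifted = BoneTransform { translation: [0.0, 2.0, 0.0], ..identity };

    // A vertex influenced equally by both bones ends up halfway between them.
    let v = skin_vertex([1.0, 0.0, 0.0], &[identity, lifted], &[0.5, 0.5]);
    println!("{:?}", v); // prints [1.0, 1.0, 0.0]
}
```

Blending transformed positions per vertex is cheap and differentiable, which is one reason this scheme is a common choice for head models that need to train end-to-end.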

Under the hood, 0.1.1 also moved the whole stack forward: nalgebra 0.34→0.35, glam 0.32→0.33, the candle family 0.9→0.10, safetensors 0.7→0.8, wgpu 28→29, and the ToRSh tensor bridge (torsh-core/torsh-tensor/torsh-nn) 0.1.0→0.1.2, with kiddo 5 added for spatial queries.

Getting Started

cargo add oxigaf

A minimal multi-view generation pass with the new sampler defaults:

use oxigaf_diffusion::{MultiViewDiffusionPipeline, DiffusionConfig};
use candle_core::Device;
use std::path::Path;

fn main() -> anyhow::Result<()> {
    // Configure multi-view generation with classifier-free guidance
    let config = DiffusionConfig {
        num_views: 4,
        guidance_scale: 7.5,
        num_inference_steps: 50,
        ..Default::default()
    };

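    // Load the complete pipeline on a candle device: CUDA GPU 0 if available, otherwise the CPU.
    // (On Apple Silicon, candle's Device::new_metal(0) selects Metal instead.)
    let device = Device::cuda_if_available(0)?;
    let pipeline = MultiViewDiffusionPipeline::load(config, Path::new("weights/"), &device)?;

    // Generate consistent views from one frame, conditioned on camera poses.
    // `input_image` and `camera_poses` are assumed to be prepared earlier (loading omitted).
    let output = pipeline.generate(&input_image, &camera_poses)?;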
    Ok(())
}
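
What guidance_scale does in the config above can be shown in isolation. The sketch below assumes the pipeline follows the usual classifier-free guidance rule, which the announcement does not spell out; apply_cfg is a hypothetical helper operating on plain slices, not a function exported by oxigaf_diffusion.

```rust
/// Classifier-free guidance on two noise predictions:
/// eps = eps_uncond + s * (eps_cond - eps_uncond).
/// Hypothetical helper for illustration; not part of the oxigaf_diffusion API.
fn apply_cfg(eps_uncond: &[f32], eps_cond: &[f32], guidance_scale: f32) -> Vec<f32> {
    assert_eq!(
        eps_uncond.len(),
        eps_cond.len(),
        "both predictions must have the same shape"
    );
    eps_uncond
        .iter()
        .zip(eps_cond)
        .map(|(&u, &c)| u + guidance_scale * (c - u))
        .collect()
}

fn main() {
    let uncond = [0.0f32, 1.0];
    let cond = [1.0f32, 1.0];

    // A scale of 1.0 reproduces the conditional prediction; larger values
    // push further along the direction from unconditional to conditional.
    let guided = apply_cfg(&uncond, &cond, 7.5);
    println!("{:?}", guided); // prints [7.5, 1.0]
}
```

Under this rule a scale of 0.0 ignores the conditioning entirely and 1.0 uses it unamplified, so 7.5 is a fairly strong push toward the conditioned result.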

Process a captured video as a FLAME sequence with automatic LRU caching:

use oxigaf_flame::FlameSequence;
use std::path::Path;

fn main() -> anyhow::Result<()> {
    // Load a FLAME parameter sequence from JSON
    let mut sequence = FlameSequence::from_json(Path::new("sequence.json"))?;
    let frame_42 = sequence.get_frame(42)?;         // cached on access
    let interpolated = sequence.interpolate(42.5)?; // sub-frame interpolation
    Ok(())
}
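
To make "sub-frame interpolation" concrete: a request for frame 42.5 can be read as a blend halfway between frames 42 and 43. The sketch below assumes plain linear interpolation over a flat parameter vector per frame. That is an assumption for illustration only; the crate may well treat rotations differently (for example with spherical interpolation), and interpolate_frames is a hypothetical name.

```rust
/// Linearly interpolates between the two stored frames surrounding `t`.
/// Each frame is a flat parameter vector (hypothetical layout); frames are
/// expected to have equal length. Returns None if `t` is outside the
/// range [0, frames.len() - 1] or is not a finite number.
fn interpolate_frames(frames: &[Vec<f32>], t: f32) -> Option<Vec<f32>> {
    if frames.is_empty() || !t.is_finite() || t < 0.0 {
        return None;
    }
    let last = frames.len() - 1;
    if t > last as f32 {
        return None;
    }
    let lo = t.floor() as usize;
    let hi = (lo + 1).min(last); // at the final frame, blend it with itself
    let alpha = t - lo as f32;   // fractional position between lo and hi
    Some(
        frames[lo]
            .iter()
            .zip(&frames[hi])
            .map(|(&a, &b)| a + alpha * (b - a))
            .collect(),
    )
}

fn main() {
    let frames = vec![vec![0.0f32, 0.0], vec![2.0, 4.0]];
    // Halfway between frame 0 and frame 1.
    println!("{:?}", interpolate_frames(&frames, 0.5)); // prints Some([1.0, 2.0])
}
```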

Or drive the whole thing from the CLI:

# Explore the expanded toolset
cargo run -p oxigaf-cli -- --help

# Train, then export the result to glTF
oxigaf train  --config experiment.toml
oxigaf export --format gltf --input checkpoint.safetensors --output avatar.gltf

What’s New in 0.1.1

Rigging & expressions — AvatarRig, GazeController, HeadTracker; FACS AU coefficients, expression transfer/clustering, emotion recognition, and phoneme-driven animation.
Mesh & geometry suite — repair, smoothing, Loop/Catmull-Clark subdivision, morphing, geodesic distance, spectral analysis, statistical shape models, symmetry detection.
UV & texture pipeline — UV parameterisation, texture baking, face atlas, albedo maps, SH lighting.
Render pipeline — SSAO, bloom, DoF, motion blur, HDR tone mapping, color grading, TAA, film grain, chromatic aberration, lens distortion; volumetric rendering, render graph, stereo output, BVH/LOD acceleration.
Diffusion suite — DDPM/adaptive/consistency/flow-matching samplers, guidance rescaling, LoRA & ControlNet adapters, KV cache, fused attention, SDEdit-style editing, DDIM inversion, CLIP scoring.
Training regimes — curriculum learning, progressive training, few-shot adaptation, MAML meta-learning, continual learning; gradient surgery, OHEM, EMA, SWA, mixed precision.
CLI toolset — PLY/glTF/mesh/point-cloud/video/animation export; scene analyser, model inspector, diff tool, quality checker; scene merging/streaming, live dashboard, HTML experiment reports.
Unified pipeline — new pipeline module plus end_to_end_pipeline, checkpoint_lifecycle, and custom_loss examples.
Dependency refresh — nalgebra 0.35, glam 0.33, candle 0.10, safetensors 0.8, wgpu 29, ToRSh bridge 0.1.2, oxiarc-archive 0.3.3, plus kiddo 5.
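
Of the training regimes listed above, EMA is the simplest to show in a few lines. The sketch below is a generic illustration of an exponential moving average over model weights, assuming flat f32 slices; ema_update is a hypothetical helper and not the oxigaf-trainer API.

```rust
/// One EMA step over a flat weight buffer:
/// shadow = decay * shadow + (1 - decay) * weight.
/// Hypothetical helper for illustration; not the oxigaf-trainer API.
fn ema_update(shadow: &mut [f32], weights: &[f32], decay: f32) {
    for (s, &w) in shadow.iter_mut().zip(weights) {
        *s = decay * *s + (1.0 - decay) * w;
    }
}

fn main() {
    let mut shadow = vec![0.0f32, 4.0];
    // With decay = 0.5 the shadow moves halfway toward the current weights.
    ema_update(&mut shadow, &[2.0, 0.0], 0.5);
    println!("{:?}", shadow); // prints [1.0, 2.0]
}
```

In practice the decay sits much closer to 1.0, so the shadow weights change slowly and smooth out step-to-step noise in the trained parameters.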

Tips

cargo add oxigaf gives you everything. The meta-crate re-exports all sub-crates and ships a prelude; reach for oxigaf_diffusion, oxigaf_render, etc. directly only when you want a thinner dependency.
Use the new pipeline module instead of wiring stages by hand. It orchestrates training → rendering → export; the end_to_end_pipeline example is the fastest way to see the full flow.
Keep --all-features off on macOS. That flag also turns on the cuda feature, which is supported only on Linux and Windows. On Apple Silicon, build with --features "metal,simd,parallel,flash_attention"; on a CUDA machine, swap metal for cuda.
Drive expressions with phoneme-based animation for lip-sync from audio, or feed FACS AU coefficients directly when you already have action-unit tracks.
Reach for the new samplers for speed. Consistency models and flow matching need far fewer inference steps than plain DDPM, which makes them a good fit for quick previews.
Adapt to a new identity without full retraining via the few-shot / LoRA adapter paths, then fine-tune with curriculum learning for quality.
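
The intuition behind the sampler tip, that flow matching gets by with few steps, can be seen in a toy setting. The sketch below integrates a hand-written velocity field with Euler steps. Everything in it is an assumption for illustration: the field points along a straight line toward a single known target, whereas a real sampler evaluates a trained network whose paths are only approximately straight. The names velocity and sample are hypothetical.

```rust
/// Toy velocity field for a straight-line path toward a single `target`:
/// u(x, t) = (target - x) / (1 - t). A trained model would replace this.
fn velocity(x: f32, t: f32, target: f32) -> f32 {
    (target - x) / (1.0 - t)
}

/// Euler integration of dx/dt = u(x, t) from t = 0 to t = 1 in `steps` steps.
fn sample(x0: f32, target: f32, steps: usize) -> f32 {
    assert!(steps > 0, "need at least one integration step");
    let dt = 1.0 / steps as f32;
    let mut x = x0;
    for k in 0..steps {
        let t = k as f32 * dt; // always < 1, so the division above is safe
        x += dt * velocity(x, t, target);
    }
    x
}

fn main() {
    // Because the path is a straight line, the velocity is constant along it
    // and one Euler step already lands on the target.
    println!("1 step:  {}", sample(0.0, 8.0, 1));
    println!("4 steps: {}", sample(0.0, 8.0, 4));
}
```

The straighter the learned paths, the less accuracy is lost by taking large steps, which is why such samplers suit fast previews.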

This is the foundation

OxiGAF 0.1.1 slots into a COOLJAPAN ecosystem that has grown up around it. It pairs naturally with oxihuman (parametric body generation) for a full body-plus-photoreal-face digital human, pipes reconstructed avatars through oximedia for real-time video, leans on ToRSh for tensor interop via oxigaf-bridge, and stays Pure Rust on the same foundations as SciRS2, OxiBLAS, and OxiFFT.

Repository: https://github.com/cool-japan/oxigaf

Star the repo if you want digital humans that are fast, safe, and truly sovereign.

Pure Rust Gaussian avatars are here — and now they come with a studio.

— KitaSan at COOLJAPAN OÜ June 19, 2026