Most “language models” never touch a waveform. Kizzasi predicts the continuous world instead.
Today we released Kizzasi 0.1.0 — the first cut of a Rust-native autoregressive predictor for continuous signal streams: audio samples, sensor readings, robotics control loops, and video frames.
No Python runtime. No PyTorch. No CUDA toolkit pinned to a driver version. Kizzasi is written in Rust, compiles to a single static binary (or WASM), and runs the same on a workstation, an edge gateway, or a robot’s onboard controller. The signal stays in-process, the dependency tree stays auditable, and there is no C/C++/Fortran in the default build.
Why 0.1.0 matters
The phrase “Large Language Model” is, frankly, a misnomer. These systems are really general-purpose signal predictors: they take a history and predict the next value. Text tokens are just one kind of signal (discrete vocabulary indices). Audio is a waveform, commonly sampled at 44.1 kHz. Sensor telemetry is a multivariate time series. Video is a high-dimensional spatio-temporal stream. All of them can be served by the same autoregressive idea: predict the next value(s) from history.
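To make "predict the next value from history" concrete, here is a minimal sketch in plain Rust. It uses no Kizzasi APIs: `ArPredictor` and its fixed coefficients are hypothetical stand-ins for a learned model, chosen only to show the feedback loop in which each prediction becomes part of the history for the next one.

```rust
/// Hypothetical stand-in for a learned model: a fixed linear AR(2) predictor,
/// x[t] = a1 * x[t-1] + a2 * x[t-2].
struct ArPredictor {
    a1: f64,
    a2: f64,
}

impl ArPredictor {
    /// Predict the next sample from the two most recent samples.
    fn next(&self, history: &[f64]) -> f64 {
        let n = history.len();
        assert!(n >= 2, "need at least two samples of history");
        self.a1 * history[n - 1] + self.a2 * history[n - 2]
    }

    /// Autoregressive rollout: each prediction is appended to the history
    /// and becomes input for the following step.
    fn rollout(&self, seed: &[f64], steps: usize) -> Vec<f64> {
        let mut history = seed.to_vec();
        for _ in 0..steps {
            let x = self.next(&history);
            history.push(x);
        }
        // Return only the newly predicted samples, not the seed.
        history.split_off(seed.len())
    }
}
```

With `a1 = 2.0` and `a2 = -1.0` the predictor simply extrapolates a straight line, so seeding it with `[0.0, 1.0]` and rolling out three steps continues the ramp. A real model swaps the fixed formula for learned dynamics, but the loop has the same shape.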
Kizzasi (兆し — Japanese for “sign / omen / premonition”) is built around that insight, the AGSP (Autoregressive General-Purpose Signal Predictor) paradigm. And because it targets the physical world, raw next-value prediction is not enough — a predicted joint torque or motor velocity has to respect actual physical and safety limits. So Kizzasi is neuro-symbolic: it pairs the learning capacity of State Space Models with hard constraints, so predictions:
- follow statistical likelihoods learned from data,
- adhere to physical laws (conservation, causality),
- respect safety constraints (bounds, rate limits), and
- satisfy domain-specific logical rules.
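As an illustration of what a hard safety constraint means for a single channel, the sketch below projects a raw prediction onto the set of values that satisfy both a range bound and a per-step rate limit. This is plain Rust written for this post, not the code inside kizzasi-logic; the function name `project` and its parameters are assumptions made for the example.

```rust
/// Project a raw prediction onto the values that satisfy both a range bound
/// [lo, hi] and a per-step rate limit, given the previously emitted value.
fn project(raw: f64, prev: f64, lo: f64, hi: f64, max_delta: f64) -> f64 {
    // Values reachable from `prev` within one step.
    let step_lo = prev - max_delta;
    let step_hi = prev + max_delta;
    // Intersection of the reachable interval with the allowed range.
    let feasible_lo = lo.max(step_lo);
    let feasible_hi = hi.min(step_hi);
    if feasible_lo <= feasible_hi {
        // Nearest feasible point to the raw prediction.
        raw.clamp(feasible_lo, feasible_hi)
    } else if prev < lo {
        // `prev` is below the range and too far away to reach it in one step:
        // move up as fast as the rate limit allows.
        step_hi
    } else {
        // `prev` is above the range: move down as fast as the rate limit allows.
        step_lo
    }
}
```

For one-dimensional interval constraints the projection is just a clamp onto the intersection of the two intervals. The only subtle case is an empty intersection, which this sketch resolves by favoring the rate limit and heading back toward the range.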
For a 0.1.0 this is an honest first release — early, but already broad and structured. The workspace is roughly 25,000 lines of Rust across eight crates, every layer present, no unwrap() in production code, and a property-tested suite underneath.
Technical Deep Dive: the layers
Kizzasi is a workspace of focused crates, with a single facade (kizzasi) re-exporting them through an ergonomic prelude.
- kizzasi-core: the SSM engine. A selective State Space Model implementation with a parallel scan (O(log N) depth), discretization caching, workspace pooling, cache-aligned data structures, and SIMD-optimized embeddings. CUDA/Metal acceleration is feature-gated; the default path is pure CPU.
- kizzasi-model: the architectures. Mamba and Mamba2, RWKV, S4/S4D diagonal state-space models, and a Transformer baseline, all behind one ModelType factory with HuggingFace-compatible weight loading. SSMs give you O(1) per-step inference and unbounded context; the Transformer is there for comparison.
- kizzasi-tokenizer: signal to tokens. VQ-VAE and residual VQ-VAE, μ-law compression, linear/adaptive/deadzone quantizers, and multi-scale temporal tokenization, plus domain tokenizers for music and environmental audio. For floating-point work, a continuous tokenizer keeps signals in their native domain.
- kizzasi-inference: the pipeline. Streaming inference with temperature / top-k / top-p sampling, dynamic batching, and memory-efficient state management.
- kizzasi-logic: constraint enforcement. The symbolic half: linear and nonlinear constraint projection, gradient projection, ADMM, Lagrangian relaxation for soft constraints, and Linear Temporal Logic (LTL) support, usable both as runtime guardrails and as differentiable training losses.
- kizzasi-io: the physical world. MQTT for IoT, real-time audio via CPAL, WebSocket and serial transports, WAV/CSV/HDF5 file I/O, and a DSP toolbox (FFT, filtering, resampling, beamforming/DOA, Hilbert-Huang/EMD, source separation, quality metrics).
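The "parallel scan (O(log N) depth)" claim is easier to see on a toy case. The sketch below is a scalar simplification, not kizzasi-core's implementation: it treats each step of the recurrence h_t = a_t * h_{t-1} + b_t as a pair (a, b), composes pairs with an associative operator, and runs a Hillis-Steele style scan whose rounds could each execute in parallel. The names `Affine`, `compose`, `scan`, `states_via_scan`, and `states_sequential` are hypothetical.

```rust
/// One step of the recurrence h -> a * h + b, represented as the pair (a, b).
type Affine = (f64, f64);

/// Compose two steps: apply `first`, then `second`.
/// This operator is associative, which is what makes a parallel scan possible.
fn compose(first: Affine, second: Affine) -> Affine {
    (first.0 * second.0, second.0 * first.1 + second.1)
}

/// Inclusive scan in ceil(log2(N)) rounds (Hillis-Steele style).
/// Every element within a round is independent, so a round could run in
/// parallel; here the rounds are executed sequentially for clarity.
fn scan(steps: &[Affine]) -> Vec<Affine> {
    let mut cur = steps.to_vec();
    let mut offset = 1;
    while offset < cur.len() {
        let prev = cur.clone();
        for i in offset..cur.len() {
            cur[i] = compose(prev[i - offset], prev[i]);
        }
        offset *= 2;
    }
    cur
}

/// Hidden states h_1..h_N computed from the scanned prefixes and h0.
fn states_via_scan(steps: &[Affine], h0: f64) -> Vec<f64> {
    scan(steps).iter().map(|&(a, b)| a * h0 + b).collect()
}

/// Reference implementation: the plain sequential recurrence.
fn states_sequential(steps: &[Affine], h0: f64) -> Vec<f64> {
    let mut h = h0;
    steps
        .iter()
        .map(|&(a, b)| {
            h = a * h + b;
            h
        })
        .collect()
}
```

Both functions produce the same hidden states. The scan version trades extra total work for a dependency chain of logarithmic length, which pays off when processing a long sequence at once, while the sequential form is the natural fit for O(1) per-step streaming.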
Getting Started
Add the crate:
cargo add kizzasi
A minimal next-step predictor:
use kizzasi::prelude::*;

fn main() -> KizzasiResult<()> {
    // Configure a predictor with a Mamba2 backend.
    let config = KizzasiConfig::new()
        .model_type(ModelType::Mamba2)
        .input_dim(3)
        .output_dim(3)
        .hidden_dim(256)
        .state_dim(16)
        .num_layers(4)
        .context_window(8192);

    let mut predictor = Kizzasi::new(config)?;

    // Single-step prediction: O(1) per step.
    let input = array![0.1, 0.2, 0.3];
    let output = predictor.step(&input)?;
    println!("Predicted: {:?}", output);

    Ok(())
}
And the neuro-symbolic part — predictions that are guaranteed to stay inside your limits:
use kizzasi::prelude::*;

// Wrapped in main so the `?` operators have a Result to return into.
fn main() -> KizzasiResult<()> {
    let mut predictor = Kizzasi::new(
        KizzasiConfig::new()
            .model_type(ModelType::Rwkv)
            .input_dim(3)
            .output_dim(3),
    )?;

    let guardrails = GuardrailSet::new()
        .add(Guardrail::new(
            ConstraintBuilder::new()
                .name("velocity_limit")
                .bound(0, BoundType::Range(-1.0, 1.0)) // clamp channel 0 to [-1, 1]
                .build()?,
        ))
        .add(Guardrail::new(
            ConstraintBuilder::new()
                .name("rate_limit")
                .rate_limit(0.1) // max change per step
                .build()?,
        ));
    predictor.set_guardrails(guardrails);

    // Every prediction now satisfies the constraints by construction.
    let safe_output = predictor.step(&array![0.5, 0.5, 0.5])?;
    println!("Constrained: {:?}", safe_output);

    Ok(())
}
What’s inside
- Core SSM engine (kizzasi-core): selective state-space model with parallel scan, discretization caching, workspace pooling, ILP ops, cache-aligned structures, and SIMD embeddings; feature-gated CUDA/Metal.
- Model zoo (kizzasi-model): Mamba, Mamba2, RWKV, S4/S4D, Transformer, behind a unified factory with HuggingFace-compatible weight loading.
- Tokenizers (kizzasi-tokenizer): VQ-VAE and residual VQ-VAE, μ-law codec, linear/adaptive/deadzone quantizers, multi-scale and domain-specific tokenizers.
- Inference (kizzasi-inference): streaming with temperature/top-k/top-p sampling, dynamic batching, multi-modal input, memory-efficient state.
- Constraints (kizzasi-logic): linear/nonlinear projection, gradient projection, ADMM, Lagrangian relaxation, LTL formulas, sliding-window checkers.
- World I/O (kizzasi-io): MQTT, audio (CPAL), WebSocket, serial, WAV/CSV/HDF5, FFT/filter/resample, beamforming/DOA, EMD/EEMD, PESQ/STOI/POLQA, FastICA/NMF/PCA. Platform-specific ROS2 support on Linux.
- Facade (kizzasi): a prelude and ergonomic API that re-exports every sub-crate.
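For readers who have not met the μ-law codec mentioned in the tokenizer entry, here is a short sketch of the standard companding curve with μ = 255, followed by 8-bit quantization. It is written in plain Rust for illustration and does not use or mirror the kizzasi-tokenizer API; the function names are assumptions.

```rust
const MU: f64 = 255.0;

/// Compress a sample in [-1, 1] with the mu-law curve:
/// y = sign(x) * ln(1 + MU * |x|) / ln(1 + MU).
fn mu_law_encode(x: f64) -> f64 {
    let x = x.clamp(-1.0, 1.0);
    x.signum() * (1.0 + MU * x.abs()).ln() / (1.0 + MU).ln()
}

/// Inverse curve: x = sign(y) * ((1 + MU)^|y| - 1) / MU.
fn mu_law_decode(y: f64) -> f64 {
    y.signum() * ((1.0 + MU).powf(y.abs()) - 1.0) / MU
}

/// Map a sample to one of 256 discrete tokens.
fn to_token(x: f64) -> u8 {
    let y = mu_law_encode(x);
    ((y + 1.0) / 2.0 * 255.0).round() as u8
}

/// Map a token back to an approximate sample value.
fn from_token(token: u8) -> f64 {
    let y = token as f64 / 255.0 * 2.0 - 1.0;
    mu_law_decode(y)
}
```

The logarithmic curve spends more of the 256 levels on quiet samples than on loud ones, so the round-trip error near zero is much smaller than the error near full scale. That is the property that makes it a useful front end for turning a waveform into a small discrete vocabulary.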
Tips
- Pick the architecture for the job. ModelType::Mamba2 is the balanced default; ModelType::Rwkv is the lightweight choice for embedded and audio; ModelType::S4D (HiPPO-initialized) shines on smooth dynamics; ModelType::Transformer is your O(L)-per-step baseline.
- Use presets to skip boilerplate. KizzasiConfig::audio_preset(), ::robotics_preset(), and ::sensor_preset() give you sensible defaults for common modalities, e.g. KizzasiConfig::audio_preset().sample_rate(44100.0).
- Look ahead with predict_n. Beyond single-step step, call predictor.predict_n(&initial, 100) to roll the model forward and get a 100-step trajectory.
- Anomaly detection comes for free. Train on “normal” data, then treat a large prediction error at runtime as the anomaly score: (prediction - actual).mapv(f64::abs).sum().
- Constraints are also losses. The same guardrails you enforce at inference can be folded into training via ConstraintAwareLoss or LagrangianRelaxation in kizzasi-logic, so the model learns to stay in-bounds rather than being clamped after the fact.
- Trim the build with features. Default is ["std", "full"]; for a lean core use default-features = false, features = ["std"] and add io, logic, mqtt, or audio only as needed.
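The anomaly-detection tip boils down to a few lines. The sketch below uses plain slices instead of the array type from the prelude so that it stands alone; the thresholding rule (mean plus k standard deviations of scores seen on normal data) is an assumption added for illustration, not something Kizzasi prescribes.

```rust
/// L1 prediction error between a predicted and an observed sample.
fn anomaly_score(prediction: &[f64], actual: &[f64]) -> f64 {
    assert_eq!(prediction.len(), actual.len(), "dimension mismatch");
    prediction
        .iter()
        .zip(actual)
        .map(|(p, a)| (p - a).abs())
        .sum()
}

/// Hypothetical calibration rule: mean + k * (population) standard deviation
/// of the scores observed on known-normal data.
fn calibrate_threshold(normal_scores: &[f64], k: f64) -> f64 {
    let n = normal_scores.len() as f64;
    let mean = normal_scores.iter().sum::<f64>() / n;
    let var = normal_scores
        .iter()
        .map(|s| (s - mean).powi(2))
        .sum::<f64>()
        / n;
    mean + k * var.sqrt()
}

/// Flag a sample when its score exceeds the calibrated threshold.
fn is_anomalous(score: f64, threshold: f64) -> bool {
    score > threshold
}
```

In practice you would collect scores while replaying held-out normal data through the predictor, calibrate once, and then call `is_anomalous` on each live step. How large `k` should be depends on how costly false alarms are in your setting.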
This is the foundation
Kizzasi 0.1.0 stands on the COOLJAPAN ecosystem. Its array math, signal processing, and FFTs come from SciRS2 (scirs2-core, scirs2-signal, scirs2-fft), and its constraint layer is built on TensorLogic — the neuro-symbolic engine that gives the “logic” in neuro-symbolic real teeth. Serialization runs through Oxicode. Together with siblings like VoiRS and NumRS2, these form a growing Pure-Rust stack for scientific and signal computing.
Repository: https://github.com/cool-japan/kizzasi
This is a 0.1.0 — early, but real, and built to grow. Star the repo if a sovereign, neuro-symbolic signal predictor is something you’d reach for. Pure Rust signal prediction is here — fast, safe, and sovereign.
— KitaSan at COOLJAPAN OÜ January 19, 2026