22 new transformer architectures land in one patch — the catalog grows from 27+ to 49+, all reachable through a single AutoModel::from_pretrained call, all Pure Rust.
Today we released TrustformeRS 0.1.1 — a focused patch that adds 22 new architectures (49+ total) and deepens the Pure Rust supply chain by swapping the ONNX Runtime for oxionnx and tar for oxiarc-archive.
TrustformeRS is the Pure Rust implementation of Hugging Face Transformers: transformer and LLM loading and inference, tokenizers, and Hugging Face model hub access, with no Python and no PyTorch. The 0.1.0 first stable release established that foundation; 0.1.1 builds directly on it, widening model coverage while tightening the boundary against C and C++ runtimes.
No PyTorch. No Python. No ONNX Runtime. No librdkafka by default. The 0.1.0 release already cut PyTorch and Python out of the inference loop; 0.1.1 finishes the job at the edges. The ONNX import/export path no longer links the onnxruntime C++ library — it runs through oxionnx, Pure Rust end to end. Archive handling no longer needs libtar — oxiarc-archive replaces the tar crate. And the librdkafka C dependency that the Kafka backend pulled in is now feature-gated, so the default build drops it entirely. What remains is a single static binary you can compile for native targets or WASM, with nothing for ldd to resolve.
Why 0.1.1 matters
This patch moves on two axes at once: more model coverage and a stricter Pure Rust boundary.
On coverage, the AutoModel router now resolves 49+ architectures — up from 27+ — spanning modern LLMs, state-space and linear-attention models, code models, speech, and diffusion. On the boundary, three C/C++ incumbents are displaced or sidelined: the ONNX path no longer pulls a C++ runtime, archive handling no longer needs libtar, and Kafka’s C dependency is now opt-in rather than default.
Two maturity signals come with it: 88 clippy unused-import warnings eliminated, and version consistency restored across every workspace crate. These are the unglamorous things that make a 0.1.x line trustworthy to build on.
Technical Deep Dive
(a) AutoModel routing resolves 22 new architectures. The same AutoModel::from_pretrained entry point now maps 22 additional architectures onto their implementations, with no new per-model API to learn. Grouped by what they bring:
- Modern LLMs: LLaMA3.2, Qwen2.5, Phi4, Gemma2, Granite, Nemotron, Yi.
- State-space & linear-attention (long-context): Mamba2, S4, Hyena, RetNet, Linformer, Performer, xLSTM.
- Code: StarCoder2.
- Speech: Whisper.
- Diffusion: SD3.
Rounding out the 22: Falcon2, InternLM2, Jamba, Jamba2, and StableLM. Because routing happens behind AutoModel, every one of these is usable through the exact same load-and-forward flow as a BERT checkpoint.
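To make the routing idea concrete, here is a minimal, self-contained sketch of a name-to-family lookup covering the 22 new architectures as grouped above. It is not TrustformeRS's actual router: the `Family` enum, the `route` function, and the lowercase key spellings are all hypothetical and exist only for illustration.

```rust
/// Coarse families, following the grouping used in this post.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Family {
    ModernLlm,
    LongContext, // state-space and linear-attention models
    Code,
    Speech,
    Diffusion,
    Other, // the five architectures that round out the 22
}

/// Hypothetical router: maps an architecture name to its family.
/// The keys are illustrative spellings, not TrustformeRS's real identifiers.
fn route(arch: &str) -> Option<Family> {
    let family = match arch.to_ascii_lowercase().as_str() {
        "llama3.2" | "qwen2.5" | "phi4" | "gemma2" | "granite" | "nemotron" | "yi" => {
            Family::ModernLlm
        }
        "mamba2" | "s4" | "hyena" | "retnet" | "linformer" | "performer" | "xlstm" => {
            Family::LongContext
        }
        "starcoder2" => Family::Code,
        "whisper" => Family::Speech,
        "sd3" => Family::Diffusion,
        "falcon2" | "internlm2" | "jamba" | "jamba2" | "stablelm" => Family::Other,
        _ => return None, // not one of the 22 names added in 0.1.1
    };
    Some(family)
}

fn main() {
    for name in ["Whisper", "Mamba2", "Qwen2.5", "unknown-arch"] {
        match route(name) {
            Some(family) => println!("{name} -> {family:?}"),
            None => println!("{name} -> not one of the 22 new architectures"),
        }
    }
}
```

The point of the sketch is the shape of the design: callers pass a name, and the dispatch to a concrete implementation stays behind one function, so adding an architecture does not change the calling code.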
(b) Pure Rust ONNX via oxionnx. ONNX export and import previously depended on the onnxruntime C++ library. In 0.1.1 that path is served by oxionnx, the Pure Rust ONNX implementation (shipped 2026-03-26). Cross-platform builds no longer carry the C++ runtime, and the export/import behavior stays available without it.
(c) Supply-chain hardening. The tar crate is replaced by oxiarc-archive (COOLJAPAN policy), so archive extraction is Pure Rust. The rdkafka Kafka backend is feature-gated behind --features kafka, removing librdkafka from the default build. SciRS2 dependencies are upgraded to 0.4.2 (scirs2-core and scirs2-linalg), and supporting deps move forward: oxiarc-deflate/oxiarc-lz4 0.2.7, wasm-bindgen 0.2.118, web-sys 0.3.95, lapin 4.5, redis 1.2.
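Feature-gating of this kind is usually expressed with Rust's conditional compilation attributes. The sketch below assumes a Cargo feature named kafka, matching the --features kafka flag above; the `event_backend` function and its return strings are hypothetical and are not taken from the TrustformeRS source.

```rust
/// Compiled only when the crate is built with `--features kafka`.
/// In a real crate, this is where code depending on the optional
/// rdkafka dependency would live.
#[cfg(feature = "kafka")]
fn event_backend() -> &'static str {
    "kafka"
}

/// Compiled in every build that does not enable the `kafka` feature,
/// so the default build never references Kafka code at all.
#[cfg(not(feature = "kafka"))]
fn event_backend() -> &'static str {
    "in-process"
}

fn main() {
    println!("event backend: {}", event_backend());
}
```

Because the gated function is removed before type checking, a default build does not need the C library to be present. Outside a Cargo project that declares the feature, recent compilers may warn about the unrecognized cfg value; the code still compiles.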
(d) Maintainability. Seven oversized source files were split with splitrs (shipped 2026-04-25) to keep every file under the COOLJAPAN 2000-line policy — the same tool used to keep the rest of the ecosystem tidy.
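As a rough illustration of the policy check itself (not of how splitrs works), the following sketch flags a source text as oversized. It assumes the policy means "no more than 2000 lines per file"; the function name and the sample inputs are hypothetical.

```rust
/// Assumed reading of the policy: a file is oversized when it has
/// more than `limit` lines.
fn exceeds_line_limit(source: &str, limit: usize) -> bool {
    source.lines().count() > limit
}

fn main() {
    const LIMIT: usize = 2000;
    // Synthetic inputs standing in for real source files.
    let small = "fn helper() {}\n".repeat(10);
    let large = "// filler\n".repeat(2001);
    println!("small file oversized: {}", exceeds_line_limit(&small, LIMIT));
    println!("large file oversized: {}", exceeds_line_limit(&large, LIMIT));
}
```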
Getting Started
Add the crate:
cargo add trustformers
Load one of the new architectures — here a Whisper checkpoint — through the same AutoModel/AutoTokenizer flow:
use trustformers::{AutoModel, AutoTokenizer};
// 0.1.1 routes new architectures (Whisper / Qwen2.5 / Mamba2 / ...)
// through the same from_pretrained entry point.
let tokenizer = AutoTokenizer::from_pretrained("openai/whisper-base")?;
let model = AutoModel::from_pretrained("openai/whisper-base")?;
let inputs = tokenizer.encode("Hello, Rust world!", None)?;
let outputs = model.forward(&inputs)?;
The API is unchanged from 0.1.0: swap the Hugging Face model id for Qwen/Qwen2.5-7B, a Mamba2 checkpoint, or any of the 49+ supported architectures, and the rest of the code stays the same.
What’s New in 0.1.1
Added
- 49+ transformer architectures, 22 of them new in this release: Falcon2, Gemma2, Granite, Hyena, InternLM2, Jamba, Jamba2, Linformer, LLaMA3.2, Mamba2, Nemotron, Performer, Phi4, Qwen2.5, RetNet, S4, SD3, StableLM, StarCoder2, Whisper, xLSTM, and Yi.
- Natural typing simulator for human-like response delivery.
- Ensemble model types and strategies.
- Resource analysis and monitoring structures.
Changed
- Replaced ONNX Runtime with oxionnx: the ONNX import/export path is now Pure Rust, with no onnxruntime C++ dependency.
- Replaced the tar crate with oxiarc-archive: archive handling is now Pure Rust.
- Feature-gated the rdkafka Kafka backend (--features kafka): the default build drops the librdkafka C dependency.
- Upgraded SciRS2 dependencies to 0.4.2.
- Split 7 oversized files with splitrs (COOLJAPAN 2000-line policy).
- Dependency upgrades: oxiarc-deflate/oxiarc-lz4 0.2.7, scirs2-core/scirs2-linalg 0.4.2, wasm-bindgen 0.2.118, web-sys 0.3.95, lapin 4.5, redis 1.2.
Fixed
- Version consistency across all workspace crates.
- Example crates missing publish = false.
- cargo fmt formatting across 4 files.
- 88 clippy unused-import warnings eliminated.
Tips
- One call, any architecture. Load any of the 49+ models through the same AutoModel::from_pretrained. There is no per-model API, so adding a new checkpoint to your code is a one-line change.
- Leaner default build. Kafka is now opt-in. Only add --features kafka if you actually publish to Kafka; otherwise you ship without the librdkafka C dependency.
- Pure Rust ONNX. The ONNX path runs on oxionnx, so cross-platform and CI builds no longer need the onnxruntime C++ runtime; cargo build is all it takes.
- Reach for long-context models. Try Mamba2, S4, or Hyena when you need sub-quadratic (roughly linear in sequence length) handling instead of quadratic attention, which is useful for very long inputs.
- Speech in the same stack. Whisper support brings speech-to-text into TrustformeRS. Load it exactly like a text model:
    let model = AutoModel::from_pretrained("openai/whisper-small")?;
- Code and diffusion too. StarCoder2 covers code, and SD3 brings a diffusion architecture under the same router.
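The long-context tip above rests on a cost difference that a toy example can show. The sketch below is a scalar linear recurrence in the spirit of state-space models, not Mamba2, S4, or Hyena themselves: it processes a sequence in one pass with a fixed-size state, while full self-attention compares every position with every other. All names and parameter values are hypothetical.

```rust
/// One pass over the sequence with a single scalar state:
///   h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t
/// Work grows linearly with the sequence length.
fn linear_scan(xs: &[f64], a: f64, b: f64, c: f64) -> Vec<f64> {
    let mut h = 0.0;
    xs.iter()
        .map(|&x| {
            h = a * h + b * x; // update the fixed-size state
            c * h // emit the output for this step
        })
        .collect()
}

/// Number of query-key scores full self-attention computes for n tokens.
fn attention_score_count(n: usize) -> usize {
    n * n
}

fn main() {
    let ys = linear_scan(&[1.0, 1.0, 1.0], 0.5, 1.0, 1.0);
    println!("scan outputs: {ys:?}");
    for n in [1_000usize, 100_000] {
        println!(
            "n = {n}: {n} scan steps vs {} attention scores",
            attention_score_count(n)
        );
    }
}
```

At 100,000 tokens the scan takes 100,000 state updates, while the attention score matrix has 10,000,000,000 entries, which is why these architectures are attractive for very long inputs.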
This is the foundation
TrustformeRS 0.1.1 fits the COOLJAPAN ecosystem as of late April 2026: built on SciRS2 0.4.2 plus OxiBLAS, Oxicode, and OxiARC for its numerical and supply-chain layers, with oxionnx now powering the ONNX path. It pairs naturally with OxiCUDA (the Pure-Rust CUDA-toolkit replacement, shipped 2026-04-13) when you want GPU compute, and it sits beside OxiLLaMa, ToRSh, SkleaRS, and TenfloweRS in the COOLJAPAN ML stack. The codebase is kept under the 2000-line policy with SplitRS.
Repository: https://github.com/cool-japan/trustformers
Star the repo if a Pure Rust transformer stack (49+ architectures, no Python, no C++ runtime) is the kind of foundation you want to build on. Sovereign inference, all the way down.
— KitaSan at COOLJAPAN OÜ April 27, 2026