TrustformeRS 0.1.1 Released — 49+ Transformer Architectures, Pure Rust ONNX, No Python

22 new transformer architectures land in one patch — the catalog grows from 27+ to 49+, all reachable through a single AutoModel::from_pretrained call, all Pure Rust.

Today we released TrustformeRS 0.1.1 — a focused patch that adds 22 new architectures (49+ total) and deepens the Pure Rust supply chain by swapping the ONNX Runtime for oxionnx and tar for oxiarc-archive.

TrustformeRS is a Pure Rust implementation of Hugging Face Transformers: transformer and LLM loading and inference, tokenizers, and Hugging Face Hub access, with no Python and no PyTorch. The first stable release, 0.1.0, established that foundation; 0.1.1 builds directly on it, widening model coverage while tightening the boundary against C and C++ runtimes.

No PyTorch. No Python. No ONNX Runtime. No librdkafka by default. The 0.1.0 release already cut PyTorch and Python out of the inference loop; 0.1.1 finishes the job at the edges. The ONNX import/export path no longer links the onnxruntime C++ library; it runs through oxionnx, Pure Rust end to end. Archive handling moves from the tar crate to oxiarc-archive, in line with COOLJAPAN policy. And the librdkafka C dependency that the Kafka backend pulled in is now feature-gated, so the default build drops it entirely. What remains is a single static binary you can compile for native targets or WASM, with nothing for ldd to resolve.

Why 0.1.1 matters

This patch moves on two axes at once: more model coverage and a stricter Pure Rust boundary.

On coverage, the AutoModel router now resolves 49+ architectures, up from 27+, spanning modern LLMs, state-space and linear-attention models, code models, speech, and diffusion. On the boundary, three dependencies are displaced or sidelined: the ONNX path no longer pulls a C++ runtime, archive handling moves from the tar crate to oxiarc-archive, and Kafka’s C dependency is now opt-in rather than default.

Two maturity signals come with it: 88 clippy unused-import warnings eliminated, and version consistency restored across every workspace crate. These are the unglamorous things that make a 0.1.x line trustworthy to build on.

Technical Deep Dive

(a) AutoModel routing resolves 22 new architectures. The same AutoModel::from_pretrained entry point now maps 22 additional architectures onto their implementations, with no new per-model API to learn. Grouped by what they bring:

Modern LLMs: LLaMA3.2, Qwen2.5, Phi4, Gemma2, Granite, Nemotron, Yi.
State-space & linear-attention (long-context): Mamba2, S4, Hyena, RetNet, Linformer, Performer, xLSTM.
Code: StarCoder2.
Speech: Whisper.
Diffusion: SD3.

Rounding out the 22: Falcon2, InternLM2, Jamba, Jamba2, and StableLM. Because routing happens behind AutoModel, every one of these is usable through the exact same load-and-forward flow as a BERT checkpoint.
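
To make the routing idea concrete, here is a minimal sketch of a name-to-family lookup that mirrors the grouping above. It is an illustration only: the Family enum and the family_of function are hypothetical names, not TrustformeRS's internal router, and the table covers just the 22 names listed in this post, so older architectures such as BERT return None here.

```rust
/// Coarse families, following the grouping used in this announcement.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Family {
    Llm,
    LongContext,
    Code,
    Speech,
    Diffusion,
    /// Listed in the announcement without an explicit group.
    Other,
}

/// Hypothetical lookup from an architecture name to its family.
/// Matching is case-insensitive; unknown names yield `None`.
///
/// Example: `family_of("Whisper")` returns `Some(Family::Speech)`.
pub fn family_of(architecture: &str) -> Option<Family> {
    match architecture.to_ascii_lowercase().as_str() {
        "llama3.2" | "qwen2.5" | "phi4" | "gemma2" | "granite" | "nemotron" | "yi" => {
            Some(Family::Llm)
        }
        "mamba2" | "s4" | "hyena" | "retnet" | "linformer" | "performer" | "xlstm" => {
            Some(Family::LongContext)
        }
        "starcoder2" => Some(Family::Code),
        "whisper" => Some(Family::Speech),
        "sd3" => Some(Family::Diffusion),
        "falcon2" | "internlm2" | "jamba" | "jamba2" | "stablelm" => Some(Family::Other),
        _ => None,
    }
}
```

A single match keeps the caller-facing surface to one function, which is the property the AutoModel entry point is described as having. A real router would presumably key on checkpoint configuration rather than display names.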

(b) Pure Rust ONNX via oxionnx. ONNX export and import previously depended on the onnxruntime C++ library. In 0.1.1 that path is served by oxionnx, the Pure Rust ONNX implementation (shipped 2026-03-26). Cross-platform builds no longer carry the C++ runtime, and the export/import behavior stays available without it.

(c) Supply-chain hardening. The tar crate is replaced by oxiarc-archive (COOLJAPAN policy), so archive extraction is Pure Rust. The rdkafka Kafka backend is feature-gated behind --features kafka, removing librdkafka from the default build. SciRS2 dependencies are upgraded to 0.4.2 (scirs2-core and scirs2-linalg), and supporting deps move forward: oxiarc-deflate/oxiarc-lz4 0.2.7, wasm-bindgen 0.2.118, web-sys 0.3.95, lapin 4.5, redis 1.2.
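
As a rough sketch of what feature-gating means on the Rust side, the snippet below compiles one of two function bodies depending on whether a Cargo feature named kafka is enabled. Only the feature name comes from this release; kafka_backend, compiled_backends, and the "in-memory" backend are hypothetical names for illustration. How TrustformeRS wires the feature internally is not shown here.

```rust
// Compiled only when built with `--features kafka`. A real backend would
// import its Kafka client inside this cfg-gated code, so default builds
// never reference that dependency.
#[cfg(feature = "kafka")]
pub fn kafka_backend() -> Option<&'static str> {
    Some("kafka")
}

// Default build: the Kafka path is compiled out and callers see `None`.
#[cfg(not(feature = "kafka"))]
pub fn kafka_backend() -> Option<&'static str> {
    None
}

/// Lists the (hypothetical) messaging backends present in this build.
pub fn compiled_backends() -> Vec<&'static str> {
    let mut backends = vec!["in-memory"];
    // `Option` is iterable, so this appends "kafka" only when it is `Some`.
    backends.extend(kafka_backend());
    backends
}
```

In a Cargo project the feature would also need to be declared in Cargo.toml, typically by marking the Kafka client as an optional dependency. Without that declaration, recent toolchains may warn about an unexpected cfg value.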

(d) Maintainability. Seven oversized source files were split with splitrs (shipped 2026-04-25) to keep every file under the COOLJAPAN 2000-line policy — the same tool used to keep the rest of the ecosystem tidy.

Getting Started

Add the crate:

```
cargo add trustformers
```

Load one of the new architectures — here a Whisper checkpoint — through the same AutoModel/AutoTokenizer flow:

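```
use trustformers::{AutoModel, AutoTokenizer};

// Wrapped in main so the `?` operator has a Result to return into.
// Assumes the crate's error types convert into Box<dyn Error>.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 0.1.1 routes new architectures (Whisper / Qwen2.5 / Mamba2 / ...)
    // through the same from_pretrained entry point.
    let tokenizer = AutoTokenizer::from_pretrained("openai/whisper-base")?;
    let model = AutoModel::from_pretrained("openai/whisper-base")?;

    let inputs = tokenizer.encode("Hello, Rust world!", None)?;
    let outputs = model.forward(&inputs)?;
    Ok(())
}
```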

The API is unchanged from 0.1.0: swap the Hugging Face model ID for Qwen/Qwen2.5-7B, a Mamba2 checkpoint, or any of the 49+ supported architectures, and the rest of the code stays the same.

What’s New in 0.1.1

Added

49+ transformer architectures, 22 of them new in this release: Falcon2, Gemma2, Granite, Hyena, InternLM2, Jamba, Jamba2, Linformer, LLaMA3.2, Mamba2, Nemotron, Performer, Phi4, Qwen2.5, RetNet, S4, SD3, StableLM, StarCoder2, Whisper, xLSTM, and Yi.
Natural typing simulator for human-like response delivery.
Ensemble model types and strategies.
Resource analysis and monitoring structures.

Changed

Replaced ONNX Runtime with oxionnx — the ONNX import/export path is now Pure Rust, with no onnxruntime C++ dependency.
Replaced the tar crate with oxiarc-archive for archive handling (COOLJAPAN policy).
Feature-gated the rdkafka Kafka backend (--features kafka) — the default build drops the librdkafka C dependency.
Upgraded SciRS2 dependencies to 0.4.2.
Split 7 oversized files with splitrs (COOLJAPAN 2000-line policy).
Dependency upgrades: oxiarc-deflate/oxiarc-lz4 0.2.7, scirs2-core/scirs2-linalg 0.4.2, wasm-bindgen 0.2.118, web-sys 0.3.95, lapin 4.5, redis 1.2.

Fixed

Restored version consistency across all workspace crates.
Added the missing publish = false to example crates.
Applied cargo fmt formatting fixes across 4 files.
88 clippy unused-import warnings eliminated.

Tips

One call, any architecture. Load any of the 49+ models through the same AutoModel::from_pretrained — there is no per-model API, so adding a new checkpoint to your code is a one-line change.
Leaner default build. Kafka is now opt-in. Only add --features kafka if you actually publish to Kafka; otherwise you ship without the librdkafka C dependency.
Pure Rust ONNX. The ONNX path runs on oxionnx, so cross-platform and CI builds no longer need the onnxruntime C++ runtime — cargo build is all it takes.
Reach for long-context models. Try Mamba2, S4, or Hyena when you need linear or near-linear scaling in sequence length instead of quadratic attention, which matters most for very long inputs.
Speech in the same stack. Whisper support brings speech-to-text into TrustformeRS — load it exactly like a text model:
```
let model = AutoModel::from_pretrained("openai/whisper-small")?;
```
Code and diffusion too. StarCoder2 covers code, and SD3 brings a diffusion architecture under the same router.
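
To put the long-context tip in numbers, the sketch below compares how a quadratic cost and a linear cost grow with sequence length. It counts abstract operations only: the function names are hypothetical, and the model ignores constants, hidden sizes, and the logarithmic factors that some of these architectures carry. Read it as an order-of-magnitude illustration, not a benchmark.

```rust
/// Pairwise token interactions in full self-attention: N * N.
pub fn quadratic_cost(seq_len: u64) -> u64 {
    seq_len * seq_len
}

/// Steps in a single linear-time pass over the sequence: N.
pub fn linear_cost(seq_len: u64) -> u64 {
    seq_len
}

/// How many times more work the quadratic path does at this length.
/// Returns `None` for an empty sequence to avoid dividing by zero.
pub fn cost_ratio(seq_len: u64) -> Option<u64> {
    quadratic_cost(seq_len).checked_div(linear_cost(seq_len))
}
```

At 4,096 tokens the quadratic count is 16,777,216 against 4,096 linear steps. At 32,768 tokens it is 1,073,741,824 against 32,768, so the gap widens in proportion to the sequence length.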

This is the foundation

TrustformeRS 0.1.1 fits the COOLJAPAN ecosystem as of late April 2026: built on SciRS2 0.4.2 plus OxiBLAS, Oxicode, and OxiARC for its numerical and supply-chain layers, with oxionnx now powering the ONNX path. It pairs naturally with OxiCUDA (the Pure Rust CUDA toolkit replacement, shipped 2026-04-13) when you want GPU compute, and it sits beside OxiLLaMa, ToRSh, SkleaRS, and TenfloweRS in the COOLJAPAN ML stack. The codebase is kept under the 2000-line policy with SplitRS.

Repository: https://github.com/cool-japan/trustformers

Star the repo if a Pure Rust transformer stack (49+ architectures, no Python, no C++ runtime) is the kind of foundation you want to build on. Sovereign inference, all the way down.

KitaSan at COOLJAPAN OÜ, April 27, 2026