An archiver earns trust one corner case at a time. Today we closed a sharp one.
Today we released OxiArc 0.3.3 — a correctness release in which Brotli now round-trips high-entropy and incompressible data byte-for-byte across every quality level from 1 to 11.
No C. No Fortran. No zlib. No libarchive. No external shared libraries. No FFI overhead. No build hell. Just clean, memory-safe, blazing-fast archiving and compression that compiles to a single static binary, targets WASM, and runs everywhere.
OxiArc is the Pure Rust replacement for the zip, tar, gzip, zstd, and 7-zip tools — and for the Rust crates zip, flate2, zstd, bzip2, lz4, tar, snap, brotli, and miniz_oxide.
Why 0.3.3 matters
Compressors meet two kinds of data: the kind that shrinks, and the kind that does not. Already-compressed files, encrypted blobs, and truly random bytes do not shrink — and a serious codec must still round-trip them faithfully. Previously, oxiarc-brotli could fail to decode near-uniform or incompressible inputs. 0.3.3 fixes that. It is a real robustness win, not a new feature, and it is on-the-wire compatible — the fix is entirely internal, with no API change, so streams you already produced keep working.
Two underlying bugs were behind the failures:
- Incomplete length-limited Huffman codes. For near-uniform literal distributions where every symbol is present, the old compute_code_lengths heuristic (ideal ceil(-log2 p) lengths plus a Kraft-inequality fix-up loop) could emit a code-length table whose Kraft sum was strictly below 2^15. That is an incomplete prefix code: the decoder would eventually hit a bit pattern that decoded to no symbol at all (“invalid Huffman code: no matching code found”).
- Insert lengths above 319 silently truncated. A single incompressible meta-block is one insert-and-copy command whose insert length spans the entire block. But the encoder only had insert-length categories 0–15 (max base 192, i.e. insert length ≤ 319), and it wrote the excess into a 7-bit field that wrapped around, desynchronising the decoder and producing a content mismatch.
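Both failure modes are easy to reproduce in miniature. The sketch below is not OxiArc code: kraft_sum, is_complete, and write_extra_bits are hypothetical helpers, and the 4096-byte insert length is an arbitrary illustrative value. Only the 15-bit limit, the base of 192, and the 7-bit field come from the description above.

```rust
/// Kraft sum of a code-length table, scaled by 2^max_len so it stays an integer.
/// A length of 0 means "symbol unused"; every other length must be <= max_len.
fn kraft_sum(lengths: &[u8], max_len: u8) -> u64 {
    lengths
        .iter()
        .filter(|&&l| l > 0)
        .map(|&l| 1u64 << (max_len - l))
        .sum()
}

/// A prefix code is complete exactly when the scaled Kraft sum equals 2^max_len.
fn is_complete(lengths: &[u8], max_len: u8) -> bool {
    kraft_sum(lengths, max_len) == (1u64 << max_len)
}

/// Writing a value into a fixed-width bit field keeps only the low `width` bits.
fn write_extra_bits(value: u32, width: u32) -> u32 {
    value & ((1u32 << width) - 1)
}

fn main() {
    const MAX_LEN: u8 = 15;

    // Complete: 1/2 + 1/4 + 1/8 + 1/8 = 1.
    assert!(is_complete(&[1, 2, 3, 3], MAX_LEN));

    // Incomplete: 1/2 + 1/4 + 1/8 + 1/16 < 1, so some bit patterns match no symbol.
    let short = [1u8, 2, 3, 4];
    assert!(kraft_sum(&short, MAX_LEN) < (1u64 << MAX_LEN));
    assert!(!is_complete(&short, MAX_LEN));

    // Insert-length wrap: base 192 with 7 extra bits can only express 192..=319.
    let (base, width) = (192u32, 7u32);
    let insert_len = 4096u32; // illustrative length of one incompressible block
    let stored = write_extra_bits(insert_len - base, width);
    let decoded = base + stored;
    println!("encoder meant {}, decoder reads {}", insert_len, decoded);
    assert_ne!(decoded, insert_len);
}
```

In this toy setup the decoder reconstructs 256 instead of 4096, which is the kind of silent disagreement that surfaces later as a content mismatch.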
A focused technical note on the fixes
The repairs go to the root of each bug rather than papering over the symptoms:
- compute_code_lengths now uses the package-merge algorithm (Larmore–Hirschberg) instead of the ceil(-log2 p) heuristic. Package-merge always produces a complete and length-optimal (minimum-redundancy) length-limited code. The change adds package_merge_lengths and is_complete_code, the latter a Kraft-sum invariant checked in a debug assertion.
- The insert-length code table is unified into a single source of truth, insert_length_code_info(cat) -> (base, extra_bits), shared by encoder and decoder. Insert categories are extended from 15 up to MAX_INSERT_LENGTH_CATEGORY = 40, covering inserts up to roughly 4 MiB, via the extended insert-and-copy symbols 128–703. The decoder’s previously split short and extended functions collapse into a single decode_insert_length driven by that shared table, so encoder and decoder agree across the full insert-length range by construction.
- A new high_entropy_roundtrip.rs regression suite locks the guarantee in. It asserts byte-for-byte round-trips across quality levels 1–11 for random 4 KiB and 64 KiB data, an incompressible counter sequence, all-distinct-byte and all-same-byte blocks, the empty input, varied sizes, and mixed compressible / incompressible content, plus a decode_insert_length ↔ insert_length_code_info inverse-check unit test over categories 0–40.
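For readers who have not met package-merge before, here is a compact sketch of the idea. It is a teaching version, not the OxiArc implementation: package_merge_sketch and kraft_complete are hypothetical names, frequencies are assumed to be plain u64 counts, and the item lists are cloned freely for clarity rather than speed.

```rust
/// Length-limited code lengths via a simple (not memory-optimal) package-merge.
/// `freqs[i] == 0` means symbol `i` is unused and gets length 0.
fn package_merge_sketch(freqs: &[u64], max_len: u8) -> Vec<u8> {
    let mut lengths = vec![0u8; freqs.len()];

    // One leaf per used symbol: (weight, symbols covered), sorted by weight.
    let mut leaves: Vec<(u64, Vec<usize>)> = freqs
        .iter()
        .enumerate()
        .filter(|&(_, &f)| f > 0)
        .map(|(i, &f)| (f, vec![i]))
        .collect();
    leaves.sort_by_key(|item| item.0);

    let n = leaves.len();
    if n == 0 {
        return lengths;
    }
    if n == 1 {
        lengths[leaves[0].1[0]] = 1; // a lone symbol still gets one bit
        return lengths;
    }
    assert!((1..=32).contains(&max_len), "max_len out of range for this sketch");
    assert!((n as u64) <= (1u64 << max_len), "too many symbols for max_len");

    // Start at the deepest level, then package and merge once per shallower level.
    let mut items = leaves.clone();
    for _ in 1..max_len {
        // Package: pair adjacent items; an unpaired last item is dropped.
        let mut next: Vec<(u64, Vec<usize>)> = items
            .chunks_exact(2)
            .map(|pair| {
                let mut symbols = pair[0].1.clone();
                symbols.extend_from_slice(&pair[1].1);
                (pair[0].0 + pair[1].0, symbols)
            })
            .collect();
        // Merge: add a fresh copy of the leaves and restore weight order.
        next.extend(leaves.iter().cloned());
        next.sort_by_key(|item| item.0);
        items = next;
    }

    // Each appearance of a symbol in the 2n - 2 cheapest items adds one bit.
    for (_, symbols) in items.iter().take(2 * n - 2) {
        for &s in symbols {
            lengths[s] += 1;
        }
    }
    lengths
}

/// Kraft-sum completeness check, scaled by 2^max_len to stay in integers.
fn kraft_complete(lengths: &[u8], max_len: u8) -> bool {
    let sum: u64 = lengths
        .iter()
        .filter(|&&l| l > 0)
        .map(|&l| 1u64 << (max_len - l))
        .sum();
    sum == (1u64 << max_len)
}

fn main() {
    // Near-uniform literals: all 256 byte values present with similar counts.
    let freqs: Vec<u64> = (0..256u64).map(|i| 100 + i % 3).collect();
    let lengths = package_merge_sketch(&freqs, 15);
    assert!(kraft_complete(&lengths, 15));

    // A skewed distribution forced down to a 4-bit limit stays complete.
    let skewed = package_merge_sketch(&[1, 1, 2, 4, 8, 16, 32, 64], 4);
    assert!(skewed.iter().all(|&l| (1..=4).contains(&l)));
    assert!(kraft_complete(&skewed, 4));
    println!("{:?}", skewed);
}
```

The point of the exercise is the final assertion: there is no fix-up loop to get wrong, because completeness falls out of how the items are selected.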
The result: 1,679 tests passing, 2 skipped (13 new), zero Clippy warnings under -D warnings, zero rustdoc warnings, all COOLJAPAN policies compliant.
Getting Started
CLI:
cargo install oxiarc-cli
Library:
cargo add oxiarc-brotli
The fix needs nothing from you — it is internal. Existing OxiArc code keeps working unchanged. Here is the basic library shape, compressing a buffer and walking a ZIP archive:
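use oxiarc_archive::ZipReader;
use oxiarc_deflate::{deflate, inflate};
use std::fs::File;

// Boxing the error assumes the OxiArc error types implement std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Compress a buffer at level 6, then restore it.
    let compressed = deflate(b"Hello, World!", 6)?;
    let decompressed = inflate(&compressed)?;
    println!("{} bytes restored", decompressed.len());

    // Open a ZIP archive and list its entries.
    let file = File::open("archive.zip")?;
    let mut zip = ZipReader::new(file)?;
    for entry in zip.entries() {
        println!("{}: {} bytes", entry.name, entry.size);
    }
    Ok(())
}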
On the Brotli side, you compress at a quality level from 0 to 11 just as before. The new code generator and extended insert-length table make the output correct on every input, including data that does not compress; the regression suite verifies this for levels 1 through 11.
What’s New in 0.3.3
- Brotli now round-trips high-entropy and incompressible data byte-for-byte across all quality levels 1–11; previously, near-uniform or incompressible inputs could fail to decode.
- The Huffman code-length generator switched to package-merge (Larmore–Hirschberg), always producing complete, minimum-redundancy length-limited codes, with new package_merge_lengths and is_complete_code helpers.
- A unified insert_length_code_info table shared by encoder and decoder, with insert categories extended up to MAX_INSERT_LENGTH_CATEGORY = 40 (inserts up to ~4 MiB) and a single decode_insert_length path.
- A new high_entropy_roundtrip.rs regression suite plus an insert-length inverse-check test over categories 0–40.
- The fix is internal and on-the-wire compatible, with no API change.
- 1,679 tests passing, 2 skipped (13 new), zero Clippy and rustdoc warnings, all policies compliant.
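The "single source of truth" design for the insert-length table is worth a small illustration. The sketch below uses a made-up eight-category table. EXTRA_BITS, code_info, encode_length, and decode_length are hypothetical, and the bit widths are neither OxiArc's nor the Brotli format's real values. What it shows is the shape: both directions read one table, and an inverse check walks every category.

```rust
/// Hypothetical extra-bit widths per category (illustrative values only).
const EXTRA_BITS: [u32; 8] = [0, 0, 1, 2, 3, 5, 8, 12];

/// Single source of truth: (base, extra_bits) for a category.
/// Bases are derived from the widths, so the ranges are contiguous by construction.
fn code_info(cat: usize) -> (u32, u32) {
    let base: u32 = EXTRA_BITS[..cat].iter().map(|&bits| 1u32 << bits).sum();
    (base, EXTRA_BITS[cat])
}

/// Encoder side: map a length to (category, extra value), or None if it does not fit.
fn encode_length(len: u32) -> Option<(usize, u32)> {
    (0..EXTRA_BITS.len()).find_map(|cat| {
        let (base, bits) = code_info(cat);
        let span = 1u32 << bits;
        (len >= base && len - base < span).then(|| (cat, len - base))
    })
}

/// Decoder side: rebuild the length from the same table.
fn decode_length(cat: usize, extra: u32) -> u32 {
    let (base, bits) = code_info(cat);
    debug_assert!(extra < (1u32 << bits), "extra value wider than the field");
    base + extra
}

fn main() {
    // Inverse check: every (category, extra) pair must survive a round trip.
    for cat in 0..EXTRA_BITS.len() {
        let (_, bits) = code_info(cat);
        for extra in 0..(1u32 << bits) {
            let len = decode_length(cat, extra);
            assert_eq!(encode_length(len), Some((cat, extra)));
        }
    }
    // Out-of-range lengths are rejected instead of being wrapped into a field.
    let (last_base, last_bits) = code_info(EXTRA_BITS.len() - 1);
    let max_len = last_base + (1u32 << last_bits) - 1;
    assert!(encode_length(max_len).is_some());
    assert_eq!(encode_length(max_len + 1), None);
    println!("max representable length in this toy table: {}", max_len);
}
```

Returning None for an unrepresentable length is a deliberate choice in this sketch: it turns a would-be silent truncation into something the caller has to handle.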
Tips
- If you produce or store Brotli streams of compressed, encrypted, or random data, upgrade to 0.3.3 — earlier versions could fail to decode incompressible inputs.
- No migration work is required. The fix is internal and on-the-wire compatible, so existing streams and existing code keep working with no API change.
- Mix freely: Brotli now safely handles mixed compressible and incompressible content within a single stream.
- Rely on completeness at every level: the new package-merge code generator guarantees complete prefix codes for quality levels 1 through 11.
- Trust the guarantee going forward: the high_entropy_roundtrip.rs suite pins byte-for-byte round-trips so this class of bug cannot quietly return.
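If you want a similar safety net in your own project, a round-trip harness is simple to write. The sketch below is generic on purpose and makes no assumption about the oxiarc-brotli API: check_roundtrips takes any compress and decompress closures, and the two toy codecs (stored_* and wrapping_*) are hypothetical stand-ins that exist only so the example runs on its own. The input set loosely mirrors the kinds of data listed for the regression suite.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates (seed must be non-zero).
fn pseudo_random_bytes(len: usize, mut state: u64) -> Vec<u8> {
    (0..len)
        .map(|_| {
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            (state >> 24) as u8
        })
        .collect()
}

/// Inputs that do not shrink well, plus the usual edge cases.
fn sample_inputs() -> Vec<Vec<u8>> {
    vec![
        pseudo_random_bytes(4 * 1024, 0x9E37_79B9_7F4A_7C15),
        pseudo_random_bytes(64 * 1024, 42),
        (0..1024u32).flat_map(|n| n.to_le_bytes()).collect(), // counter sequence
        (0..=255u8).collect(),                                // all distinct bytes
        vec![0xAA; 4096],                                     // all the same byte
        Vec::new(),                                           // empty input
    ]
}

/// Round-trips every input at quality levels 1..=11 and reports the first failure.
fn check_roundtrips<C, D>(compress: C, decompress: D, inputs: &[Vec<u8>]) -> Result<(), String>
where
    C: Fn(&[u8], u32) -> Vec<u8>,
    D: Fn(&[u8]) -> Option<Vec<u8>>,
{
    for quality in 1..=11u32 {
        for (index, input) in inputs.iter().enumerate() {
            let packed = compress(input, quality);
            match decompress(&packed) {
                Some(output) if output == *input => {}
                Some(_) => return Err(format!("content mismatch: input {}, quality {}", index, quality)),
                None => return Err(format!("decode failed: input {}, quality {}", index, quality)),
            }
        }
    }
    Ok(())
}

/// Stand-in codec: 4-byte little-endian length header followed by the raw bytes.
fn stored_compress(data: &[u8], _quality: u32) -> Vec<u8> {
    let mut out = (data.len() as u32).to_le_bytes().to_vec();
    out.extend_from_slice(data);
    out
}

fn stored_decompress(packed: &[u8]) -> Option<Vec<u8>> {
    if packed.len() < 4 {
        return None;
    }
    let len = u32::from_le_bytes([packed[0], packed[1], packed[2], packed[3]]) as usize;
    let body = &packed[4..];
    (body.len() == len).then(|| body.to_vec())
}

/// Deliberately broken stand-in: the length header keeps only the low 8 bits.
fn wrapping_compress(data: &[u8], _quality: u32) -> Vec<u8> {
    let mut out = vec![data.len() as u8];
    out.extend_from_slice(data);
    out
}

fn wrapping_decompress(packed: &[u8]) -> Option<Vec<u8>> {
    let (&len, body) = packed.split_first()?;
    body.get(..len as usize).map(|bytes| bytes.to_vec())
}

fn main() {
    let inputs = sample_inputs();
    assert!(check_roundtrips(stored_compress, stored_decompress, &inputs).is_ok());
    // The harness catches a length field that wraps around.
    let failure = check_roundtrips(wrapping_compress, wrapping_decompress, &inputs);
    println!("{:?}", failure);
    assert!(failure.is_err());
}
```

To use it against a real codec, pass closures that call your compressor and decompressor in place of the stand-ins.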
This is the foundation
This is the kind of correctness hardening you expect from an archiver the whole stack leans on. Incompressible payloads are routine: pre-compressed media in OxiMedia, encrypted attachments in RusMES, already-packed scientific arrays in SciRS2 / NumRS2, opaque geospatial tiles in OxiGDAL, and mixed-content ingestion in ToRSh / OxiRAG. Every one of them needs a codec that round-trips data that does not shrink. With 0.3.3, OxiArc’s Brotli does — byte-for-byte, at every quality level, in Pure Rust.
Repository: https://github.com/cool-japan/oxiarc
Star the repo if you want archiving and compression you can trust on every kind of input, without the traditional native toolchain headaches.
The era of “just use zlib” or “link libarchive” is coming to an end.
— KitaSan at COOLJAPAN OÜ June 6, 2026