The data archiving and compression foundation of the COOLJAPAN ecosystem just opened a new chapter.
Today we released OxiArc 0.3.0 — a minor bump dedicated to one theme: squeezing more compression ratio out of every codec while keeping memory under control.
No C. No Fortran. No zlib. No libarchive. No external shared libraries.
No FFI overhead. No build hell.
Just clean, memory-safe, blazing-fast archiving and compression that compiles to a single static binary and runs everywhere, including WASM.
Why the 0.3 line matters
The 0.2 series made OxiArc broad and operationally polished. The 0.3 line makes it tight. Across the board, the codecs learned to either compress harder or use less memory — and in several cases, both:
- Optimal DEFLATE — a Zopfli-style, graph-based parser that produces smaller output than greedy or lazy matching.
- An LZMA BT4 match finder — bringing level-9 compression quality up to par with the LZMA SDK.
- True bounded-memory streaming — LZ4 now compresses and decompresses block-at-a-time, with an explicit memory budget.
- Parallel Snappy — rayon-based frame compression that stays fully format-compatible.
- Memory-mapped archive access — zero-copy reads of large archives.
OxiArc is API-stable as of v0.3.0: ~70,968 lines of pure Rust across 227 files and 13 crates, with 1,640 tests passing in total (44 new this release, 3 skipped) and zero warnings.
What’s new
Optimal DEFLATE (oxiarc-deflate). A new OptimalParser implements Zopfli-style compression: an iterative shortest-path dynamic program over the match graph, with per-pass Huffman cost retraining so each iteration prices matches against the costs the previous pass discovered. It’s opt-in via Deflater::with_optimal_parsing(level) — you trade extra CPU for smaller output than greedy or lazy parsing. The work added cost_table_from_lengths, cost_of_match, and find_all_matches.
LZMA BT4 match finder (oxiarc-lzma). Level 9 now uses a binary-tree match finder, Bt4MatchFinder, with a 3-table hash (h2/h3/h4), a cyclic-buffer BST, and a configurable cut_value depth limit. A new MatchFinder trait abstracts the two strategies: HashChainMatchFinder for levels 0–8 and Bt4MatchFinder for level 9. With the binary-tree search in place, level 9 now delivers compression quality matching the LZMA SDK.
True bounded-memory streaming (oxiarc-lz4). Lz4Compressor emits complete blocks on the fly instead of buffering the whole input, and Lz4Decompressor is now a state-machine parser that processes one block at a time. Both gain with_memory_budget(usize) so you can put a hard ceiling on peak memory regardless of input size.
Parallel Snappy (oxiarc-snappy). A new compress_parallel function — behind a parallel feature flag — performs rayon-based chunk-level parallelism mirroring the LZ4 parallel encoder. The output is fully compatible with the serial FrameDecoder, so parallelism is a pure speed knob with no format cost.
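Why chunk-level parallelism carries no format cost can be shown with a small sketch. It is not OxiArc's compress_parallel: it uses std::thread::scope instead of rayon so that it runs with the standard library alone, and encode_chunk is a toy length-prefixed copy standing in for a real Snappy chunk encoder. All function names here are hypothetical.

```rust
use std::thread;

/// Toy per-chunk encoder: a 4-byte length prefix followed by the raw bytes.
/// It stands in for a real Snappy chunk encoder, which this sketch omits.
fn encode_chunk(chunk: &[u8]) -> Vec<u8> {
    let mut out = (chunk.len() as u32).to_le_bytes().to_vec();
    out.extend_from_slice(chunk);
    out
}

/// Serial reference: encode each chunk in order and concatenate.
/// `chunk_size` must be non-zero (`chunks` panics on zero).
pub fn encode_serial(input: &[u8], chunk_size: usize) -> Vec<u8> {
    input.chunks(chunk_size).flat_map(|c| encode_chunk(c)).collect()
}

/// Parallel variant: one scoped thread per chunk, joined in input order.
/// Chunks are encoded independently, so the bytes match the serial output.
pub fn encode_parallel(input: &[u8], chunk_size: usize) -> Vec<u8> {
    let parts: Vec<Vec<u8>> = thread::scope(|s| {
        let handles: Vec<_> = input
            .chunks(chunk_size)
            .map(|c| s.spawn(move || encode_chunk(c)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect::<Vec<Vec<u8>>>()
    });
    parts.concat()
}
```

Spawning one thread per chunk is only reasonable for a demo; a work-stealing pool such as the one rayon provides bounds the thread count for large inputs.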
Optimal LZSS for LZH (oxiarc-lzhuf). The match finder swaps its old 3-byte hash for a 4-byte multiplicative hash (better avalanche, fewer collisions), and a new LzssOptimalParser brings two-pass optimal LZSS parsing with Huffman-cost retraining, exposed via LzhEncoder::with_optimal().
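For readers unfamiliar with multiplicative hashing, here is a minimal sketch of a 4-byte variant. The function name hash4, the table-size parameter, and the multiplier (a commonly used odd constant close to 2^32 divided by the golden ratio) are assumptions for illustration; the constant OxiArc actually uses is not stated above.

```rust
/// Hash the first four bytes of `window` into a table index of `bits` bits.
/// Returns `None` when fewer than four bytes are available.
/// The multiplier is an assumption for illustration, not OxiArc's constant.
pub fn hash4(window: &[u8], bits: u32) -> Option<usize> {
    assert!(bits >= 1 && bits <= 32, "bits must be in 1..=32");
    let b = window.get(..4)?;
    let v = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
    // Multiply with wraparound, then keep the top `bits` bits: the high bits
    // of the product depend on all four input bytes.
    Some((v.wrapping_mul(2_654_435_761) >> (32 - bits)) as usize)
}
```

Taking the high bits of the product rather than the low ones is what gives the mixing: the low bits of a product depend only on the low bits of the input.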
Memory-mapped access (oxiarc-core). A new MappedFile primitive provides read-only memory-mapped files (memmap2-backed, behind an mmap feature flag). It implements Deref<Target=[u8]> and AsRef<[u8]>, so large archives can be read with zero copies.
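The value of implementing Deref<Target=[u8]> and AsRef<[u8]> is that any code written against byte slices accepts the mapping directly. The sketch below illustrates the pattern with a hypothetical ReadOnlyBytes wrapper backed by a Vec<u8>; a real mapped file would hold an OS mapping instead, and this is not MappedFile's actual definition.

```rust
use std::ops::Deref;

/// Hypothetical stand-in for a read-only mapping. A real implementation
/// would own a memory map; a `Vec<u8>` plays that role here so the sketch
/// runs with the standard library alone.
pub struct ReadOnlyBytes {
    backing: Vec<u8>,
}

impl ReadOnlyBytes {
    pub fn new(backing: Vec<u8>) -> Self {
        ReadOnlyBytes { backing }
    }
}

impl Deref for ReadOnlyBytes {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        &self.backing
    }
}

impl AsRef<[u8]> for ReadOnlyBytes {
    fn as_ref(&self) -> &[u8] {
        &self.backing
    }
}

/// A function written against `&[u8]`: checks for the ZIP local file
/// header signature. It accepts the wrapper by reference, with no copy.
pub fn has_zip_signature(data: &[u8]) -> bool {
    data.starts_with(b"PK\x03\x04")
}
```

With these two impls, has_zip_signature(&file), file.len(), and &file[..4] all work on the wrapper as if it were a slice.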
A behavioral change worth noting
The simd feature on oxiarc-core is now a deprecated no-op. SIMD is auto-enabled via cfg(target_arch) on x86_64 and aarch64, so there is nothing to opt into. Existing builds that pass features = ["simd"] still compile — the flag is kept as a backward-compatible alias and will be removed in a future minor. See the migration note in Tips below.
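As a rough illustration of architecture-based selection, the sketch below picks an implementation purely from cfg(target_arch), with no Cargo feature involved. The function name backend and its return values are hypothetical and say nothing about how oxiarc-core structures its own SIMD paths.

```rust
/// Selected at compile time on the two architectures named above.
#[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]
pub fn backend() -> &'static str {
    "simd"
}

/// Fallback for every other target.
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
pub fn backend() -> &'static str {
    "scalar"
}
```

Because the choice is made by the compiler from the target triple, a leftover feature flag in a dependent crate's manifest has nothing to switch.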
As always: zero warnings and full COOLJAPAN policy compliance.
Technical Deep Dive: optimal parsing and the BT4 tree
Greedy and lazy DEFLATE parsers make local decisions: take the longest match here, or peek one byte ahead. Optimal parsing reframes the whole job as a shortest-path problem. OptimalParser builds the graph of every viable match (find_all_matches), prices each edge with the current Huffman cost model (cost_of_match, cost_table_from_lengths), and solves for the cheapest path through the input — then retrains the cost model on that solution and runs the pass again. Each iteration’s prices reflect the symbol distribution the last iteration actually produced, which is exactly the Zopfli insight that wins those last few percent of ratio. The same two-pass, cost-retraining idea drives LzssOptimalParser in oxiarc-lzhuf.
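The shortest-path framing can be sketched in a few dozen lines. This is a simplified illustration, not OptimalParser itself: it uses flat, assumed bit prices (LITERAL_BITS, MATCH_BITS) instead of retrained Huffman costs, runs a single pass, and finds matches by brute force. All names in it are hypothetical.

```rust
/// One step of a parse: a literal byte or a back-reference.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Token {
    Literal(u8),
    Match { len: usize, dist: usize },
}

const MIN_MATCH: usize = 3;
// Assumed flat prices in bits. A real encoder would derive per-symbol
// prices from Huffman code lengths and refresh them after each pass.
const LITERAL_BITS: u32 = 9;
const MATCH_BITS: u32 = 20;

/// Longest match at `pos` for every distance (brute force, demo only).
fn matches_at(data: &[u8], pos: usize) -> Vec<(usize, usize)> {
    let mut found = Vec::new();
    for dist in 1..=pos {
        let len = data[pos..]
            .iter()
            .zip(&data[pos - dist..])
            .take_while(|(a, b)| a == b)
            .count();
        if len >= MIN_MATCH {
            found.push((len, dist));
        }
    }
    found
}

/// Cheapest token sequence under the flat price model above.
pub fn optimal_parse(data: &[u8]) -> Vec<Token> {
    let n = data.len();
    // cost[i] = cheapest known price to encode data[..i].
    let mut cost = vec![u32::MAX; n + 1];
    // step[i] = the token that ends at i on that cheapest path.
    let mut step: Vec<Option<Token>> = vec![None; n + 1];
    cost[0] = 0;
    // Edges only point forward, so one left-to-right pass relaxes them all.
    for pos in 0..n {
        let base = cost[pos];
        if base + LITERAL_BITS < cost[pos + 1] {
            cost[pos + 1] = base + LITERAL_BITS;
            step[pos + 1] = Some(Token::Literal(data[pos]));
        }
        for (max_len, dist) in matches_at(data, pos) {
            for len in MIN_MATCH..=max_len {
                if base + MATCH_BITS < cost[pos + len] {
                    cost[pos + len] = base + MATCH_BITS;
                    step[pos + len] = Some(Token::Match { len, dist });
                }
            }
        }
    }
    // Walk the recorded steps backwards from the end, then reverse.
    let mut tokens = Vec::new();
    let mut i = n;
    while i > 0 {
        let t = step[i].expect("every position is reachable via literals");
        i -= match t {
            Token::Literal(_) => 1,
            Token::Match { len, .. } => len,
        };
        tokens.push(t);
    }
    tokens.reverse();
    tokens
}

/// Decode tokens back to bytes; overlapping matches copy byte by byte.
pub fn expand(tokens: &[Token]) -> Vec<u8> {
    let mut out = Vec::new();
    for t in tokens {
        match *t {
            Token::Literal(b) => out.push(b),
            Token::Match { len, dist } => {
                for _ in 0..len {
                    let b = out[out.len() - dist];
                    out.push(b);
                }
            }
        }
    }
    out
}
```

On the input "abcabcabcabc" this yields three literals followed by one match of length 9 at distance 3. Retraining would replace the two constants with a price table rebuilt from the token frequencies of the previous pass.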
LZMA’s quality ceiling, meanwhile, has always been about match-finder reach. The new Bt4MatchFinder maintains a binary search tree inside a cyclic buffer, indexed by a 3-table hash (h2/h3/h4) so short and long matches are found through the right hash width. The cut_value knob caps how deep the tree is walked, trading search effort for speed. Abstracting both finders behind a MatchFinder trait keeps the hash-chain path (levels 0–8) for fast modes while letting level 9 spend the extra cycles to match SDK-grade output.
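To make the trait abstraction and the depth cap concrete, here is a small sketch of a match-finder trait with a hash-chain implementation whose lookup is bounded by a cut_value-style limit. It is deliberately not a BT4 tree, and the trait shape, type names, hash width, and multiplier are all assumptions rather than OxiArc's actual definitions.

```rust
/// Hypothetical trait shape; the real MatchFinder signature may differ.
pub trait SketchMatchFinder {
    /// Return the best (len, dist) match at `pos`, then index `pos`.
    fn find_and_insert(&mut self, data: &[u8], pos: usize) -> Option<(usize, usize)>;
}

const HASH_BITS: u32 = 12;
const MIN_MATCH: usize = 3;
const NONE: usize = usize::MAX;

/// Hash-chain finder over a buffer whose length is known up front
/// (a real finder would use a cyclic buffer instead).
pub struct SketchHashChain {
    head: Vec<usize>, // most recent position per hash bucket
    prev: Vec<usize>, // previous position in the same bucket, per position
    cut_value: usize, // maximum chain links followed per lookup
}

impl SketchHashChain {
    pub fn new(input_len: usize, cut_value: usize) -> Self {
        SketchHashChain {
            head: vec![NONE; 1 << HASH_BITS],
            prev: vec![NONE; input_len],
            cut_value,
        }
    }

    /// 3-byte multiplicative hash; the caller guarantees 3 bytes exist.
    fn hash(data: &[u8], pos: usize) -> usize {
        let v = u32::from(data[pos])
            | (u32::from(data[pos + 1]) << 8)
            | (u32::from(data[pos + 2]) << 16);
        (v.wrapping_mul(2_654_435_761) >> (32 - HASH_BITS)) as usize
    }
}

impl SketchMatchFinder for SketchHashChain {
    fn find_and_insert(&mut self, data: &[u8], pos: usize) -> Option<(usize, usize)> {
        if pos + MIN_MATCH > data.len() {
            return None;
        }
        let h = Self::hash(data, pos);
        let mut best: Option<(usize, usize)> = None;
        let mut cand = self.head[h];
        let mut steps = 0;
        // Walk the chain newest-first, stopping at the depth cap.
        while cand != NONE && steps < self.cut_value {
            let len = data[pos..]
                .iter()
                .zip(&data[cand..])
                .take_while(|(a, b)| a == b)
                .count();
            if len >= MIN_MATCH && best.map_or(true, |(l, _)| len > l) {
                best = Some((len, pos - cand));
            }
            cand = self.prev[cand];
            steps += 1;
        }
        // Link this position in as the new head of its bucket.
        self.prev[pos] = self.head[h];
        self.head[h] = pos;
        best
    }
}
```

An encoder written against the trait can swap in a more thorough finder for its highest level without touching the parsing loop, and raising or lowering the cap trades search effort for speed.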
On the memory side, the LZ4 rewrite is structural: by making the compressor emit complete blocks as it goes and the decompressor a one-block-at-a-time state machine, neither needs the full input resident. with_memory_budget then turns that property into a guarantee. Pair it with MappedFile for zero-copy reads, and you can stream a multi-gigabyte archive through a fixed, modest memory envelope.
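A one-block-at-a-time state machine can be sketched as follows. The framing here is a toy (a 4-byte little-endian length followed by the block bytes), not the LZ4 frame format, and BlockReader, FeedError, and this with_memory_budget are hypothetical names that only mirror the idea described above.

```rust
/// Parser state: collecting the 4-byte length header, or a block body.
enum State {
    Header { buf: [u8; 4], filled: usize },
    Body { remaining: usize },
}

#[derive(Debug, PartialEq)]
pub enum FeedError {
    /// A block declared a length above the configured budget.
    BlockTooLarge(usize),
}

/// Incremental reader for a toy framing: u32 LE length, then that many bytes.
pub struct BlockReader {
    state: State,
    block: Vec<u8>,   // holds at most one block at a time
    max_block: usize, // budget for a single block, in bytes
}

impl BlockReader {
    pub fn with_memory_budget(max_block: usize) -> Self {
        BlockReader {
            state: State::Header { buf: [0; 4], filled: 0 },
            block: Vec::new(),
            max_block,
        }
    }

    /// Feed any slice of input; `on_block` runs once per completed block.
    pub fn feed<F: FnMut(&[u8])>(
        &mut self,
        mut input: &[u8],
        mut on_block: F,
    ) -> Result<(), FeedError> {
        while !input.is_empty() {
            match &mut self.state {
                State::Header { buf, filled } => {
                    let n = (4 - *filled).min(input.len());
                    buf[*filled..*filled + n].copy_from_slice(&input[..n]);
                    *filled += n;
                    input = &input[n..];
                    if *filled == 4 {
                        let len = u32::from_le_bytes(*buf) as usize;
                        if len > self.max_block {
                            return Err(FeedError::BlockTooLarge(len));
                        }
                        if len == 0 {
                            // Empty block: report it and await the next header.
                            on_block(&self.block);
                            *filled = 0;
                        } else {
                            self.state = State::Body { remaining: len };
                        }
                    }
                }
                State::Body { remaining } => {
                    let n = (*remaining).min(input.len());
                    self.block.extend_from_slice(&input[..n]);
                    *remaining -= n;
                    input = &input[n..];
                    if *remaining == 0 {
                        on_block(&self.block);
                        self.block.clear();
                        self.state = State::Header { buf: [0; 4], filled: 0 };
                    }
                }
            }
        }
        Ok(())
    }
}
```

Because the reader never buffers more than one block and rejects any block above the budget before allocating for it, peak memory is bounded by the budget no matter how the input is sliced or how long the stream runs.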
Getting Started
Install the CLI:
cargo install oxiarc-cli
For the library, add the archive crate plus the codec you need. The basic DEFLATE + ZIP workflow is unchanged:
cargo add oxiarc-archive
cargo add oxiarc-deflate
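use oxiarc_archive::ZipReader;
use oxiarc_deflate::{deflate, inflate};
use std::fs::File;
// Assumes the oxiarc error types implement std::error::Error, so `?` can box them.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Compress a buffer at level 6, then decompress it again.
    let compressed = deflate(b"Hello, World!", 6)?;
    let decompressed = inflate(&compressed)?;
    // Open a ZIP archive and list its entries.
    let file = File::open("archive.zip")?;
    let mut zip = ZipReader::new(file)?;
    for entry in zip.entries() {
        println!("{}: {} bytes", entry.name, entry.size);
    }
    Ok(())
}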
The new ratio and memory knobs are opt-in. For maximum-ratio DEFLATE, construct the encoder with Deflater::with_optimal_parsing(level), and enable the new feature flags only when you need them:
cargo add oxiarc-core --features mmap
cargo add oxiarc-snappy --features parallel
What’s New in 0.3.0
- Zopfli-style optimal DEFLATE parser (OptimalParser), opt-in via Deflater::with_optimal_parsing(level).
- LZMA BT4 match finder (Bt4MatchFinder); level 9 now matches LZMA SDK compression quality.
- True bounded-memory LZ4 streaming with with_memory_budget(usize) on compressor and decompressor.
- Parallel Snappy frame compression (compress_parallel) behind the parallel feature; output stays standard-compatible.
- Optimal LZSS for LZH (LzssOptimalParser, LzhEncoder::with_optimal()) plus a 4-byte multiplicative match hash.
- MappedFile zero-copy memory-mapped archive access behind the mmap feature.
- The simd feature on oxiarc-core is now a deprecated no-op (SIMD is automatic); 1,640 tests, zero warnings.
Tips
- Squeeze maximum ratio with Deflater::with_optimal_parsing(9), or LzhEncoder::with_optimal() for LZH, when output size matters more than CPU time.
- Reach for LZMA level 9 now that the BT4 match finder brings it to LZMA SDK quality; it’s the right default when you’re optimizing purely for ratio.
- Bound peak memory on huge streams with Lz4Compressor::with_memory_budget(...) (and the matching decompressor) to stream arbitrarily large input through a fixed envelope.
- Turn on multicore Snappy with the parallel feature and compress_parallel; the output still decodes with the standard serial FrameDecoder.
- Read large archives with zero copies using MappedFile behind the mmap feature, ideal when paired with bounded-memory streaming.
- Migration: drop any features = ["simd"] from your oxiarc-core dependency. SIMD is now enabled automatically on x86_64 and aarch64, and the flag is a deprecated no-op kept only as a backward-compatible alias; it will be removed in a future minor.
This is the foundation
OxiArc is the sovereign archiving and compression backend for the wider COOLJAPAN stack, and tighter ratios plus disciplined memory directly benefit the projects that build on it:
- OxiMedia — video/image asset packaging and distribution
- SciRS2 / NumRS2 — dataset compression and long-term storage
- OxiGDAL — geospatial file archiving
- ToRSh / OxiRAG — high-throughput data ingestion pipelines
- RusMES — mail attachment handling
Repository: https://github.com/cool-japan/oxiarc
Star the repo if you want high-performance archiving and compression — now with optimal parsing, an SDK-grade LZMA, and bounded-memory streaming — and none of the traditional native toolchain headaches.
The era of “just use zlib” or “link libarchive” is coming to an end.
— KitaSan at COOLJAPAN OÜ May 17, 2026