Stable means stable — so the first thing after 0.1.0 is making the edges as trustworthy as the core.
Today we released sklears 0.1.1 — a correctness and stability patch that hardens clustering, streaming preprocessing, pipelines, and serialization across the pure-Rust scikit-learn surface.
sklears is the pure-Rust alternative to scikit-learn, and the sovereignty story is unchanged: No C. No Fortran. No Cython. No GIL. scikit-learn leans on a tower of compiled extensions; sklears is plain Rust on top of the SciRS2 stack, with OxiBLAS for BLAS/LAPACK and Oxicode for serialization. This release does not move that line — it polishes what already sits behind it.
Why 0.1.1 matters
0.1.0 was a big release: 36 crates, type-safe Untrained → Trained state machines, builder APIs, SIMD via std::simd, Rayon work-stealing parallelism, and >99% scikit-learn API coverage end-to-end against scikit-learn’s ~v1.5 feature set. Shipping that much surface in one stable cut means the next job is the hardening pass — finding the corners where behavior was subtly wrong and nailing them down. That is exactly what 0.1.1 is, and the test suite that guards it still stands at 11,586+ tests passing across 36 crates with coverage held at >99%.
Here is what got fixed, and why each one matters:
- HDBSCAN cluster persistence. Root-node detection and the order in which cluster-persistence values propagated were corrected. Persistence is what HDBSCAN uses to decide which clusters survive condensation, so getting the root and the propagation order right is the difference between trustworthy density clustering and quietly wrong labels.
- Streaming `Default` drift. `StreamingStandardScaler` and `StreamingSimpleImputer` had hand-written `Default` impls; they now use `#[derive(Default)]`. Manual defaults drift from the field defaults over time, so deriving them removes a whole class of “the constructed value didn’t match what I declared” bugs.
- Pipeline mutable access. `Pipeline::get_step_mut` had lifetime elision that didn’t borrow-check cleanly for `dyn PipelineStep`. Fixing the elision makes mutable access to a fitted step ergonomic instead of a fight with the borrow checker.
- Deterministic spectral graph clustering. `SpectralGraphConfig` was missing a `random_seed` field that its tests assumed; adding it makes spectral graph clustering reproducible run to run.
- Serialization and GPU-accel field fixes. Arrow `StringArray` collection from an `Option<&str>` iterator was corrected, and a struct field-name mismatch in `hardware_acceleration.rs` was fixed so the `GpuAcceleration` path compiles against the right names.
None of this adds an algorithm. All of it makes the algorithms you already had behave the way the docs promised.
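To make the HDBSCAN item concrete, here is a minimal, dependency-free sketch of the two things that fix touches: finding the root of a condensed tree and propagating stability from children to parents. It is not the sklears implementation. The `Edge` layout, the function names, and the toy tree are all assumptions made for illustration, and the stability formula is the textbook one (sum of `(lambda - lambda_birth) * size` over the rows leaving a cluster).

```rust
use std::collections::{HashMap, HashSet};

/// One row of a hypothetical condensed tree: `child` (a point or a
/// sub-cluster holding `size` points) leaves `parent` at density `lambda`.
#[derive(Clone, Copy, Debug)]
pub struct Edge {
    pub parent: usize,
    pub child: usize,
    pub lambda: f64,
    pub size: usize,
}

/// The root is the one node that appears as a parent but never as a child.
pub fn find_root(edges: &[Edge]) -> Option<usize> {
    let children: HashSet<usize> = edges.iter().map(|e| e.child).collect();
    edges.iter().map(|e| e.parent).find(|p| !children.contains(p))
}

/// Per-cluster stability: sum of (lambda - lambda_birth) * size over the
/// rows leaving that cluster, with lambda_birth = 0 for the root.
pub fn stabilities(edges: &[Edge]) -> HashMap<usize, f64> {
    let birth: HashMap<usize, f64> = edges.iter().map(|e| (e.child, e.lambda)).collect();
    let mut out: HashMap<usize, f64> = HashMap::new();
    for e in edges {
        let born = birth.get(&e.parent).copied().unwrap_or(0.0);
        *out.entry(e.parent).or_insert(0.0) += (e.lambda - born) * e.size as f64;
    }
    out
}

/// Pick clusters bottom-up: a cluster survives only if its own stability is
/// at least the combined stability of its sub-clusters. The root is never
/// selected here (an assumption mirroring the usual "no single cluster" default).
pub fn select_clusters(edges: &[Edge]) -> Vec<usize> {
    let Some(root) = find_root(edges) else { return Vec::new() };
    let mut stab = stabilities(edges);
    let mut kids: HashMap<usize, Vec<usize>> = HashMap::new();
    for e in edges {
        if stab.contains_key(&e.child) {
            kids.entry(e.parent).or_default().push(e.child);
        }
    }
    // Breadth-first from the root; walking it in reverse visits children first.
    let mut order = vec![root];
    let mut i = 0;
    while i < order.len() {
        let node = order[i];
        if let Some(cs) = kids.get(&node) {
            order.extend(cs.iter().copied());
        }
        i += 1;
    }
    let mut selected: HashSet<usize> = HashSet::new();
    for &c in order.iter().rev() {
        if c == root {
            continue;
        }
        let child_sum: f64 = kids.get(&c).map_or(0.0, |cs| cs.iter().map(|k| stab[k]).sum());
        if stab[&c] >= child_sum {
            // `c` beats its subtree: keep it and drop any descendants chosen earlier.
            let mut stack: Vec<usize> = kids.get(&c).cloned().unwrap_or_default();
            while let Some(d) = stack.pop() {
                selected.remove(&d);
                if let Some(cs) = kids.get(&d) {
                    stack.extend(cs.iter().copied());
                }
            }
            selected.insert(c);
        } else {
            // The subtree wins: pass its combined stability up to the parent.
            stab.insert(c, child_sum);
        }
    }
    let mut result: Vec<usize> = selected.into_iter().collect();
    result.sort_unstable();
    result
}

/// Toy tree: root 0 splits into clusters 1 and 2; cluster 2 splits into 3 and 4.
pub fn example_tree() -> Vec<Edge> {
    let e = |parent, child, lambda, size| Edge { parent, child, lambda, size };
    vec![
        // Leaf rows come first on purpose: row order must not affect the result.
        e(3, 20, 4.0, 1), e(3, 21, 4.0, 1),
        e(4, 22, 4.0, 1), e(4, 23, 4.0, 1),
        e(1, 10, 3.0, 1), e(1, 11, 3.0, 1), e(1, 12, 3.0, 1),
        e(2, 3, 1.2, 2), e(2, 4, 1.2, 2),
        e(0, 1, 1.0, 3), e(0, 2, 1.0, 4),
    ]
}
```

On the toy tree, `select_clusters(&example_tree())` keeps clusters 1, 3, and 4: cluster 2 is short-lived, so its two children outscore it. Misidentify the root or visit parents before children, and that comparison runs against stale numbers, which is the kind of quietly wrong labeling the fix is meant to rule out.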
What’s New in 0.1.1
In plain language, this is a bug-fix release plus the version bump — no new estimators:
- Clustering: HDBSCAN persistence extraction now detects the root correctly and propagates persistence in the right order.
- Streaming preprocessing: `StreamingStandardScaler` and `StreamingSimpleImputer` use derived `Default`; `StreamingSimpleImputer` uses the `?` operator for `Option` early-return.
- Pipelines: `get_step_mut` borrow-checks correctly for `dyn PipelineStep`.
- Graph clustering: `SpectralGraphConfig` gained `random_seed` for deterministic results.
- Serialization: Arrow `StringArray` collection from an `Option<&str>` iterator is fixed.
- GPU acceleration: field-name mismatch in `hardware_acceleration.rs` resolved.
- Dependencies: SciRS2 crates bumped to 0.4.2 (with oxicode 0.2 and oxifft 0.3.0 underneath).
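For a feel of what the streaming-preprocessing item looks like in practice, here is a small standalone sketch of a running scaler that combines a derived `Default` with `?` on `Option`. `RunningStats` and its methods are hypothetical names, not the sklears types, and the update rule is Welford's online algorithm, chosen here only because it is the standard way to track mean and variance one batch at a time.

```rust
/// Hypothetical running statistics for one feature; not a sklears type.
/// Deriving `Default` guarantees the empty state is all zeros, with no
/// hand-written impl that could drift from the fields.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct RunningStats {
    count: u64,
    mean: f64,
    m2: f64, // running sum of squared deviations from the mean
}

impl RunningStats {
    /// Fold one batch into the running mean and variance (Welford's update).
    pub fn partial_fit(&mut self, batch: &[f64]) {
        for &x in batch {
            self.count += 1;
            let delta = x - self.mean;
            self.mean += delta / self.count as f64;
            self.m2 += delta * (x - self.mean);
        }
    }

    /// Mean seen so far, or `None` before any data has arrived.
    pub fn mean(&self) -> Option<f64> {
        (self.count > 0).then_some(self.mean)
    }

    /// Population standard deviation, or `None` before any data has arrived.
    pub fn std_dev(&self) -> Option<f64> {
        self.mean()?; // early-return `None` when nothing has been seen
        Some((self.m2 / self.count as f64).sqrt())
    }

    /// Standardize `x`; each `?` returns `None` early if the scaler is unfitted.
    pub fn transform(&self, x: f64) -> Option<f64> {
        let mean = self.mean()?;
        let std = self.std_dev()?;
        // A constant feature has zero spread; map it to 0.0 instead of dividing by zero.
        Some(if std > 0.0 { (x - mean) / std } else { 0.0 })
    }
}
```

Feeding the same values in one batch or several should land on the same statistics up to floating-point rounding, which is what makes this shape of estimator usable on data that arrives in chunks.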
Getting Started
Install:
cargo add sklears
The example below exercises the two areas 0.1.1 hardened — density clustering (the HDBSCAN/persistence fix) and the pipeline path:
use sklears::prelude::*;
use sklears::cluster::HDBSCAN;

fn main() -> Result<()> {
    // Generate a small synthetic dataset
    let dataset = sklears::dataset::make_blobs(300, 2, 3, 0.6)?;
    // 0.1.1 fixes HDBSCAN cluster-persistence extraction (root detection + ordering)
    let labels = HDBSCAN::new()
        .min_cluster_size(10)
        .fit_predict(&dataset.data)?;
    // Skip negative (noise) labels, then count clusters from the largest label
    let n_clusters = labels.iter().filter(|&&l| l >= 0).max().map_or(0, |m| m + 1);
    println!("found {} clusters", n_clusters);
    Ok(())
}
The same release also smooths the preprocessing-into-model flow — a Pipeline chaining a StandardScaler into a LinearRegression, with get_step_mut now available to reach back into a fitted step:
use sklears::prelude::*;
use sklears::pipeline::Pipeline;

let mut pipe = Pipeline::new()
    .add_step("scale", StandardScaler::new())
    .add_step("model", LinearRegression::new());
pipe.fit(&x, &y)?;
// 0.1.1 fixes get_step_mut lifetime elision for `dyn PipelineStep`
if let Some(step) = pipe.get_step_mut("scale") {
    // mutate the fitted step in place
}
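If you are curious why a method like this can fight the borrow checker at all, here is a standalone sketch of one common cause. Whether it matches the exact bug fixed in sklears is an assumption; `Step`, `Scale`, and `MiniPipeline` are hypothetical stand-ins. Written as `Option<&mut dyn Step>`, the return type elides to `&'a mut (dyn Step + 'a)`, while a `Box<dyn Step>` holds a `dyn Step + 'static`. Because `&mut` is invariant, the two do not unify once wrapped in an `Option`, so the sketch spells the object bound out.

```rust
/// Hypothetical stand-in for a pipeline step trait; not the sklears trait.
pub trait Step {
    fn factor(&self) -> f64;
    fn set_factor(&mut self, factor: f64);
}

/// A toy step that stores a single scaling factor.
pub struct Scale {
    pub factor: f64,
}

impl Step for Scale {
    fn factor(&self) -> f64 {
        self.factor
    }
    fn set_factor(&mut self, factor: f64) {
        self.factor = factor;
    }
}

/// A toy pipeline holding named, boxed steps.
#[derive(Default)]
pub struct MiniPipeline {
    steps: Vec<(String, Box<dyn Step>)>,
}

impl MiniPipeline {
    pub fn new() -> Self {
        Self::default()
    }

    /// Builder-style step registration, mirroring the shape used in the post.
    pub fn add_step(mut self, name: &str, step: impl Step + 'static) -> Self {
        let boxed: Box<dyn Step> = Box::new(step);
        self.steps.push((name.to_string(), boxed));
        self
    }

    /// The explicit `+ 'static` matches what `Box<dyn Step>` actually stores,
    /// so the `Option` can be returned without any lifetime mismatch.
    pub fn get_step_mut(&mut self, name: &str) -> Option<&mut (dyn Step + 'static)> {
        self.steps
            .iter_mut()
            .find(|(n, _)| n.as_str() == name)
            .map(|(_, step)| &mut **step)
    }
}
```

With that signature, `pipe.get_step_mut("scale")` hands back a plain mutable reference to the step, and callers can call `set_factor` on it with no extra annotations.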
Tips
- Trust density clustering again. With persistence extraction fixed, `HDBSCAN::new().min_cluster_size(...).fit_predict(...)` is the go-to for data where you do not know the cluster count up front. Tune `min_cluster_size` to control how aggressively small groups are treated as noise instead of being kept as clusters.
- Make graph clustering reproducible. Set `random_seed` on `SpectralGraphConfig` (and other spectral/graph clustering configs) so reruns and CI produce identical labels.
- Go out-of-core with streaming estimators. Use `StreamingStandardScaler` and `StreamingSimpleImputer` when the data does not fit in memory. Now that their defaults are derived, the constructed state matches what you declared.
- Reach into fitted pipelines. `Pipeline::get_step_mut("name")` lets you adjust a step after `fit` without rebuilding the whole pipeline.
- Pin to the SciRS2 0.4.2 line. 0.1.1 tracks scirs2-* 0.4.2 (oxicode 0.2, oxifft 0.3.0); pin to it so your numerical backbone matches what these tests were validated against.
- Enable only the feature flags you need. Keep build times and binary size down by turning on just the estimator and backend features your project actually uses.
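The reproducibility tip comes down to one pattern: every random choice flows from a seed stored in the config. Below is a minimal sketch of that pattern using only the standard library. Only the `random_seed` field name comes from this post; `GraphConfig`, `SplitMix64`, and `initial_centers` are hypothetical, and the generator is the well-known SplitMix64, used here just to avoid pulling in a dependency.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical config; only the `random_seed` field name comes from the post.
#[derive(Debug, Default, Clone)]
pub struct GraphConfig {
    pub n_clusters: usize,
    pub random_seed: Option<u64>,
}

/// SplitMix64: a tiny, well-known generator, good enough for a demo.
pub struct SplitMix64(u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Pick `n_clusters` starting indices in `0..n_points` (repeats are possible).
/// With `random_seed` set, the result is identical on every run; without it,
/// the seed falls back to the clock and the result varies.
pub fn initial_centers(cfg: &GraphConfig, n_points: usize) -> Vec<usize> {
    if n_points == 0 {
        return Vec::new();
    }
    let seed = cfg.random_seed.unwrap_or_else(|| {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map_or(0, |d| d.as_nanos() as u64)
    });
    let mut rng = SplitMix64(seed);
    (0..cfg.n_clusters)
        .map(|_| (rng.next_u64() % n_points as u64) as usize)
        .collect()
}
```

Calling `initial_centers` twice with `random_seed: Some(42)` returns the same indices both times, which is the property you want in CI. Leaving the seed as `None` is the case where reruns can disagree.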
A maturing foundation
As of 2026-04-27, sklears sits squarely inside the COOLJAPAN ecosystem: built on SciRS2 0.4.2 with OxiBLAS for linear algebra and Oxicode for serialization, drawing on NumRS2 and PandRS for array and dataframe work. sklears covers classical machine learning, alongside TenfloweRS and TrustformeRS for deep learning — one pure-Rust stack from data wrangling through models. 0.1.1 is the kind of quiet release that makes the rest of that stack worth standing on.
Repository: https://github.com/cool-japan/sklears
Star the repo if a pure-Rust, no-GIL scikit-learn is something you want to build on — every star helps the ecosystem grow. Thanks for reading, and happy clustering.
— KitaSan at COOLJAPAN OÜ April 27, 2026