The “patch” that quietly turned AmateRS into a horizontally-sharded, fully-observable distributed database.
Today we released AmateRS 0.2.2 — a patch on paper, but the largest real changelog of the 0.2.x line: horizontal sharding is now live in the public API, distributed traces flow across nodes via OpenTelemetry, and the write path can ride a kernel-bypass io_uring WAL on Linux.
No C. No Fortran. No plaintext leaving your control. While the FHE incumbents lean on TFHE-rs, Microsoft SEAL, and OpenFHE — and the database incumbents still assume a server that can read your data — AmateRS keeps computing in the dark, inside a single static binary, 100% Pure Rust, Apache-2.0. Like Amaterasu retreating into the heavenly rock cave while her light still shines, your data stays inside its cryptographic shell while the queries keep running. Serialization rides oxicode (no bincode), and the new typed cluster log uses postcard for compact, no_std-friendly ClusterCommand encoding on the Raft path.
Why AmateRS 0.2.2 is a game changer
This release crosses the line from “single-node FHE database with a consensus foundation” to “distributed FHE database you can actually shard.”
- Horizontal sharding is now in the public API. shard.rs and partitioner.rs (~1,805 lines that previously compiled but were never declared in lib.rs) are now public, bringing consistent-hashing, range, and hash partitioning, plus a QueryRouter to fan requests out and a k-way ResultMerger to fan them back in.
- A placement scheduler that thinks for itself. A stateless PlacementCoordinator turns a ShardRegistry snapshot into a deterministic PlacementPlan (detecting hot shards to split, cold adjacent shards to merge, and imbalance to rebalance), and a background PlacementScheduler proposes those PlacementActions as Raft log entries from the leader.
- End-to-end distributed tracing. With the telemetry feature on, AmateRS exports OTLP over gRPC and propagates W3C TraceContext through gRPC metadata, so a single trace follows a request across nodes.
- A kernel-bypass io_uring WAL. On Linux, the io-uring feature swaps in UringWalWriter for a high-throughput write-ahead-log path on a dedicated tokio_uring runtime.
- An FHE circuit cache. Repeated identical predicates no longer recompile: a blake3-keyed LRU CircuitCache short-circuits compilation at the FILTER and UPDATE sites.
- Constant-time API-key auth. Raw-key validation is now a constant-time scan, closing a timing side-channel oracle.
All of this lands green: 2,224 tests run, 2,224 passed, 29 skipped, 0 failed across the workspace — including 10 chaos scenarios in the cluster crate and 116 pytest cases in the Python SDK.
Technical Deep Dive
(a) Sharding & placement (Ukehi)
The sharding machinery was already written — it just wasn’t switched on. In 0.2.2, shard.rs and partitioner.rs become part of the public surface, exposing consistent-hashing / range / hash partitioning, the QueryRouter, and a k-way ResultMerger. A ShardRegistry tracks the topology, and the shard lifecycle is modeled explicitly as ShardSplit, ShardMerge, and ShardTransfer.
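To make the consistent-hashing option concrete, here is a minimal sketch of a hash ring with virtual nodes. It is an illustration only: HashRing, add_shard, remove_shard, and shard_for are hypothetical names, not the AmateRS partitioner API, and it uses the standard library's DefaultHasher where a real partitioner would likely pick a hash with a stable, documented output.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical ring type for illustration; not the AmateRS partitioner API.
pub struct HashRing {
    /// Ring position -> shard id. Each shard owns `vnodes` positions.
    ring: BTreeMap<u64, u32>,
    vnodes: u32,
}

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

impl HashRing {
    pub fn new(vnodes: u32) -> Self {
        Self { ring: BTreeMap::new(), vnodes }
    }

    /// Place `vnodes` virtual nodes for this shard on the ring.
    pub fn add_shard(&mut self, shard: u32) {
        for v in 0..self.vnodes {
            self.ring.insert(hash_of(&(shard, v)), shard);
        }
    }

    /// Drop every virtual node owned by this shard.
    pub fn remove_shard(&mut self, shard: u32) {
        self.ring.retain(|_, owner| *owner != shard);
    }

    /// Owner is the first virtual node clockwise from the key's position,
    /// wrapping to the start of the ring. Returns None on an empty ring.
    pub fn shard_for(&self, key: &[u8]) -> Option<u32> {
        let pos = hash_of(&key);
        self.ring
            .range(pos..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, shard)| *shard)
    }
}
```

The property that matters for sharding is that removing a shard only remaps the keys that shard owned; every other key keeps its owner, which is what keeps rebalancing traffic small.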
Driving that lifecycle is placement.rs: a stateless PlacementCoordinator that consumes a ShardRegistry snapshot and emits a deterministic PlacementPlan — split detection for hot shards (is_hot), merge detection for cold adjacent shards on the same node, and imbalance detection that proposes rebalance transfers. It is a pure function: no I/O, fully testable, fully reproducible.
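The "pure function" shape described above can be sketched as follows. Everything here is a hypothetical stand-in (ShardStat, Action, plan_placement, and the hot/cold thresholds are assumptions, not the PlacementCoordinator API): the point is only that a snapshot goes in, a plan comes out, and the same input always yields the same plan.

```rust
/// Hypothetical per-shard snapshot entry; shards are assumed to be listed
/// in key order, so neighbours in the slice are adjacent key ranges.
#[derive(Debug, Clone, PartialEq)]
pub struct ShardStat {
    pub id: u32,
    pub node: u32,
    pub ops_per_sec: u64,
}

#[derive(Debug, PartialEq)]
pub enum Action {
    Split { shard: u32 },
    Merge { left: u32, right: u32 },
}

/// Pure planning step: no I/O, no clock, no randomness.
/// Assumes `hot` is greater than `cold`, so no shard is both split and merged.
pub fn plan_placement(shards: &[ShardStat], hot: u64, cold: u64) -> Vec<Action> {
    let mut actions = Vec::new();

    // Hot shards are proposed for a split.
    for s in shards.iter().filter(|s| s.ops_per_sec >= hot) {
        actions.push(Action::Split { shard: s.id });
    }

    // Cold neighbours on the same node are proposed for a merge.
    let mut i = 0;
    while i + 1 < shards.len() {
        let (a, b) = (&shards[i], &shards[i + 1]);
        if a.node == b.node && a.ops_per_sec <= cold && b.ops_per_sec <= cold {
            actions.push(Action::Merge { left: a.id, right: b.id });
            i += 2; // a shard takes part in at most one merge per plan
        } else {
            i += 1;
        }
    }
    actions
}
```

Because the function takes only a slice and two thresholds, it can be exercised with plain unit or property tests, with no cluster required.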
The active half lives in placement_scheduler.rs. The background async PlacementScheduler runs on the Raft leader, takes the plan’s PlacementActions, and proposes them as ClusterCommand entries through Raft. Its PlacementSchedulerHandle stops the loop on drop, and you wire it in with attach_placement_scheduler().
That cluster log is now properly typed. cluster_command.rs defines ClusterCommand with 7 variants — DataPut, DataDelete, PlaceSplit, PlaceMerge, PlaceTransfer, MembershipAdd, MembershipRemove — serialized with postcard (compact, no_std-compatible), replacing the previous raw byte encoding. And because shard transfers can move a lot of state, large snapshots now stream in configurable chunks (snapshot_chunk_threshold_bytes, snapshot_chunk_size_bytes) via SnapshotStreamer / SnapshotReceiver, with per-follower streamers tracked and auto-cleaned; small snapshots still ship single-shot.
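The threshold-then-chunk decision can be sketched in a few lines. This is an assumption-laden illustration, not the SnapshotStreamer / SnapshotReceiver code: ChunkConfig, Chunk, plan_chunks, and reassemble are hypothetical names, and the two config fields merely mirror the roles of snapshot_chunk_threshold_bytes and snapshot_chunk_size_bytes.

```rust
/// Hypothetical config mirroring the two knobs named above.
pub struct ChunkConfig {
    pub threshold_bytes: usize,
    pub chunk_size_bytes: usize,
}

#[derive(Debug, PartialEq)]
pub struct Chunk<'a> {
    pub index: usize,
    pub is_last: bool,
    pub data: &'a [u8],
}

/// Small snapshots ship as a single chunk; larger ones are cut into
/// `chunk_size_bytes` pieces, with the final piece flagged.
pub fn plan_chunks<'a>(snapshot: &'a [u8], cfg: &ChunkConfig) -> Vec<Chunk<'a>> {
    if snapshot.len() <= cfg.threshold_bytes || cfg.chunk_size_bytes == 0 {
        return vec![Chunk { index: 0, is_last: true, data: snapshot }];
    }
    let total = snapshot.chunks(cfg.chunk_size_bytes).count();
    snapshot
        .chunks(cfg.chunk_size_bytes)
        .enumerate()
        .map(|(index, data)| Chunk { index, is_last: index + 1 == total, data })
        .collect()
}

/// Receiver side: accept only an in-order, complete sequence.
pub fn reassemble(chunks: &[Chunk<'_>]) -> Option<Vec<u8>> {
    if chunks.is_empty() {
        return None;
    }
    let mut out = Vec::new();
    for (i, c) in chunks.iter().enumerate() {
        let expect_last = i + 1 == chunks.len();
        if c.index != i || c.is_last != expect_last {
            return None;
        }
        out.extend_from_slice(c.data);
    }
    Some(out)
}
```

Keeping the single-shot path for small snapshots avoids paying chunk bookkeeping on the common case, which matches the behavior described above.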
(b) Observability: OpenTelemetry
Tracing is now first-class. Behind the telemetry feature in amaters-core, telemetry.rs gives you TelemetryConfig and a TelemetryGuard: an OTLP gRPC exporter (opentelemetry-otlp), a batch SdkTracerProvider, and tracing-subscriber integration. The guard calls shutdown() on drop so in-flight spans are flushed cleanly.
To make traces span more than one node, amaters-net adds W3C TraceContext propagation for gRPC (otel_propagator.rs, also behind telemetry). TraceparentExtractor reads traceparent / tracestate from incoming gRPC metadata, inject_trace_context writes them onto outgoing calls, and TraceContextPropagatorLayer — a Tower Layer — wires it into the stack so you get end-to-end distributed traces across the cluster.
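For readers who have not looked inside a traceparent header, the sketch below parses and re-emits the W3C Trace Context version 00 format (version, 16-byte trace id, 8-byte parent id, flags, all lowercase hex). It is a standalone illustration with hypothetical names (TraceParent, parse_traceparent, format_traceparent), not the TraceparentExtractor implementation, and it rejects any version other than 00 for simplicity.

```rust
/// Parsed form of a version-00 `traceparent` header (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TraceParent {
    pub trace_id: [u8; 16],
    pub parent_id: [u8; 8],
    pub sampled: bool,
}

fn hex_val(c: u8) -> Option<u8> {
    match c {
        b'0'..=b'9' => Some(c - b'0'),
        b'a'..=b'f' => Some(c - b'a' + 10), // the spec requires lowercase hex
        _ => None,
    }
}

fn decode_hex<const N: usize>(s: &str) -> Option<[u8; N]> {
    let bytes = s.as_bytes();
    if bytes.len() != 2 * N {
        return None;
    }
    let mut out = [0u8; N];
    for i in 0..N {
        out[i] = (hex_val(bytes[2 * i])? << 4) | hex_val(bytes[2 * i + 1])?;
    }
    Some(out)
}

fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

pub fn parse_traceparent(header: &str) -> Option<TraceParent> {
    let mut parts = header.trim().split('-');
    let version = parts.next()?;
    let trace_id: [u8; 16] = decode_hex(parts.next()?)?;
    let parent_id: [u8; 8] = decode_hex(parts.next()?)?;
    let flags: [u8; 1] = decode_hex(parts.next()?)?;
    if version != "00" || parts.next().is_some() {
        return None;
    }
    // All-zero ids are invalid per the spec.
    if trace_id == [0u8; 16] || parent_id == [0u8; 8] {
        return None;
    }
    Some(TraceParent { trace_id, parent_id, sampled: (flags[0] & 0x01) == 1 })
}

pub fn format_traceparent(tp: &TraceParent) -> String {
    format!("00-{}-{}-{:02x}", to_hex(&tp.trace_id), to_hex(&tp.parent_id), tp.sampled as u8)
}
```

On an outgoing call, a propagator would typically keep the trace id, substitute the current span's id as the parent id, and write the result back into the gRPC metadata.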
(c) Storage & compute perf
On Linux, the WAL can now bypass the kernel buffer cache. wal_uring.rs (feature io-uring in amaters-core) introduces UringWalWriter, which runs tokio_uring I/O on a dedicated OS thread with its own tokio_uring::start runtime and bridges to the async caller over an mpsc channel. UringWalConfig exposes ring_size, batch_size, direct_io, and channel_capacity, and the handle is Send + Sync + Clone so it can be shared without a mutex. Crucially, tokio-uring is now a cfg(target_os = "linux") conditional dependency, so macOS and Windows builds compile fine.
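The "dedicated thread plus channel" bridge is the part of this design that generalizes, so here is a rough sketch of that pattern using only the standard library. It deliberately swaps tokio_uring for a plain OS thread and the log file for an in-memory Vec; WalHandle, spawn_writer, and the message type are hypothetical, and only channel_capacity and batch_size echo the UringWalConfig field names.

```rust
use std::sync::mpsc::{self, Receiver, Sender, SyncSender};
use std::thread::{self, JoinHandle};

enum Msg {
    Append { record: Vec<u8>, ack: Sender<u64> },
    Shutdown,
}

/// Cheap-to-clone handle; all clones feed the same writer thread.
#[derive(Clone)]
pub struct WalHandle {
    tx: SyncSender<Msg>,
}

impl WalHandle {
    /// Blocks until the writer has appended the record; returns its offset,
    /// or None if the writer has already stopped.
    pub fn append(&self, record: &[u8]) -> Option<u64> {
        let (ack, done) = mpsc::channel();
        self.tx.send(Msg::Append { record: record.to_vec(), ack }).ok()?;
        done.recv().ok()
    }

    pub fn shutdown(&self) {
        let _ = self.tx.send(Msg::Shutdown);
    }
}

/// Spawns the writer thread; joining it yields the final log contents.
pub fn spawn_writer(channel_capacity: usize, batch_size: usize) -> (WalHandle, JoinHandle<Vec<u8>>) {
    let (tx, rx) = mpsc::sync_channel::<Msg>(channel_capacity);
    let join = thread::spawn(move || run_writer(rx, batch_size.max(1)));
    (WalHandle { tx }, join)
}

fn run_writer(rx: Receiver<Msg>, batch_size: usize) -> Vec<u8> {
    let mut log = Vec::new(); // stand-in for the WAL file
    // Block for one message, then drain whatever else is queued, up to batch_size.
    while let Ok(first) = rx.recv() {
        let mut batch = vec![first];
        while batch.len() < batch_size {
            match rx.try_recv() {
                Ok(msg) => batch.push(msg),
                Err(_) => break,
            }
        }
        let mut acks = Vec::new();
        let mut stop = false;
        for msg in batch {
            match msg {
                Msg::Append { record, ack } => {
                    let offset = log.len() as u64;
                    // Length-prefixed framing: 4-byte little-endian length, then payload.
                    log.extend_from_slice(&(record.len() as u32).to_le_bytes());
                    log.extend_from_slice(&record);
                    acks.push((ack, offset));
                }
                Msg::Shutdown => stop = true,
            }
        }
        // A real writer would submit and sync the whole batch here, once,
        // before acknowledging any caller.
        for (ack, offset) in acks {
            let _ = ack.send(offset);
        }
        if stop {
            break;
        }
    }
    log
}
```

Acknowledging only after the batch is durable is what lets many callers share one sync; the bounded channel provides backpressure when the writer falls behind.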
Secondary indexes also got smarter. An IndexExtractor trait plus an IndexedField type derive secondary-index entries straight from (Key, CipherBlob) — without parsing the ciphertext — and IndexManager::apply_extracted applies batched, diff-based updates. Both LsmTreeStorage and MemoryStorage auto-maintain those indexes via builder methods with_index_manager / with_index_extractor / register_index, so put, delete, and atomic_update now keep indexes consistent transparently under the update_lock — with zero overhead when no manager is attached.
For compute, the new FHE CircuitCache (circuit_cache.rs in amaters-net) is a thread-safe LRU (HashMap + VecDeque + parking_lot::Mutex, clonable Arc) for compiled FHE circuits, keyed by the blake3 hash of the predicate, defaulting to 256 entries. The FILTER and UPDATE predicate sites in server.rs call circuit_cache.get_or_compile(), so repeated same-predicate requests skip PredicateCompiler::compile entirely.
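The HashMap + VecDeque + Mutex layout described above can be sketched like this. It is not the CircuitCache source: LruCache is a hypothetical generic stand-in, it uses std::sync::Mutex and a 64-bit DefaultHasher key instead of parking_lot and blake3, and it compiles while holding the lock, which is the simplest correct choice rather than necessarily the fastest.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, VecDeque};
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex};

struct Inner<V> {
    map: HashMap<u64, Arc<V>>,
    order: VecDeque<u64>, // front = least recently used
    capacity: usize,
}

/// Hypothetical clonable LRU; clones share the same underlying cache.
pub struct LruCache<V> {
    inner: Arc<Mutex<Inner<V>>>,
}

impl<V> Clone for LruCache<V> {
    fn clone(&self) -> Self {
        Self { inner: Arc::clone(&self.inner) }
    }
}

// Stand-in key function; a cryptographic hash makes collisions a non-issue.
fn key_of(predicate: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    predicate.hash(&mut hasher);
    hasher.finish()
}

impl<V> LruCache<V> {
    pub fn new(capacity: usize) -> Self {
        let inner = Inner { map: HashMap::new(), order: VecDeque::new(), capacity: capacity.max(1) };
        Self { inner: Arc::new(Mutex::new(inner)) }
    }

    /// Returns the cached value for `predicate`, calling `compile` only on a miss.
    pub fn get_or_compile<F: FnOnce() -> V>(&self, predicate: &str, compile: F) -> Arc<V> {
        let key = key_of(predicate);
        let mut guard = self.inner.lock().unwrap();
        let hit = guard.map.get(&key).cloned();
        if let Some(value) = hit {
            // Move the key to the most-recently-used end.
            guard.order.retain(|k| *k != key);
            guard.order.push_back(key);
            return value;
        }
        let value = Arc::new(compile());
        if guard.map.len() >= guard.capacity {
            if let Some(oldest) = guard.order.pop_front() {
                guard.map.remove(&oldest);
            }
        }
        guard.map.insert(key, Arc::clone(&value));
        guard.order.push_back(key);
        value
    }

    pub fn len(&self) -> usize {
        self.inner.lock().unwrap().map.len()
    }
}
```

The VecDeque makes a hit O(n) in the number of entries, which is a reasonable trade at a few hundred entries; a larger cache would usually want a linked structure instead.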
(d) Operability & safety
API-key validation no longer leaks timing. In auth.rs and middleware.rs, the old HashMap lookup on raw key strings is replaced with a constant-time linear scan via the constant_time_eq crate, killing the timing side-channel oracle on stored key characters. (The hashed path, hash_keys=true, was never affected.)
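To show what "constant-time linear scan" means in practice, here is a small hand-rolled sketch. It is for illustration only: ct_eq and any_key_matches are hypothetical helpers, and production code should rely on a vetted crate (the release uses constant_time_eq) because compilers are free to optimize naive code in ways that reintroduce timing differences.

```rust
/// Compare two byte strings without stopping at the first mismatch.
/// The length check is not constant-time; key length is not treated as secret here.
pub fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y; // accumulate differences instead of branching on them
    }
    // black_box discourages the optimizer from turning the loop into an early exit.
    std::hint::black_box(diff) == 0
}

/// Scan every stored key, with no early return, so the time taken does not
/// reveal which entry (if any) matched.
pub fn any_key_matches(stored: &[Vec<u8>], presented: &[u8]) -> bool {
    let mut matched = false;
    for key in stored {
        matched |= ct_eq(key, presented);
    }
    matched
}
```

A HashMap lookup on the raw string, by contrast, can compare byte by byte and bail out on the first difference, which is the behavior an attacker can measure.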
Operations get an alerting brain: alert_rules.rs adds a RuleEngine that evaluates AlertRules against AlertEvents, classifies them as AlertSeverity::Info / Warning / Critical, dedups within a window (rule name + dedup key), and fans FiredAlerts out to AlertSinks. An AlertSink trait plus a LogSink (emitting via tracing) ship in the box; each FiredAlert carries its rule_name, severity, event, and dedup_key.
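The dedup step is the least obvious part of that pipeline, so here is a minimal sketch of window-based suppression keyed by (rule name, dedup key). Deduper and should_fire are hypothetical names, and timestamps are passed in as plain milliseconds to keep the example deterministic; the actual RuleEngine may track time and windows differently.

```rust
use std::collections::HashMap;

/// Hypothetical dedup helper: remembers when each (rule, dedup key) last fired.
pub struct Deduper {
    window_ms: u64,
    last_fired: HashMap<(String, String), u64>,
}

impl Deduper {
    pub fn new(window_ms: u64) -> Self {
        Self { window_ms, last_fired: HashMap::new() }
    }

    /// Returns true if the alert should be forwarded to the sinks, false if an
    /// alert with the same rule and dedup key fired within the window.
    /// Suppressed events do not extend the window.
    pub fn should_fire(&mut self, rule: &str, dedup_key: &str, now_ms: u64) -> bool {
        let key = (rule.to_owned(), dedup_key.to_owned());
        let last = self.last_fired.get(&key).copied();
        match last {
            Some(t) if now_ms.saturating_sub(t) < self.window_ms => false,
            _ => {
                self.last_fired.insert(key, now_ms);
                true
            }
        }
    }
}
```

Keying on both fields means one noisy node cannot silence the same rule firing for a different node.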
Schema evolution gets a framework, too. In amaters-server, migration.rs adds a MigrationRegistry where you register version→version Migration steps; plan(from, to) runs a BFS shortest-path search to produce a MigrationPlan of zero-copy step references. The Migration trait exposes from_version / to_version / description / migrate(&mut MigrationContext), and MigrationContext wraps a serde_json::Value document (get / set / remove / into_doc) threaded through the steps so they compose without copying.
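The BFS planning step can be illustrated on bare version numbers. This sketch is an assumption, not the MigrationRegistry code: plan_migration is a hypothetical free function, versions are plain u32 values, and each registered step is reduced to a (from, to) pair.

```rust
use std::collections::{HashMap, VecDeque};

/// Shortest chain of registered steps leading from `from` to `to`,
/// or None if no chain exists. Each step is a (from_version, to_version) pair.
pub fn plan_migration(steps: &[(u32, u32)], from: u32, to: u32) -> Option<Vec<(u32, u32)>> {
    if from == to {
        return Some(Vec::new());
    }
    // Adjacency list: version -> versions reachable in one step.
    let mut adj: HashMap<u32, Vec<u32>> = HashMap::new();
    for &(a, b) in steps {
        adj.entry(a).or_default().push(b);
    }

    // Breadth-first search, recording the predecessor of each version reached.
    let mut prev: HashMap<u32, u32> = HashMap::new();
    let mut queue = VecDeque::from([from]);
    while let Some(v) = queue.pop_front() {
        if v == to {
            break;
        }
        if let Some(nexts) = adj.get(&v) {
            for &next in nexts {
                if next != from && !prev.contains_key(&next) {
                    prev.insert(next, v);
                    queue.push_back(next);
                }
            }
        }
    }

    // Walk predecessors back from the target to rebuild the path.
    let mut path = Vec::new();
    let mut cur = to;
    while cur != from {
        let p = *prev.get(&cur)?;
        path.push((p, cur));
        cur = p;
    }
    path.reverse();
    Some(path)
}
```

Because BFS explores by hop count, a registered direct 1→3 step is preferred over 1→2→3 whenever both exist.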
And for day-to-day debugging, the CLI gains an explain command in the REPL: it prints the QueryPlanner logical and physical plans locally, with no server round-trip, for get / set / delete / range, using amaters_core::compute::QueryPlanner to render a tree-formatted LogicalPlan + PhysicalPlan with cost estimates.
Getting Started
Add the crate (or pin the SDK explicitly) and start a server:
# Library / facade meta-crate
cargo add amaters
# or pin the Rust SDK directly:
# amaters-sdk-rust = "0.2.2"
# Start a node
cargo run --bin amaters-server -- start --data-dir ./data
Then poke at the new local query debugger inside the REPL:
cargo run --bin amaters-cli -- repl
# inside the REPL:
explain get my_key # prints the logical + physical query plan locally (no server round-trip)
To turn on distributed tracing, build with the telemetry feature and point the OTLP exporter at your collector; on Linux, enable the io-uring feature for the kernel-bypass WAL path:
cargo run --bin amaters-server --features "telemetry,io-uring" -- start --data-dir ./data
The Python SDK is fully async now, including pool stats. amaters.PoolStats (a #[pyclass], PyPoolStats under the hood) exposes .total_connections / .active_connections / .idle_connections / .max_connections, and both pool_stats() and close() are awaitable.
What’s New in 0.2.2
Sharding & Placement
- Horizontal sharding activated: shard.rs + partitioner.rs made public, with consistent-hashing / range / hash partitioning, QueryRouter, k-way ResultMerger, ShardRegistry, and the ShardSplit / ShardMerge / ShardTransfer lifecycle.
- Stateless PlacementCoordinator → deterministic PlacementPlan (hot-shard split, cold-adjacent merge, imbalance rebalance).
- Background PlacementScheduler runs on the Raft leader and proposes PlacementActions as ClusterCommand log entries; attach via attach_placement_scheduler().
- Typed ClusterCommand Raft log (7 variants: DataPut, DataDelete, PlaceSplit, PlaceMerge, PlaceTransfer, MembershipAdd, MembershipRemove), now postcard-encoded.
- Chunked snapshot streaming via SnapshotStreamer / SnapshotReceiver for large state transfers.
Observability — OpenTelemetry
- TelemetryConfig + TelemetryGuard (feature telemetry): OTLP gRPC exporter, batch SdkTracerProvider, flush-on-drop.
- W3C TraceContext propagation for gRPC (TraceparentExtractor, inject_trace_context, TraceContextPropagatorLayer) for end-to-end cross-node traces.
Storage — io_uring WAL + index automation
- UringWalWriter (feature io-uring, Linux): tokio_uring kernel-bypass I/O on a dedicated runtime; Send + Sync + Clone handle.
- tokio-uring is now a cfg(target_os = "linux") conditional dep so macOS/Windows compile.
- Secondary-index automation: IndexExtractor / IndexedField / IndexManager::apply_extracted, maintained transparently under update_lock with zero overhead when unattached.
Network — FHE CircuitCache
- CircuitCache: thread-safe LRU (default 256 entries) keyed by blake3 hash of the predicate; FILTER + UPDATE sites call get_or_compile() to skip recompilation.
Security
- Constant-time API-key validation via constant_time_eq, which kills the timing side-channel oracle.
- pyo3 0.28.3 → 0.29 fixes RUSTSEC-2026-0176 (OOB read on nth / nth_back for PyList / PyTuple iterators) and RUSTSEC-2026-0177 (missing Sync bound on PyCFunction::new_closure) in amaters-sdk-python.
Cluster — alert RuleEngine
- RuleEngine evaluates AlertRules → AlertSeverity, dedups within a window, fans FiredAlerts out to AlertSinks (AlertSink trait + LogSink).
Server — migration framework
- MigrationRegistry with BFS shortest-path MigrationPlan over registered Migration steps; MigrationContext over serde_json::Value.
Python SDK — async
- Fully async pool_stats + close (removes the last block_on); amaters.PoolStats exported; 116 pytest cases across 5 files (10 Hypothesis property tests).
CLI — explain
- explain <command> REPL debugger renders the local QueryPlanner logical + physical plan with cost estimates.
Testing
- 2,224 tests passed / 29 skipped / 0 failed; 10 chaos scenarios (cluster), 5 #[ignore] load tests (server), 15 proptest cases (5 LSM, 5 shard, 5 placement), 116 pytest.
- CircuitCache benchmarks added to the net bench suite; 9 cluster Criterion bench groups.
Dep bumps
- tokio 1.50 → 1.52, dashmap 6.1 → 6.2, rayon 1.11 → 1.12, serial_test 3.2 → 3.5, similar 3.1.0 → 3.1.1.
Fixed
- KeyRange::midpoint() off-by-one for unequal-length keys: pad to max length, big-endian averaging with carry; now start ≤ mid < end.
- Infinite loop in detect_rebalance() when n_shards < n_nodes: early exit + bound of n_shards + 1 iterations; placement proptests now run in under 1 ms.
- Large-file refactors under the 2000-line policy: wal.rs, server.rs (1941→1195), tls.rs (1954→1183), health.rs (1973→1196), optimizer.rs (1977→1174); node_tests split.
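For the curious, the midpoint fix in the first item can be sketched as follows: treat both keys as big-endian integers right-padded with zero bytes to the same length, add them with carry, then halve from the most significant byte down. This is an independent illustration under those stated assumptions, not the KeyRange::midpoint() source, and the free function midpoint is a hypothetical name.

```rust
/// Midpoint of two byte-string keys, treating each as a big-endian integer
/// right-padded with zero bytes to the longer key's length.
/// Assumes start < end after padding, which gives start <= mid < end.
pub fn midpoint(start: &[u8], end: &[u8]) -> Vec<u8> {
    let n = start.len().max(end.len());
    // Missing trailing bytes read as zero (this is the padding step).
    let digit = |key: &[u8], i: usize| key.get(i).copied().unwrap_or(0) as u16;

    // Add the two keys, least significant byte first, tracking the carry.
    let mut sum = vec![0u8; n];
    let mut carry = 0u16;
    for i in (0..n).rev() {
        let s = digit(start, i) + digit(end, i) + carry;
        sum[i] = (s & 0xff) as u8;
        carry = s >> 8;
    }

    // Halve the (n+1)-digit sum, most significant first; the final carry
    // from the addition is the leading digit.
    let mut rem = carry;
    let mut mid = vec![0u8; n];
    for i in 0..n {
        let cur = (rem << 8) | sum[i] as u16;
        mid[i] = (cur >> 1) as u8;
        rem = cur & 1;
    }
    mid
}
```

For example, the keys "a" and "ab" pad to [0x61, 0x00] and [0x61, 0x62], and their midpoint is [0x61, 0x31], which sorts between the two.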
Tips
- Turn on cross-node tracing. Build with the telemetry feature and point the OTLP exporter at your collector; the TraceContextPropagatorLayer carries traceparent across gRPC hops so one trace spans the whole cluster.
- On Linux, flip on io-uring. The io-uring feature routes the WAL through UringWalWriter’s kernel-bypass path; tune it via UringWalConfig (ring_size, batch_size, direct_io, channel_capacity).
- Let hot shards split themselves. Attach the scheduler to your Raft node with attach_placement_scheduler(), and the leader will propose splits/merges/transfers as ClusterCommand entries automatically.
- Inspect before you run. Use explain <cmd> in the REPL to see a query’s logical + physical plan and cost estimates locally, with no server round-trip required.
- Lean on the circuit cache. The FHE CircuitCache (default 256 entries) makes repeated identical predicates skip recompilation; it’s already wired into the FILTER and UPDATE sites.
- Secondary indexes are opt-in and free when unused. Enable them with with_index_manager / register_index; with no manager attached there’s zero overhead, and once attached put / delete / atomic_update keep them consistent for you.
- Stress it. The server load tests are #[ignore]d by default; run them with --include-ignored to drive the 1M put/get, 320k concurrent writer/reader, and mixed 80/20 scenarios.
This is the foundation
AmateRS sits inside the COOLJAPAN ecosystem (June 2026, full ecosystem), and 0.2.2 leans on it concretely: oxicode still handles serialization (no bincode in the default path), OxiARC provides LZ4 + DEFLATE compression, and postcard gives the new typed cluster log its compact, no_std-friendly encoding on the Raft replication path. The four mythic layers — Iwato (storage), Yata (FHE compute), Ukehi (consensus), Musubi (network) — now carry a sharding plane and an observability plane on top, all in Pure Rust.
Repository: https://github.com/cool-japan/amaters
Star the repo if you believe a database should be able to compute on your data without ever reading it. Sharding is live, traces are flowing, and the dark just got a lot more scalable.
— KitaSan at COOLJAPAN OÜ June 19, 2026