COOLJAPAN

Posts tagged #performance

1 post

Apr 27, 2026 · 9 min

ToRSh 0.1.2 Released — Real AVX2/NEON SIMD and a Zero-Copy Tensor Memory Pool

ToRSh is a pure-Rust, PyTorch-compatible deep-learning framework with native tensor sharding. Version 0.1.2 lands real AVX2/NEON SIMD for f32 ops and activations, a true zero-copy buffer pool (a 100% reduction in heap blocks allocated on hot loops), and SIMD and parallel execution enabled by default.

release · torsh · deep-learning