
#pytorch #matrix-decomposition #machine-learning #fft

bin+lib rustorch

Production-ready PyTorch-compatible deep learning library in Rust with special mathematical functions (gamma, Bessel, error functions), statistical distributions, Fourier transforms (FFT/RFFT), matrix decomposition (SVD/QR/LU/eigenvalue), automatic differentiation, neural networks, computer vision transforms, complete GPU acceleration (CUDA/Metal/OpenCL), SIMD optimizations, parallel processing, WebAssembly browser support, comprehensive distributed learning support, and performance validation

62 releases (5 breaking)

new 0.6.28 Oct 1, 2025
0.6.26 Sep 29, 2025

#170 in Machine learning


4,430 downloads per month

MIT/Apache

7.5MB
156K SLoC


RusTorch 🚀

Crates.io Documentation License Tests Build

🌐 多言語ドキュメント / Multilingual Documentation

Language README Jupyter Guide
🇺🇸 English 📖 Main 📓 Jupyter
🇯🇵 日本語 📖 メイン 📓 Jupyter
🇫🇷 Français 📖 Principal 📓 Jupyter
🇮🇹 Italiano 📖 Principale 📓 Jupyter
🇪🇸 Español 📖 Principal 📓 Jupyter
🇨🇳 中文 📖 主要 📓 Jupyter
🇰🇷 한국어 📖 메인 📓 Jupyter
🇩🇪 Deutsch 📖 Hauptseite 📓 Jupyter
🇷🇺 Русский 📖 Основной 📓 Jupyter
🇵🇹 Português 📖 Principal 📓 Jupyter

A production-ready deep learning library in Rust with PyTorch-like API, GPU acceleration, and enterprise-grade performance
本番環境対応のRust製ディープラーニングライブラリ - PyTorchライクなAPI、GPU加速、エンタープライズグレードパフォーマンス

RusTorch is a fully functional deep learning library that leverages Rust's safety and performance. Phase 8 (now complete) adds advanced tensor utilities for conditional operations, indexing, and statistics. It features comprehensive tensor operations, automatic differentiation, neural network layers, transformer architectures, multi-backend GPU acceleration (CUDA/Metal/OpenCL/CoreML), advanced SIMD optimizations, enterprise-grade memory management, data validation & quality assurance, and comprehensive debug & logging systems.

✨ Features

  • 🔥 Comprehensive Tensor Operations: Math operations, broadcasting, indexing, statistics, and Phase 8 advanced utilities
  • 🤖 Transformer Architecture: Complete transformer implementation with multi-head attention
  • 🧮 Matrix Decomposition: SVD, QR, eigenvalue decomposition with PyTorch compatibility
  • 🧠 Automatic Differentiation: Tape-based computational graph for gradient computation
  • 🚀 Dynamic Execution Engine: JIT compilation and runtime optimization
  • 🏗️ Neural Network Layers: Linear, Conv1d/2d/3d, ConvTranspose, RNN/LSTM/GRU, BatchNorm, Dropout, and more
  • Cross-Platform Optimizations: SIMD (AVX2/SSE/NEON), platform-specific, and hardware-aware optimizations
  • 🎮 GPU Integration: CUDA/Metal/OpenCL/CoreML support with automatic device selection
  • 🌐 WebAssembly Support: Complete browser ML with Neural Network layers, Computer Vision, and real-time inference
  • 🎮 WebGPU Integration: Chrome-optimized GPU acceleration with CPU fallback for cross-browser compatibility
  • 📁 Model Format Support: Safetensors, ONNX inference, PyTorch state dict compatibility
  • Production Ready: 1060 tests passing, unified error handling system with RusTorchError
  • 📐 Enhanced Mathematical Functions: Complete set of mathematical functions (exp, ln, sin, cos, tan, sqrt, abs, pow)
  • 🔧 Advanced Operator Overloads: Full operator support for tensors with scalar operations and in-place assignments
  • 📈 Advanced Optimizers: SGD, Adam, AdamW, RMSprop, AdaGrad with learning rate schedulers
  • 🚀 Phase 2 Optimization Framework: NAdam, RAdam, Adamax, Enhanced L-BFGS with 500%+ performance boost
  • World-Class Performance: Adamax 33,632 steps/sec, RAdam 21,939 steps/sec, NAdam 18,976 steps/sec
  • 🔍 Data Validation & Quality Assurance: Statistical analysis, anomaly detection, consistency checking, real-time monitoring
  • 🐛 Comprehensive Debug & Logging: Structured logging, performance profiling, memory tracking, automated alerts
  • 🎯 Phase 8 Tensor Utilities: Conditional operations (where, masked_select, masked_fill), indexing operations (gather, scatter, index_select), statistical operations (topk, kthvalue), and advanced utilities (unique, histogram)
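
To make the Phase 8 vocabulary concrete, here is a small plain-Rust sketch of what masked_select and topk compute. It deliberately avoids RusTorch's own API (whose exact signatures may differ; see the feature documentation), so the code below is purely illustrative.

fn main() {
    let scores = [0.9_f32, 0.1, 0.7, 0.3];

    // masked_select: keep only the elements where a boolean mask is true.
    let mask = scores.map(|x| x > 0.5);
    let selected: Vec<f32> = scores
        .iter()
        .zip(mask.iter())
        .filter_map(|(&x, &keep)| keep.then_some(x))
        .collect();

    // topk (k = 2): the two largest values together with their original indices.
    let mut order: Vec<usize> = (0..scores.len()).collect();
    order.sort_by(|&a, &b| scores[b].partial_cmp(&scores[a]).unwrap());
    let top2: Vec<(usize, f32)> = order.iter().take(2).map(|&i| (i, scores[i])).collect();

    println!("masked_select -> {:?}", selected); // [0.9, 0.7]
    println!("topk(2)       -> {:?}", top2);     // [(0, 0.9), (2, 0.7)]
}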

For detailed features, see Features Documentation.

🚀 Quick Start

📓 Interactive Jupyter Demos

  • インストール (Install)
curl -sSL https://raw.githubusercontent.com/JunSuzukiJapan/rustorch/main/scripts/install_jupyter.sh | bash
  • 起動 (Launch)
rustorch-jupyter          # Global command / グローバルコマンド

あるいは (OR)

./start_jupyter_quick.sh  # Interactive menu / 対話式メニュー
  • コーディング開始! (Start coding!)

🎉 That's it! Your browser will open with Jupyter ready to use RusTorch!
🎉 これで完了!ブラウザでJupyterが開き、RusTorchを使う準備完了!


🚀 Manual Setup - Or choose specific demo type:

Demo Type Setup Command Description
🦀🐍 Hybrid ./start_jupyter_hybrid.sh Python + Rust dual-kernel environment
🐍 Python ./start_jupyter.sh Standard CPU-based ML demos
WebGPU ./start_jupyter_webgpu.sh Browser GPU acceleration (Chrome)
🦀 Rust Kernel ./scripts/quick_start_rust_kernel.sh Native Rust in Jupyter
🌐 Online Binder No setup needed - run in browser

📚 Detailed Setup: Complete Jupyter Guide | 日本語ガイド

Installation

Add this to your Cargo.toml:

[dependencies]
rustorch = "0.6.28"

# Optional features
[features]
default = ["linalg"]
linalg = ["rustorch/linalg"]           # Linear algebra operations (SVD, QR, eigenvalue)
cuda = ["rustorch/cuda"]
metal = ["rustorch/metal"]
opencl = ["rustorch/opencl"]
coreml = ["rustorch/coreml"]
safetensors = ["rustorch/safetensors"]
onnx = ["rustorch/onnx"]
wasm = ["rustorch/wasm"]                # WebAssembly support for browser ML
webgpu = ["rustorch/webgpu"]            # Chrome-optimized WebGPU acceleration

# To disable linalg features (avoid OpenBLAS/LAPACK dependencies):
rustorch = { version = "0.6.28", default-features = false }

Basic Usage

use rustorch::{tensor, tensor::Tensor};
use rustorch::optim::{SGD, WarmupScheduler, OneCycleLR, AnnealStrategy};
use rustorch::error::RusTorchResult;

fn main() -> RusTorchResult<()> {
    // Create tensors with convenient macro syntax
    let a = tensor!([[1.0, 2.0], [3.0, 4.0]]);
    let b = tensor!([[5.0, 6.0], [7.0, 8.0]]);

    // Basic operations with operator overloads
    let c = &a + &b;  // Element-wise addition
    let d = &a - &b;  // Element-wise subtraction
    let e = &a * &b;  // Element-wise multiplication
    let f = &a / &b;  // Element-wise division

    // Scalar operations
    let g = &a + 10.0;  // Add scalar to all elements
    let h = &a * 2.0;   // Multiply by scalar

    // Mathematical functions
    let exp_result = a.exp();   // Exponential function
    let ln_result = a.ln();     // Natural logarithm
    let sin_result = a.sin();   // Sine function
    let sqrt_result = a.sqrt(); // Square root

    // Matrix operations
    let matmul_result = a.matmul(&b);  // Matrix multiplication

    // Linear algebra operations (requires linalg feature)
    #[cfg(feature = "linalg")]
    {
        let svd_result = a.svd();       // SVD decomposition
        let qr_result = a.qr();         // QR decomposition
        let eig_result = a.eigh();      // Eigenvalue decomposition
    }

    // Advanced optimizers with learning rate scheduling
    let optimizer = SGD::new(0.01);
    let mut scheduler = WarmupScheduler::new(optimizer, 0.1, 5); // Warmup to 0.1 over 5 epochs

    // One-cycle learning rate policy
    let optimizer2 = SGD::new(0.01);
    let mut one_cycle = OneCycleLR::new(optimizer2, 1.0, 100, 0.3, AnnealStrategy::Cos);

    println!("Shape: {:?}", c.shape());
    println!("Result: {:?}", c.as_slice());

    Ok(())
}

WebAssembly Usage

For browser-based ML applications:

import init, * as rustorch from './pkg/rustorch.js';

async function browserML() {
    await init();
    
    // Neural network layers
    const linear = new rustorch.WasmLinear(784, 10, true);
    const conv = new rustorch.WasmConv2d(3, 32, 3, 1, 1, true);
    
    // Enhanced mathematical functions
    const gamma_result = rustorch.WasmSpecial.gamma_batch([1.5, 2.0, 2.5]);
    const bessel_result = rustorch.WasmSpecial.bessel_i_batch(0, [0.5, 1.0, 1.5]);
    
    // Statistical distributions
    const normal_dist = new rustorch.WasmDistributions();
    const samples = normal_dist.normal_sample_batch(100, 0.0, 1.0);
    
    // Optimizers for training
    const sgd = new rustorch.WasmOptimizer();
    sgd.sgd_init(0.01, 0.9); // learning_rate, momentum
    
    // Image processing
    const resized = rustorch.WasmVision.resize(image, 256, 256, 224, 224, 3);
    const normalized = rustorch.WasmVision.normalize(resized, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225], 3);
    
    // Forward pass
    const predictions = conv.forward(normalized, 1, 224, 224);
    console.log('Browser ML predictions:', predictions);
}

WebGPU Acceleration (Chrome Optimized)

For Chrome browsers with WebGPU support:

import init, * as rustorch from './pkg/rustorch.js';

async function webgpuML() {
    await init();
    
    // Initialize WebGPU engine
    const webgpu = new rustorch.WebGPUSimple();
    await webgpu.initialize();
    
    // Check WebGPU support
    const supported = await webgpu.check_webgpu_support();
    if (supported) {
        console.log('🚀 WebGPU acceleration enabled');
        
        // High-performance tensor operations
        const a = [1, 2, 3, 4];
        const b = [5, 6, 7, 8];
        const result = webgpu.tensor_add_cpu(a, b);
        
        // Matrix multiplication with GPU optimization
        const matrix_a = new Array(256 * 256).fill(0).map(() => Math.random());
        const matrix_b = new Array(256 * 256).fill(0).map(() => Math.random());
        const matrix_result = webgpu.matrix_multiply_cpu(matrix_a, matrix_b, 256, 256, 256);
        
        console.log('WebGPU accelerated computation complete');
    } else {
        console.log('⚠️ WebGPU not supported, using CPU fallback');
    }
    
    // Interactive demo interface
    const demo = new rustorch.WebGPUSimpleDemo();
    demo.create_interface(); // Creates browser UI for testing
}

Run examples:

# Basic functionality examples
cargo run --example activation_demo --no-default-features
cargo run --example complex_tensor_demo --no-default-features  
cargo run --example neural_network_demo --no-default-features
cargo run --example autograd_demo --no-default-features

# Machine learning examples
cargo run --example boston_housing_regression --no-default-features
cargo run --example vision_pipeline_demo --no-default-features
cargo run --example embedding_demo --no-default-features

# Linear algebra examples (requires linalg feature)
cargo run --example eigenvalue_demo --features linalg
cargo run --example svd_demo --features linalg
cargo run --example matrix_decomposition_demo --features linalg

# Mathematical functions
cargo run --example special_functions_demo --no-default-features
cargo run --example fft_demo --no-default-features

For more examples, see Getting Started Guide and WebAssembly Guide.

📚 Documentation

  • WebAssembly & Browser ML
  • Production & Operations

📊 Performance

🏆 Phase 2 Performance Revolution - Up to 580% improvement:

Advanced Optimizer Performance (Release Mode)

Optimizer Performance Use Case Status
Adamax 33,632 steps/sec Sparse features, Embeddings FASTEST
RAdam 21,939 steps/sec 🚀 General deep learning, Stability RECOMMENDED
NAdam 18,976 steps/sec NLP, Fine-tuning NLP-OPTIMIZED

Core Operations Performance

Operation Performance Details
SVD Decomposition ~1ms (8x8 matrix) ✅ LAPACK-based
QR Decomposition ~24μs (8x8 matrix) ✅ Fast decomposition
Eigenvalue ~165μs (8x8 matrix) ✅ Symmetric matrices
Complex FFT 10-312μs (8-64 samples) ✅ Cooley-Tukey optimized
Neural Network 1-7s training ✅ Boston housing demo
Activation Functions <1μs ✅ ReLU, Sigmoid, Tanh

Phase 2 Architectural Benefits

  • 🧹 50%+ code reduction through unified GenericAdamOptimizer
  • 🔧 Consistent API across all Adam variants
  • ⚡ Shared optimizations benefiting all optimizers
  • 🛡️ Robust error handling with RusTorchResult
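
The "unified GenericAdamOptimizer" idea can be pictured with a short sketch: a shared generic core owns the moment bookkeeping, while each variant supplies only its own denominator statistic and step rule. The code below illustrates that pattern only; it is not RusTorch's actual implementation, and all names in it are hypothetical.

// Illustrative sketch of a generic Adam core with pluggable variants.
// NOT RusTorch's actual GenericAdamOptimizer; names are hypothetical.
trait AdamVariant {
    // Update the running denominator statistic (second moment or infinity norm).
    fn update_v(&self, v: f32, grad: f32, beta2: f32) -> f32;
    // Turn the bias-corrected first moment and the raw statistic into a step size.
    fn step_size(&self, m_hat: f32, v: f32, t: u32, lr: f32, beta2: f32, eps: f32) -> f32;
}

struct Adam;
impl AdamVariant for Adam {
    fn update_v(&self, v: f32, grad: f32, beta2: f32) -> f32 {
        beta2 * v + (1.0 - beta2) * grad * grad
    }
    fn step_size(&self, m_hat: f32, v: f32, t: u32, lr: f32, beta2: f32, eps: f32) -> f32 {
        let v_hat = v / (1.0 - beta2.powi(t as i32)); // bias-corrected second moment
        lr * m_hat / (v_hat.sqrt() + eps)
    }
}

struct Adamax;
impl AdamVariant for Adamax {
    fn update_v(&self, u: f32, grad: f32, beta2: f32) -> f32 {
        (beta2 * u).max(grad.abs()) // infinity-norm estimate instead of a second moment
    }
    fn step_size(&self, m_hat: f32, u: f32, _t: u32, lr: f32, _beta2: f32, eps: f32) -> f32 {
        lr * m_hat / (u + eps)
    }
}

// Shared state and bookkeeping live once, so every variant gets the same
// optimizations and error handling for free.
struct GenericAdam<V: AdamVariant> {
    variant: V,
    lr: f32,
    beta1: f32,
    beta2: f32,
    eps: f32,
    m: f32, // first-moment estimate (per parameter in a real implementation)
    v: f32, // denominator statistic
    t: u32, // step counter
}

impl<V: AdamVariant> GenericAdam<V> {
    fn new(variant: V, lr: f32) -> Self {
        Self { variant, lr, beta1: 0.9, beta2: 0.999, eps: 1e-8, m: 0.0, v: 0.0, t: 0 }
    }

    fn step(&mut self, param: &mut f32, grad: f32) {
        self.t += 1;
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * grad;
        self.v = self.variant.update_v(self.v, grad, self.beta2);
        let m_hat = self.m / (1.0 - self.beta1.powi(self.t as i32));
        *param -= self.variant.step_size(m_hat, self.v, self.t, self.lr, self.beta2, self.eps);
    }
}

fn main() {
    // Same call pattern for every variant: minimize f(w) = w^2 starting from w = 1.
    let mut w1 = 1.0_f32;
    let mut w2 = 1.0_f32;
    let mut adam = GenericAdam::new(Adam, 0.1);
    let mut adamax = GenericAdam::new(Adamax, 0.1);
    for _ in 0..100 {
        adam.step(&mut w1, 2.0 * w1);
        adamax.step(&mut w2, 2.0 * w2);
    }
    println!("Adam: {w1:.4}  Adamax: {w2:.4}");
}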

Run benchmarks:

# Basic benchmarks (no external dependencies)
cargo bench --no-default-features

# Linear algebra benchmarks (requires linalg feature) 
cargo bench --features linalg

# Phase 2 optimizer benchmarks (NEW!)
cargo run --bin quick_optimizer_bench --release

# Quick manual benchmark
cargo run --bin manual_quick_bench

For detailed performance analysis, see Performance Documentation.

🧪 Testing

1060 tests passing - Production-ready quality assurance with unified RusTorchError error handling system.

# Run all tests (recommended for CI/development)
cargo test --no-default-features

# Run tests with linear algebra features
cargo test --features linalg

# Run doctests
cargo test --doc --no-default-features

# Test specific modules
cargo test --no-default-features tensor::
cargo test --no-default-features complex::

🚀 Production Deployment

Docker

# Production deployment
docker build -t rustorch:latest .
docker run -it rustorch:latest

# GPU-enabled deployment
docker build -f docker/Dockerfile.gpu -t rustorch:gpu .
docker run --gpus all -it rustorch:gpu

For complete deployment guide, see Production Guide.

🤝 Contributing

We welcome contributions! Areas where help is especially needed:

  • 🎯 Special Functions Precision: Improve numerical accuracy
  • ⚡ Performance Optimization: SIMD improvements, GPU optimization
  • 🧪 Testing: More comprehensive test cases
  • 📚 Documentation: Examples, tutorials, improvements
  • 🌐 Platform Support: WebAssembly, mobile platforms

Development Setup

git clone https://github.com/JunSuzukiJapan/rustorch.git
cd rustorch

# Run tests
cargo test --all-features

# Check formatting
cargo fmt --check

# Run clippy
cargo clippy --all-targets --all-features

🔒 API Stability / API安定性

English: Starting from v0.6.0, RusTorch has reached a stable API milestone. We are committed to maintaining backward compatibility and will avoid major breaking changes in the near future. This ensures a reliable foundation for production applications and long-term projects.

日本語: v0.6.0以降、RusTorchは安定したAPIマイルストーンに達しました。後方互換性を維持することをお約束し、近い将来において大規模な破壊的変更は行いません。これにより、本番アプリケーションと長期プロジェクトに信頼性の高い基盤を提供します。

License

Licensed under either of:

  • Apache License, Version 2.0
  • MIT License

at your option.

Dependencies

~10–67MB
~1M SLoC