# ADR-020: Migrate AI/Model Inference to Rust with RuVector and ONNX Runtime

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Date** | 2026-02-28 |
| **Deciders** | ruv |
| **Relates to** | ADR-016 (RuVector Integration), ADR-017 (RuVector-Signal-MAT), ADR-019 (Sensing-Only UI) |

## Context

The current Python DensePose backend requires roughly 2.7 GB of dependencies:

| Python Dependency | Size | Purpose |
|-------------------|------|---------|
| PyTorch | ~2.0 GB | Neural network inference |
| torchvision | ~500 MB | Model loading, transforms |
| OpenCV | ~100 MB | Image processing |
| SQLAlchemy + asyncpg | ~20 MB | Database |
| scikit-learn | ~50 MB | Classification |
| **Total** | **~2.7 GB** | |

This makes the DensePose backend impractical for edge deployments, CI pipelines, and developer laptops where users only need WiFi sensing + pose estimation.

Meanwhile, the Rust port at `rust-port/wifi-densepose-rs/` already has:

- **12 workspace crates** covering core, signal, nn, api, db, config, hardware, wasm, cli, mat, train
- **5 RuVector crates** (v2.0.4, published on crates.io) integrated into the signal, mat, and train crates
- **3 NN backends**: ONNX Runtime (default), tch (PyTorch C++), Candle (pure Rust)
- **Axum web framework** with WebSocket support in the MAT crate
- **Signal processing pipeline**: CSI processor, BVP, Fresnel geometry, spectrogram, subcarrier selection, motion detection, Hampel filter, phase sanitizer
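
As a sketch of one of these stages (an illustration of the standard algorithm, not the crate's actual code), a Hampel filter replaces a sample with the local median whenever it deviates from that median by more than a multiple of the scaled median absolute deviation:

```rust
/// Hampel filter sketch: replace x[i] with its window median when
/// |x[i] - median| > n_sigma * 1.4826 * MAD (1.4826 rescales the MAD
/// to a Gaussian standard deviation).
fn hampel_filter(x: &[f64], half_window: usize, n_sigma: f64) -> Vec<f64> {
    fn median(v: &mut Vec<f64>) -> f64 {
        v.sort_by(|a, b| a.partial_cmp(b).unwrap());
        let n = v.len();
        if n % 2 == 1 { v[n / 2] } else { (v[n / 2 - 1] + v[n / 2]) / 2.0 }
    }
    let mut out = x.to_vec();
    for i in 0..x.len() {
        let lo = i.saturating_sub(half_window);
        let hi = (i + half_window + 1).min(x.len());
        let mut window = x[lo..hi].to_vec();
        let med = median(&mut window);
        let mut devs: Vec<f64> = window.iter().map(|v| (v - med).abs()).collect();
        let mad = median(&mut devs);
        if (x[i] - med).abs() > n_sigma * 1.4826 * mad {
            out[i] = med; // outlier: replace with the local median
        }
    }
    out
}
```

CSI amplitude streams are bursty, so a robust median-based filter like this is preferred over mean/variance thresholds, which the outliers themselves would skew.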

## Decision

Adopt the Rust workspace as the **primary backend** for AI/model inference and signal processing, replacing the Python FastAPI stack for production deployments.

### Phase 1: ONNX Runtime Default (No libtorch)

Use the `wifi-densepose-nn` crate with default features disabled and only the `onnx` feature enabled. This avoids the libtorch C++ dependency entirely.
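
In `Cargo.toml` terms this is a standard feature-gated dependency (the exact path and feature names here are assumptions based on the crate names in this ADR):

```toml
# Sketch: depend on the NN crate with default features off,
# enabling only the ONNX Runtime backend (no tch/libtorch).
[dependencies]
wifi-densepose-nn = { path = "../wifi-densepose-nn", default-features = false, features = ["onnx"] }
```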

| Component | Rust Crate | Replaces Python |
|-----------|-----------|-----------------|
| CSI processing | `wifi-densepose-signal::csi_processor` | `v1/src/sensing/feature_extractor.py` |
| Motion detection | `wifi-densepose-signal::motion` | `v1/src/sensing/classifier.py` |
| BVP extraction | `wifi-densepose-signal::bvp` | N/A (new capability) |
| Fresnel geometry | `wifi-densepose-signal::fresnel` | N/A (new capability) |
| Subcarrier selection | `wifi-densepose-signal::subcarrier_selection` | N/A (new capability) |
| Spectrogram | `wifi-densepose-signal::spectrogram` | N/A (new capability) |
| Pose inference | `wifi-densepose-nn::onnx` | PyTorch + torchvision |
| DensePose mapping | `wifi-densepose-nn::densepose` | Python DensePose |
| REST API | `wifi-densepose-mat::api` (Axum) | FastAPI |
| WebSocket stream | `wifi-densepose-mat::api::websocket` | `ws_server.py` |
| Survivor detection | `wifi-densepose-mat::detection` | N/A (new capability) |
| Vital signs | `wifi-densepose-mat::ml` | N/A (new capability) |
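
Fresnel geometry, listed above as a new capability, rests on the classic Fresnel-zone radius relation. A minimal sketch (my own illustration, not the `wifi-densepose-signal::fresnel` API):

```rust
/// Radius (m) of the n-th Fresnel zone at a point between TX and RX.
/// d1 and d2 are the distances (m) from that point to TX and RX;
/// wavelength is ~0.125 m at 2.4 GHz and ~0.06 m at 5 GHz.
fn fresnel_zone_radius(n: u32, wavelength: f64, d1: f64, d2: f64) -> f64 {
    (n as f64 * wavelength * d1 * d2 / (d1 + d2)).sqrt()
}
```

A body crossing the first zone perturbs the received signal most strongly, which is what Fresnel-based sensing exploits to localize motion between TX and RX.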

### Phase 2: RuVector Signal Intelligence

The 5 RuVector crates provide subpolynomial algorithms already wired into the Rust signal pipeline:

| Crate | Algorithm | Use in Pipeline |
|-------|-----------|-----------------|
| `ruvector-mincut` | Subpolynomial min-cut | Dynamic subcarrier partitioning (sensitive vs insensitive) |
| `ruvector-attn-mincut` | Attention-gated min-cut | Noise-suppressed spectrogram generation |
| `ruvector-attention` | Sensitivity-weighted attention | Body velocity profile extraction |
| `ruvector-solver` | Sparse Fresnel solver | TX-body-RX distance estimation |
| `ruvector-temporal-tensor` | Compressed temporal buffers | Breathing + heartbeat spectrogram storage |

These replace the Python `RssiFeatureExtractor` with hardware-aware, subcarrier-level feature extraction.
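
As a generic illustration of sensitivity-weighted attention (the actual `ruvector-attention` API may differ), subcarriers can be weighted by a softmax over a per-subcarrier sensitivity score such as amplitude variance:

```rust
/// Softmax over per-subcarrier sensitivity scores, so that
/// motion-sensitive subcarriers dominate the weighted combination.
fn attention_weights(scores: &[f64]) -> Vec<f64> {
    // Subtract the max score for numerical stability before exponentiating.
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}
```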

### Phase 3: Unified Axum Server

Replace both the Python FastAPI backend (port 8000) and the Python sensing WebSocket (port 8765) with a single Rust Axum server:

```
ESP32 (UDP :5005) ──▶ Rust Axum server (:8000) ──▶ UI (browser)
                          ├── /health/*        (health checks)
                          ├── /api/v1/pose/*   (pose estimation)
                          ├── /api/v1/stream/* (WebSocket pose stream)
                          ├── /ws/sensing      (sensing WebSocket — replaces :8765)
                          └── /ws/mat/stream   (MAT domain events)
```
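
In Axum terms, that layout amounts to one `Router` carrying every endpoint. A sketch only: the concrete subpaths and handler names below are placeholders, not the MAT crate's actual symbols:

```rust
use axum::{routing::get, Router};

// Hypothetical handlers; WebSocket routes are plain GET routes
// that perform the protocol upgrade inside the handler.
let app = Router::new()
    .route("/health/live", get(health_live))
    .route("/api/v1/pose/latest", get(latest_pose))
    .route("/api/v1/stream/pose", get(pose_stream_upgrade))
    .route("/ws/sensing", get(sensing_ws_upgrade))
    .route("/ws/mat/stream", get(mat_stream_upgrade));
```

One process, one port: the UI no longer has to juggle :8000 for REST and :8765 for sensing.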

### Build Configuration

```bash
# Lightweight build — no libtorch, no OpenBLAS
cargo build --release -p wifi-densepose-mat --no-default-features --features "std,api,onnx"

# Full build with all backends
cargo build --release --features "all-backends"
```

### Dependency Comparison

|  | Python Backend | Rust Backend (ONNX only) |
|---|---|---|
| Install size | ~2.7 GB | ~50 MB binary |
| Runtime memory | ~500 MB | ~20 MB |
| Startup time | 3-5 s | <100 ms |
| Dependencies | 30+ pip packages | Single static binary |
| GPU support | CUDA via PyTorch | CUDA via ONNX Runtime |
| Model format | .pt/.pth (PyTorch) | .onnx (portable) |
| Cross-compile | Difficult | `cargo build --target` |
| WASM target | No | Yes (`wifi-densepose-wasm`) |

### Model Conversion

Export existing PyTorch models to ONNX for the Rust backend:

```python
# One-time conversion (Python). The input shape below is an example;
# use the shape your model actually expects.
import torch

model = torch.load("model.pth", weights_only=False)  # full pickled model, not a state_dict
model.eval()
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "model.onnx", opset_version=17)
```

The `wifi-densepose-nn::onnx` module loads `.onnx` files directly.

## Consequences

### Positive
- Single ~50 MB static binary replaces ~2.7 GB Python environment
- ~20 MB runtime memory vs ~500 MB
- Sub-100 ms startup vs 3-5 seconds
- Single port serves all endpoints (API, WebSocket sensing, WebSocket pose)
- RuVector subpolynomial algorithms run natively (no FFI overhead)
- WASM build target enables browser-side inference
- Cross-compilation for ARM (Raspberry Pi), ESP32-S3, etc.

### Negative
- ONNX model conversion required (one-time step per model)
- Developers need a Rust toolchain for backend changes
- Python sensing pipeline (`ws_server.py`) remains useful for rapid prototyping
- `ndarray-linalg` requires OpenBLAS or system LAPACK for some signal crates

### Migration Path
1. Keep Python `ws_server.py` as fallback for development/prototyping
2. Build Rust binary with `cargo build --release -p wifi-densepose-mat`
3. UI detects which backend is running and adapts (existing `sensingOnlyMode` logic)
4. Deprecate Python backend once Rust API reaches feature parity
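
Step 3's detection can be approximated by probing which port answers. A minimal TCP reachability sketch (the UI's real `sensingOnlyMode` logic lives in the browser and is not this code):

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// True if something accepts connections on `addr`, which is enough to
/// tell the unified Rust server (:8000) apart from the Python fallback (:8765).
fn backend_listening(addr: &str) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(a) => TcpStream::connect_timeout(&a, Duration::from_millis(200)).is_ok(),
        Err(_) => false,
    }
}
```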

## Verification

```bash
# Type-check the whole Rust workspace
cd rust-port/wifi-densepose-rs
cargo check --workspace

# Build release binary (ONNX-only, no libtorch)
cargo build --release -p wifi-densepose-mat --no-default-features --features "std,api,onnx"

# Run tests
cargo test --workspace

# Binary size
ls -lh target/release/wifi-densepose-mat
```