| 1 | +# ADR-028: Project RuView -- Sensing-First RF Mode for Multistatic Fidelity Enhancement |
| 2 | + |
| 3 | +| Field | Value | |
| 4 | +|-------|-------| |
| 5 | +| **Status** | Proposed | |
| 6 | +| **Date** | 2026-03-02 | |
| 7 | +| **Deciders** | ruv | |
| 8 | +| **Codename** | **RuView** -- RuVector Viewpoint-Integrated Enhancement | |
| 9 | +| **Relates to** | ADR-012 (ESP32 Mesh), ADR-014 (SOTA Signal), ADR-016 (RuVector Integration), ADR-017 (RuVector Signal+MAT), ADR-021 (Vital Signs), ADR-024 (AETHER Embeddings), ADR-027 (MERIDIAN Cross-Environment) | |
| 10 | + |
| 11 | +--- |
| 12 | + |
| 13 | +## 1. Context |
| 14 | + |
| 15 | +### 1.1 The Single-Viewpoint Fidelity Ceiling |
| 16 | + |
| 17 | +Current WiFi DensePose operates with a single transmitter-receiver pair (or a single receiving node). This creates three fundamental limitations:
| 18 | + |
| 19 | +- **Body self-occlusion**: Limbs behind the torso are invisible to a single viewpoint. |
| 20 | +- **Depth ambiguity**: Motion along the RF propagation axis (toward/away from receiver) produces minimal phase change. |
| 21 | +- **Multi-person confusion**: Two people at similar range but different angles create overlapping CSI signatures. |
| 22 | + |
| 23 | +The ESP32 mesh (ADR-012) partially addresses this via feature-level fusion across 3-6 nodes, but feature-level fusion cannot learn optimal fusion weights -- it uses hand-crafted aggregation (max, mean, coherent sum). |
| 24 | + |
| 25 | +### 1.2 Three Fidelity Levers |
| 26 | + |
| 27 | +1. **Bandwidth**: More bandwidth produces better multipath separability. Currently limited to 20 MHz (ESP32 HT20). Wider channels (80/160 MHz) are available on commodity 802.11ac/ax APs. |
| 28 | +2. **Carrier frequency**: Higher frequency produces more phase sensitivity. 2.4 GHz sees macro-motion; 5 GHz sees micro-motion; 60 GHz sees vital signs. |
| 29 | +3. **Viewpoints**: More viewpoints from different angles reduces geometric ambiguity. This is the lever RuView pulls. |
| 30 | + |
| 31 | +### 1.3 Why "Sensing-First RF Mode" |
| 32 | + |
| 33 | +RuView is NOT a new WiFi standard. It is a sensing-first protocol that rides on existing silicon, bands, and regulations. The key insight: instead of upgrading the RF hardware, upgrade the observability by coordinating multiple commodity receivers. |
| 34 | + |
| 35 | +### 1.4 What Already Exists |
| 36 | + |
| 37 | +| Component | ADR | Current State | |
| 38 | +|-----------|-----|---------------| |
| 39 | +| ESP32 mesh with feature-level fusion | ADR-012 | Implemented (firmware + aggregator) | |
| 40 | +| SOTA signal processing (Hampel, Fresnel, BVP, spectrogram) | ADR-014 | Implemented | |
| 41 | +| RuVector training pipeline (5 crates) | ADR-016 | Complete | |
| 42 | +| RuVector signal + MAT integration (7 points) | ADR-017 | Accepted | |
| 43 | +| Vital sign detection pipeline | ADR-021 | Partially implemented | |
| 44 | +| AETHER contrastive embeddings | ADR-024 | Proposed | |
| 45 | +| MERIDIAN cross-environment generalization | ADR-027 | Proposed | |
| 46 | + |
| 47 | +RuView fills the gap: **cross-viewpoint embedding fusion** using learned attention weights. |
| 48 | + |
| 49 | +--- |
| 50 | + |
| 51 | +## 2. Decision |
| 52 | + |
| 53 | +Introduce RuView as a cross-viewpoint embedding fusion layer that operates on top of AETHER per-viewpoint embeddings. RuView adds a new bounded context (ViewpointFusion) and extends three existing crates. |
| 54 | + |
| 55 | +### 2.1 Core Architecture |
| 56 | + |
| 57 | +``` |
| 58 | ++-----------------------------------------------------------------+ |
| 59 | +| RuView Multistatic Pipeline | |
| 60 | ++-----------------------------------------------------------------+ |
| 61 | +| | |
| 62 | +| +----------+ +----------+ +----------+ +----------+ | |
| 63 | +| | Node 1 | | Node 2 | | Node 3 | | Node N | | |
| 64 | +| | ESP32-S3 | | ESP32-S3 | | ESP32-S3 | | ESP32-S3 | | |
| 65 | +| | | | | | | | | | |
| 66 | +| | CSI Rx | | CSI Rx | | CSI Rx | | CSI Rx | | |
| 67 | +| +----+-----+ +----+-----+ +----+-----+ +----+-----+ | |
| 68 | +| | | | | | |
| 69 | +| v v v v | |
| 70 | +| +--------------------------------------------------------+ | |
| 71 | +| | Per-Viewpoint Signal Processing | | |
| 72 | +| | Phase sanitize -> Hampel -> BVP -> Subcarrier select | | |
| 73 | +| | (ADR-014, unchanged per viewpoint) | | |
| 74 | +| +----------------------------+---------------------------+ | |
| 75 | +| | | |
| 76 | +| v | |
| 77 | +| +--------------------------------------------------------+ | |
| 78 | +| | Per-Viewpoint AETHER Embedding | | |
| 79 | +| | CsiToPoseTransformer -> 128-d contrastive embedding | | |
| 80 | +| | (ADR-024, one per viewpoint) | | |
| 81 | +| +----------------------------+---------------------------+ | |
| 82 | +| | | |
| 83 | +| [emb_1, emb_2, ..., emb_N] | |
| 84 | +| | | |
| 85 | +| v | |
| 86 | +| +--------------------------------------------------------+ | |
| 87 | +| | * RuView Cross-Viewpoint Fusion * | | |
| 88 | +| | | | |
| 89 | +| | Q = W_q * X, K = W_k * X, V = W_v * X | | |
| 90 | +| | A = softmax((QK^T + G_bias) / sqrt(d)) | | |
| 91 | +| | fused = A * V | | |
| 92 | +| | | | |
| 93 | +| | G_bias: geometric bias from viewpoint pair geometry | | |
| 94 | +| | (ruvector-attention: ScaledDotProductAttention) | | |
| 95 | +| +----------------------------+---------------------------+ | |
| 96 | +| | | |
| 97 | +| fused_embedding | |
| 98 | +| | | |
| 99 | +| v | |
| 100 | +| +--------------------------------------------------------+ | |
| 101 | +| | DensePose Regression Head | | |
| 102 | +| | Keypoint head: [B,17,H,W] | | |
| 103 | +| | Part/UV head: [B,25,H,W] + [B,48,H,W] | | |
| 104 | +| +--------------------------------------------------------+ | |
| 105 | ++-----------------------------------------------------------------+ |
| 106 | +``` |
| 107 | + |
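The fusion step in the diagram can be sketched in plain Rust. This is a minimal illustration, not the `ruvector-attention` API: the function name `fuse_viewpoints` is hypothetical, and the projections `W_q`, `W_k`, `W_v` are assumed to be identity so that `Q = K = V = X`. It returns one fused row per viewpoint; how those rows are pooled into a single `fused_embedding` is not specified here.

```rust
/// Row-wise softmax, stabilised by subtracting the row maximum.
fn softmax(row: &[f32]) -> Vec<f32> {
    let max = row.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = row.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// fused = softmax((X X^T + G_bias) / sqrt(d)) * X
///
/// `x` holds N viewpoint embeddings of equal length d; `g_bias` must be N x N.
/// Assumption: identity projections (W_q = W_k = W_v = I).
pub fn fuse_viewpoints(x: &[Vec<f32>], g_bias: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let n = x.len();
    if n == 0 {
        return Vec::new();
    }
    let d = x[0].len();
    let scale = (d as f32).sqrt();
    (0..n)
        .map(|i| {
            // Attention scores of viewpoint i against every viewpoint j.
            let scores: Vec<f32> = (0..n)
                .map(|j| {
                    let dot: f32 = x[i].iter().zip(&x[j]).map(|(a, b)| a * b).sum();
                    (dot + g_bias[i][j]) / scale
                })
                .collect();
            let attn = softmax(&scores);
            // Weighted sum of the value vectors.
            let mut out = vec![0.0f32; d];
            for (j, &a) in attn.iter().enumerate() {
                for (o, &v) in out.iter_mut().zip(&x[j]) {
                    *o += a * v;
                }
            }
            out
        })
        .collect()
}
```

With a zero bias matrix and identical embeddings, every attention row is uniform and the fused rows equal the input embedding.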
| 108 | +### 2.2 TDM Sensing Protocol |
| 109 | + |
| 110 | +- Coordinator (aggregator) broadcasts sync beacon at start of each cycle. |
| 111 | +- Each node transmits in assigned time slot; all others receive. |
| 112 | +- 6 nodes x 1.4 ms/slot = 8.4 ms cycle -> ~119 full cycles/s; at the ~120 Hz aggregate slot rate assumed in Section 6.2, 6 nodes would instead give ~8.3 ms slots and ~20 Hz per viewpoint.
| 113 | +- Clock drift handled at feature level (no cross-node phase alignment). |
| 114 | + |
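The slot arithmetic above can be sketched as follows. The helper names (`slot_offsets`, `cycle_timing`, `bistatic_links_per_cycle`) are hypothetical and not part of `TdmSensingProtocol`; the sketch only assumes equal-length slots with no guard interval.

```rust
use std::time::Duration;

/// Start offset of each node's transmit slot within one TDM cycle.
pub fn slot_offsets(n_nodes: u32, slot: Duration) -> Vec<Duration> {
    (0..n_nodes).map(|i| slot * i).collect()
}

/// Full-cycle duration and the resulting number of cycles per second.
/// With zero nodes the cycle is empty and the rate is infinite.
pub fn cycle_timing(n_nodes: u32, slot: Duration) -> (Duration, f64) {
    let cycle = slot * n_nodes;
    (cycle, 1.0 / cycle.as_secs_f64())
}

/// One node transmits per slot and all others receive,
/// so each cycle yields n * (n - 1) transmitter-receiver links.
pub fn bistatic_links_per_cycle(n_nodes: u32) -> u32 {
    n_nodes * n_nodes.saturating_sub(1)
}
```

For 6 nodes and 1.4 ms slots this gives an 8.4 ms cycle, slot starts at 0, 1.4, ..., 7.0 ms, and 30 links per cycle.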
| 115 | +### 2.3 Geometric Bias Matrix |
| 116 | + |
| 117 | +The geometric bias `G_bias` encodes the spatial relationship between viewpoint pairs: |
| 118 | + |
| 119 | +``` |
| 120 | +G_bias[i,j] = w_angle * cos(theta_ij) + w_dist * exp(-d_ij / d_ref) |
| 121 | +``` |
| 122 | + |
| 123 | +where: |
| 124 | + |
| 125 | +- `theta_ij` = angle between viewpoint i and viewpoint j (from room center) |
| 126 | +- `d_ij` = baseline distance between node i and node j |
| 127 | +- `w_angle`, `w_dist` = learnable weights |
| 128 | +- `d_ref` = reference distance (room diagonal / 2) |
| 129 | + |
| 130 | +This allows the attention mechanism to learn that widely-separated, orthogonal viewpoints are more complementary than clustered ones. |
| 131 | + |
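A minimal sketch of building `G_bias` from node geometry, following the formula above. The `Node` struct and the function name `geometric_bias` are hypothetical, and positions are assumed to be 2-D for brevity; in the real model `w_angle` and `w_dist` would be learned parameters rather than arguments.

```rust
/// Hypothetical node description: azimuth in radians (from room center)
/// and 2-D position in meters.
pub struct Node {
    pub azimuth: f32,
    pub pos: (f32, f32),
}

/// G_bias[i][j] = w_angle * cos(theta_ij) + w_dist * exp(-d_ij / d_ref)
pub fn geometric_bias(nodes: &[Node], w_angle: f32, w_dist: f32, d_ref: f32) -> Vec<Vec<f32>> {
    nodes
        .iter()
        .map(|a| {
            nodes
                .iter()
                .map(|b| {
                    // Angle between the two viewpoints as seen from the room center.
                    let theta = a.azimuth - b.azimuth;
                    // Baseline distance between the two nodes.
                    let d = ((a.pos.0 - b.pos.0).powi(2) + (a.pos.1 - b.pos.1).powi(2)).sqrt();
                    w_angle * theta.cos() + w_dist * (-d / d_ref).exp()
                })
                .collect()
        })
        .collect()
}
```

Both terms peak when i = j, so with positive weights the bias is largest on the diagonal. Favoring orthogonal, widely separated pairs therefore depends on the signs the learned weights take.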
| 132 | +### 2.4 Coherence-Gated Environment Updates |
| 133 | + |
| 134 | +```rust |
| 135 | +/// Only update environment model when phase coherence exceeds threshold. |
| 136 | +pub fn coherence_gate( |
| 137 | + phase_diffs: &[f32], // delta-phi over T recent frames |
| 138 | + threshold: f32, // typically 0.7 |
| 139 | +) -> bool { |
| 140 | + // Complex mean of unit phasors |
| 141 | + let (sum_cos, sum_sin) = phase_diffs.iter() |
| 142 | + .fold((0.0f32, 0.0f32), |(c, s), &dp| { |
| 143 | + (c + dp.cos(), s + dp.sin()) |
| 144 | + }); |
| 145 | + let n = phase_diffs.len().max(1) as f32; // max(1) avoids 0/0 = NaN on empty input
| 146 | + let coherence = ((sum_cos / n).powi(2) + (sum_sin / n).powi(2)).sqrt(); |
| 147 | + coherence > threshold |
| 148 | +} |
| 149 | +``` |
| 150 | + |
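For reference, the quantity computed by `coherence_gate` is the mean resultant length of the phase differences. The symbols `R` and `T` below are introduced for illustration only:

```latex
R = \left| \frac{1}{T} \sum_{t=1}^{T} e^{j \Delta\phi_t} \right|
  = \sqrt{ \left( \frac{1}{T} \sum_{t=1}^{T} \cos \Delta\phi_t \right)^{2}
         + \left( \frac{1}{T} \sum_{t=1}^{T} \sin \Delta\phi_t \right)^{2} },
\qquad 0 \le R \le 1
```

`R = 1` when all phase differences are equal, and `R` approaches 0 when they are spread uniformly around the circle, so a threshold of 0.7 accepts only frames with a fairly stable phase relationship.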
| 151 | +### 2.5 Two Implementation Paths |
| 152 | + |
| 153 | +| Path | Hardware | Bandwidth | Per-Viewpoint Rate | Target Tier | |
| 154 | +|------|----------|-----------|-------------------|-------------| |
| 155 | +| **ESP32 Multistatic** | 6x ESP32-S3 ($84) | 20 MHz (HT20) | 20 Hz | Silver | |
| 156 | +| **Cognitum + RF** | Cognitum v1 + LimeSDR | 20-160 MHz | 20-100 Hz | Gold | |
| 157 | + |
| 158 | +ESP32 path: commodity, achievable today, targets Silver tier (tracking + pose quality). |
| 159 | +Cognitum path: higher fidelity, targets Gold tier (tracking + pose + vitals). |
| 160 | + |
| 161 | +--- |
| 162 | + |
| 163 | +## 3. DDD Design |
| 164 | + |
| 165 | +### 3.1 New Bounded Context: ViewpointFusion |
| 166 | + |
| 167 | +**Aggregate Root: `MultistaticArray`** |
| 168 | + |
| 169 | +```rust |
| 170 | +pub struct MultistaticArray { |
| 171 | + /// Unique array deployment ID |
| 172 | + id: ArrayId, |
| 173 | + /// Viewpoint geometry (node positions, orientations) |
| 174 | + geometry: ArrayGeometry, |
| 175 | + /// TDM schedule (slot assignments, cycle period) |
| 176 | + schedule: TdmSchedule, |
| 177 | + /// Active viewpoint embeddings (latest per node) |
| 178 | + viewpoints: Vec<ViewpointEmbedding>, |
| 179 | + /// Fused output embedding |
| 180 | + fused: Option<FusedEmbedding>, |
| 181 | + /// Coherence gate state |
| 182 | + coherence_state: CoherenceState, |
| 183 | +} |
| 184 | +``` |
| 185 | + |
| 186 | +**Entity: `ViewpointEmbedding`** |
| 187 | + |
| 188 | +```rust |
| 189 | +pub struct ViewpointEmbedding { |
| 190 | + /// Source node ID |
| 191 | + node_id: NodeId, |
| 192 | + /// AETHER embedding vector (128-d) |
| 193 | + embedding: Vec<f32>, |
| 194 | + /// Geometric metadata |
| 195 | + azimuth: f32, // radians from array center |
| 196 | + elevation: f32, // radians |
| 197 | + baseline: f32, // meters from centroid |
| 198 | + /// Capture timestamp |
| 199 | + timestamp: Instant, |
| 200 | + /// Signal quality |
| 201 | + snr_db: f32, |
| 202 | +} |
| 203 | +``` |
| 204 | + |
| 205 | +**Value Object: `GeometricDiversityIndex`** |
| 206 | + |
| 207 | +```rust |
| 208 | +pub struct GeometricDiversityIndex { |
| 209 | + /// GDI = (1/N) sum min_{j!=i} |theta_i - theta_j| |
| 210 | + value: f32, |
| 211 | + /// Effective independent viewpoints (after correlation discount) |
| 212 | + n_effective: f32, |
| 213 | + /// Worst viewpoint pair (most redundant) |
| 214 | + worst_pair: (NodeId, NodeId), |
| 215 | +} |
| 216 | +``` |
| 217 | + |
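The `value` field can be computed directly from the formula in the doc comment. This is a sketch under two assumptions that the ADR does not state: angles are node azimuths in radians, and the difference is wrapped to the shorter arc in [0, pi]. The function names are hypothetical.

```rust
use std::f32::consts::PI;

/// Smallest absolute difference between two angles, wrapped into [0, PI].
fn angular_gap(a: f32, b: f32) -> f32 {
    let d = (a - b).abs() % (2.0 * PI);
    if d > PI { 2.0 * PI - d } else { d }
}

/// GDI = (1/N) * sum_i min_{j != i} |theta_i - theta_j|.
/// Returns None for fewer than two viewpoints, where the minimum is undefined.
pub fn geometric_diversity_index(azimuths: &[f32]) -> Option<f32> {
    let n = azimuths.len();
    if n < 2 {
        return None;
    }
    let total: f32 = (0..n)
        .map(|i| {
            // Angular gap from viewpoint i to its nearest neighbour.
            (0..n)
                .filter(|&j| j != i)
                .map(|j| angular_gap(azimuths[i], azimuths[j]))
                .fold(f32::INFINITY, f32::min)
        })
        .sum();
    Some(total / n as f32)
}
```

Four nodes spaced evenly around the room give a GDI of pi/2, while four nodes clustered 0.1 rad apart give 0.1, which matches the intent that clustered placements score low.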
| 218 | +**Domain Events:** |
| 219 | + |
| 220 | +```rust |
| 221 | +pub enum ViewpointFusionEvent { |
| 222 | + ViewpointCaptured { node_id: NodeId, timestamp: Instant, snr_db: f32 }, |
| 223 | + TdmCycleCompleted { cycle_id: u64, viewpoints_received: usize }, |
| 224 | + FusionCompleted { fused_embedding: Vec<f32>, gdi: f32 }, |
| 225 | + CoherenceGateTriggered { coherence: f32, accepted: bool }, |
| 226 | + GeometryUpdated { new_gdi: f32, n_effective: f32 }, |
| 227 | +} |
| 228 | +``` |
| 229 | + |
| 230 | +### 3.2 Extended Bounded Contexts |
| 231 | + |
| 232 | +**Signal (wifi-densepose-signal):** |
| 233 | +- New service: `CrossViewpointSubcarrierSelection` |
| 234 | + - Consensus sensitive subcarrier set across all viewpoints via ruvector-mincut. |
| 235 | + - Input: per-viewpoint sensitivity scores. Output: globally-sensitive + locally-sensitive partition. |
| 236 | + |
| 237 | +**Hardware (wifi-densepose-hardware):** |
| 238 | +- New protocol: `TdmSensingProtocol` |
| 239 | + - Coordinator logic: beacon generation, slot scheduling, clock drift compensation. |
| 240 | + - Event: `TdmSlotCompleted { node_id, slot_index, capture_quality }` |
| 241 | + |
| 242 | +**Training (wifi-densepose-train):** |
| 243 | +- New module: `ruview_metrics.rs` |
| 244 | + - Three-metric acceptance test: PCK/OKS (joint error), MOTA (multi-person separation), vital sign accuracy. |
| 245 | + - Tiered pass/fail: Bronze/Silver/Gold. |
| 246 | + |
| 247 | +--- |
| 248 | + |
| 249 | +## 4. Implementation Plan (File-Level) |
| 250 | + |
| 251 | +### 4.1 Phase 1: ViewpointFusion Core (New Files) |
| 252 | + |
| 253 | +| File | Purpose | RuVector Crate | |
| 254 | +|------|---------|---------------| |
| 255 | +| `crates/wifi-densepose-ruvector/src/viewpoint/mod.rs` | Module root, re-exports | -- | |
| 256 | +| `crates/wifi-densepose-ruvector/src/viewpoint/attention.rs` | Cross-viewpoint scaled dot-product attention with geometric bias | ruvector-attention | |
| 257 | +| `crates/wifi-densepose-ruvector/src/viewpoint/geometry.rs` | GeometricDiversityIndex, Cramer-Rao bound estimation | ruvector-solver | |
| 258 | +| `crates/wifi-densepose-ruvector/src/viewpoint/coherence.rs` | Coherence gating for environment stability | -- (pure math) | |
| 259 | +| `crates/wifi-densepose-ruvector/src/viewpoint/fusion.rs` | MultistaticArray aggregate, orchestrates fusion pipeline | ruvector-attention + ruvector-attn-mincut | |
| 260 | + |
| 261 | +### 4.2 Phase 2: Signal Processing Extension |
| 262 | + |
| 263 | +| File | Purpose | RuVector Crate | |
| 264 | +|------|---------|---------------| |
| 265 | +| `crates/wifi-densepose-signal/src/cross_viewpoint.rs` | Cross-viewpoint subcarrier consensus via min-cut | ruvector-mincut | |
| 266 | + |
| 267 | +### 4.3 Phase 3: Hardware Protocol Extension |
| 268 | + |
| 269 | +| File | Purpose | RuVector Crate | |
| 270 | +|------|---------|---------------| |
| 271 | +| `crates/wifi-densepose-hardware/src/esp32/tdm.rs` | TDM sensing protocol coordinator | -- (protocol logic) | |
| 272 | + |
| 273 | +### 4.4 Phase 4: Training and Metrics |
| 274 | + |
| 275 | +| File | Purpose | RuVector Crate | |
| 276 | +|------|---------|---------------| |
| 277 | +| `crates/wifi-densepose-train/src/ruview_metrics.rs` | Three-metric acceptance test (PCK/OKS, MOTA, vital sign accuracy) | ruvector-mincut (person matching) | |
| 278 | + |
| 279 | +--- |
| 280 | + |
| 281 | +## 5. Three-Metric Acceptance Test |
| 282 | + |
| 283 | +### 5.1 Metric 1: Joint Error (PCK / OKS) |
| 284 | + |
| 285 | +| Criterion | Threshold | |
| 286 | +|-----------|-----------| |
| 287 | +| PCK (all 17 keypoints) | >= 0.70 |
| 288 | +| PCK (torso: shoulders + hips) | >= 0.80 |
| 289 | +| Mean OKS | >= 0.50 | |
| 290 | +| Torso jitter RMS (10s window) | < 3 cm | |
| 291 | +| Per-keypoint max error (95th percentile) | < 15 cm | |
| 292 | + |
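As an illustration of the PCK criterion, the sketch below counts a keypoint as correct when its Euclidean error is within `alpha * scale`. The function name, the 2-D keypoint representation, and the choice of `scale` (for example torso length) are assumptions; `alpha` is left as a parameter because this section does not fix its value.

```rust
/// Percentage of Correct Keypoints: fraction of keypoints whose
/// Euclidean error is at most `alpha * scale`.
pub fn pck(pred: &[(f32, f32)], truth: &[(f32, f32)], scale: f32, alpha: f32) -> f32 {
    assert_eq!(pred.len(), truth.len(), "keypoint counts must match");
    if pred.is_empty() {
        return 0.0;
    }
    let hits = pred
        .iter()
        .zip(truth)
        .filter(|(p, t)| {
            let (dx, dy) = (p.0 - t.0, p.1 - t.1);
            (dx * dx + dy * dy).sqrt() <= alpha * scale
        })
        .count();
    hits as f32 / pred.len() as f32
}
```

The torso-only criterion would call the same function on the shoulder and hip keypoints alone.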
| 293 | +### 5.2 Metric 2: Multi-Person Separation |
| 294 | + |
| 295 | +| Criterion | Threshold | |
| 296 | +|-----------|-----------| |
| 297 | +| Subjects | 2 | |
| 298 | +| Capture rate | 20 Hz | |
| 299 | +| Track duration | 10 minutes | |
| 300 | +| Identity swaps (MOTA ID-switch) | 0 | |
| 301 | +| Track fragmentation ratio | < 0.05 | |
| 302 | +| False track creation | 0/min | |
| 303 | + |
| 304 | +### 5.3 Metric 3: Vital Sign Sensitivity |
| 305 | + |
| 306 | +| Criterion | Threshold | |
| 307 | +|-----------|-----------| |
| 308 | +| Breathing detection (6-30 BPM) | +/- 2 BPM | |
| 309 | +| Breathing band SNR (0.1-0.5 Hz) | >= 6 dB | |
| 310 | +| Heartbeat detection (40-120 BPM) | +/- 5 BPM (aspirational) | |
| 311 | +| Heartbeat band SNR (0.8-2.0 Hz) | >= 3 dB (aspirational) | |
| 312 | +| Micro-motion resolution | 1 mm at 3m | |
| 313 | + |
| 314 | +### 5.4 Tiered Pass/Fail |
| 315 | + |
| 316 | +| Tier | Requirements | Deployment Gate | |
| 317 | +|------|-------------|-----------------| |
| 318 | +| Bronze | Metric 2 | Prototype demo | |
| 319 | +| Silver | Metrics 1 + 2 | Production candidate | |
| 320 | +| Gold | All three | Full deployment | |
| 321 | + |
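The tier table maps onto a small decision function. The sketch below assumes each metric has already been reduced to a single pass/fail flag; the `Tier` enum and `classify_tier` are hypothetical names, not the contents of `ruview_metrics.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    /// Metric 2 failed, so no tier is reached.
    Unqualified,
    Bronze,
    Silver,
    Gold,
}

/// Bronze needs Metric 2, Silver needs Metrics 1 + 2, Gold needs all three.
pub fn classify_tier(joint_error_ok: bool, separation_ok: bool, vitals_ok: bool) -> Tier {
    match (joint_error_ok, separation_ok, vitals_ok) {
        (true, true, true) => Tier::Gold,
        (true, true, false) => Tier::Silver,
        (_, true, _) => Tier::Bronze,
        _ => Tier::Unqualified,
    }
}
```

Passing vital signs without joint error still yields Bronze, because Silver and Gold both require Metric 1.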
| 322 | +--- |
| 323 | + |
| 324 | +## 6. Consequences |
| 325 | + |
| 326 | +### 6.1 Positive |
| 327 | + |
| 328 | +- **Fundamental geometric improvement**: Viewpoint diversity reduces body self-occlusion and depth ambiguity -- these are physics, not model, limitations. |
| 329 | +- **Uses existing silicon**: ESP32-S3, commodity WiFi, no custom RF hardware required for Silver tier. |
| 330 | +- **Learned fusion weights**: Embedding-level fusion (Tier 3) is expected to outperform hand-crafted feature-level fusion (Tier 2), because its weights are learned rather than fixed (max, mean, coherent sum).
| 331 | +- **Composes with existing ADRs**: AETHER (per-viewpoint), MERIDIAN (cross-environment), and RuView (cross-viewpoint) are orthogonal -- they compose freely. |
| 332 | +- **IEEE 802.11bf aligned**: TDM protocol maps to 802.11bf sensing sessions, enabling future migration to standard-compliant APs. |
| 333 | +- **Commodity price point**: $84 for 6-node Silver-tier deployment. |
| 334 | + |
| 335 | +### 6.2 Negative |
| 336 | + |
| 337 | +- **TDM rate reduction**: With N viewpoints sharing the channel, each viewpoint gets the aggregate rate divided by N. With 6 nodes at 120 Hz aggregate, each viewpoint sees 20 Hz.
| 338 | +- **More complex aggregator**: Embedding fusion + geometric bias learning adds ~25K parameters on top of per-viewpoint AETHER model. |
| 339 | +- **Placement planning required**: Geometric Diversity Index optimization requires intentional node placement (not random scatter). |
| 340 | +- **Clock drift limits TDM precision**: ESP32 crystal drift (20-50 ppm) limits slot precision to ~1 ms, which is sufficient for feature-level fusion but not signal-level coherent combining. |
| 341 | +- **Training data**: Cross-viewpoint training requires multi-receiver CSI captures, which are not available in existing public datasets (MM-Fi, Wi-Pose). |
| 342 | + |
| 343 | +### 6.3 Interaction with Other ADRs |
| 344 | + |
| 345 | +| ADR | Interaction | |
| 346 | +|-----|------------| |
| 347 | +| ADR-012 (ESP32 Mesh) | RuView extends the aggregator from feature-level to embedding-level fusion; TDM protocol replaces simple UDP collection | |
| 348 | +| ADR-014 (SOTA Signal) | Per-viewpoint signal processing is unchanged; cross-viewpoint subcarrier consensus is new | |
| 349 | +| ADR-016/017 (RuVector) | RuVector crates get new cross-viewpoint operations; Section 4 names ruvector-attention, ruvector-attn-mincut, ruvector-mincut, and ruvector-solver |
| 350 | +| ADR-021 (Vital Signs) | Multi-viewpoint SNR improvement directly benefits vital sign extraction (Gold tier target) | |
| 351 | +| ADR-024 (AETHER) | Per-viewpoint AETHER embeddings are the input to RuView fusion; AETHER is required | |
| 352 | +| ADR-027 (MERIDIAN) | Cross-environment (MERIDIAN) and cross-viewpoint (RuView) are orthogonal; MERIDIAN handles room transfer, RuView handles within-room geometry | |
| 353 | + |
| 354 | +--- |
| 355 | + |
| 356 | +## 7. References |
| 357 | + |
| 358 | +1. IEEE 802.11bf (2024). "WLAN Sensing." IEEE Standards Association. |
| 359 | +2. Kotaru, M. et al. (2015). "SpotFi: Decimeter Level Localization Using WiFi." SIGCOMM 2015. |
| 360 | +3. Zeng, Y. et al. (2019). "FarSense: Pushing the Range Limit of WiFi-based Respiration Sensing with CSI Ratio of Two Antennas." Proc. ACM IMWUT 3(3).
| 361 | +4. Zheng, Y. et al. (2019). "Zero-Effort Cross-Domain Gesture Recognition with Wi-Fi." (Widar 3.0) MobiSys 2019. |
| 362 | +5. Yan, K. et al. (2024). "Person-in-WiFi 3D: End-to-End Multi-Person 3D Pose Estimation with Wi-Fi." CVPR 2024. |
| 363 | +6. Zhou, Y. et al. (2024). "AdaPose: Towards Cross-Site Device-Free Human Pose Estimation with Commodity WiFi." IEEE IoT Journal. arXiv:2309.16964. |
| 364 | +7. Zhou, R. et al. (2025). "DGSense: A Domain Generalization Framework for Wireless Sensing." arXiv:2502.08155. |
| 365 | +8. Chen, X. & Yang, J. (2025). "X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing." ICLR 2025. arXiv:2410.10167. |
| 366 | +9. AM-FM (2026). "AM-FM: A Foundation Model for Ambient Intelligence Through WiFi." arXiv:2602.11200. |
| 367 | +10. Chen, L. et al. (2026). "PerceptAlign: Breaking Coordinate Overfitting." arXiv:2601.12252. |
| 368 | +11. Li, J. & Stoica, P. (2007). "MIMO Radar with Colocated Antennas." IEEE Signal Processing Magazine, 24(5):106-114. |
| 369 | +12. ADR-012 through ADR-027 (internal). |