22
33## Status
44
5- Proposed
5+ Accepted
66
77## Context
88
@@ -25,119 +25,156 @@ the camera is removed. This means any dataset that provides *either* ground-trut
2525pose annotations *or* synchronized RGB frames (from which a teacher can generate
2626labels) is sufficient for training.
2727
28+ ### 56-Subcarrier Hardware Context
29+
30+ The system targets 56 subcarriers, which corresponds specifically to **Atheros 802.11n
31+ chipsets on a 20 MHz channel** using the Atheros CSI Tool. No publicly available
32+ dataset with paired pose annotations was collected at exactly 56 subcarriers:
33+
34+ | Hardware | Subcarriers | Datasets |
35+ |----------|-------------|----------|
36+ | Atheros CSI Tool (20 MHz) | **56** | None with pose labels |
37+ | Atheros CSI Tool (40 MHz) | **114** | MM-Fi |
38+ | Intel 5300 NIC (20 MHz) | **30** | Person-in-WiFi, Widar 3.0, Wi-Pose, XRF55 |
39+ | Nexmon/Broadcom (80 MHz) | **242-256** | None with pose labels |
40+
41+ MM-Fi uses the same Atheros hardware family at 40 MHz, making 114→56 interpolation
42+ physically meaningful (same chipset, different channel width).
43+
2844## Decision
2945
30- Use MM-Fi as the primary training dataset, supplemented by XRF55 for additional
31- diversity, with a teacher-student pipeline for any dataset that lacks dense pose
32- annotations but provides RGB video.
46+ Use MM-Fi as the primary training dataset, supplemented by Wi-Pose (NjtechCVLab)
47+ for additional diversity. XRF55 is downgraded to optional (Kinect labels need
48+ post-processing). Teacher-student pipeline fills in DensePose UV labels where
49+ only skeleton keypoints are available.
3350
3451### Primary Dataset: MM-Fi
3552
3653**Paper:** "MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless
37- Sensing" (NeurIPS 2023 Datasets Track)
38- **Repository:** https://github.com/ybCliff/MM-Fi
39- **Size:** 40 volunteers × 27 action classes × ~320,000 frames
54+ Sensing" (NeurIPS 2023 Datasets & Benchmarks)
55+ **Repository:** https://github.com/ybhbingo/MMFi_dataset
56+ **Size:** 40 subjects × 27 action classes × ~320,000 frames, 4 environments
4057**Modalities:** WiFi CSI, mmWave radar, LiDAR, RGB-D, IMU
41- **CSI format:** 3 Tx × 3 Rx antennas, 114 subcarriers, 100 Hz sampling rate,
42- IEEE 802.11n 5 GHz, raw amplitude + phase
43- **Pose annotations:** 17-keypoint COCO skeleton (from RGB-D ground truth)
58+ **CSI format:** **1 TX × 3 RX antennas**, 114 subcarriers, 100 Hz sampling rate,
59+ 5 GHz 40 MHz (TP-Link N750 with Atheros CSI Tool), raw amplitude + phase
60+ **Data tensor:** [3, 114, 10] per sample (antenna-pairs × subcarriers × time frames)
61+ **Pose annotations:** 17-keypoint COCO skeleton in 3D + DensePose UV surface coords
4462**License:** CC BY-NC 4.0
45- **Why primary:** Largest public WiFi CSI + pose dataset; raw amplitude and phase
46- available (not just processed features); antenna count (3×3) is compatible with the
47- existing `CSIProcessor` configuration; COCO keypoints map directly to the
48- `KeypointHead` output format.
49-
50- ### Secondary Dataset: XRF55
51-
52- **Paper:** "XRF55: A Radio-Frequency Dataset for Human Indoor Action Recognition"
53- (ACM MM 2023)
54- **Repository:** https://github.com/aiotgroup/XRF55
55- **Size:** 55 action classes, multiple subjects and environments
56- **CSI format:** WiFi CSI + UWB radar, 3 Tx × 3 Rx, 30 subcarriers
57- **Pose annotations:** Skeleton keypoints from Kinect
63+ **Why primary:** Largest public WiFi CSI + pose dataset; richest annotations (3D
64+ keypoints + DensePose UV); same Atheros hardware family as target system; COCO
65+ keypoints map directly to the `KeypointHead` output format; actively maintained
66+ with NeurIPS 2023 benchmark status.
67+
68+ **Antenna correction:** MM-Fi uses 1 TX / 3 RX (3 antenna pairs), not 3×3.
69+ The existing system targets 3×3 (ESP32 mesh). The 3 RX antennas match; the TX
70+ difference means MM-Fi-trained weights will work but may benefit from fine-tuning
71+ on data from a 3-TX setup.
72+
73+ ### Secondary Dataset: Wi-Pose (NjtechCVLab)
74+
75+ **Paper:** CSI-Former (MDPI Entropy 2023) and related works
76+ **Repository:** https://github.com/NjtechCVLab/Wi-PoseDataset
77+ **Size:** 12 volunteers × 12 action classes × 166,600 packets
78+ **CSI format:** 3 TX × 3 RX antennas, 30 subcarriers, 5 GHz, .mat format
79+ **Pose annotations:** 18-keypoint AlphaPose skeleton (COCO-compatible subset)
5880**License:** Research use
59- **Why secondary:** Different environments and action vocabulary increase
60- generalization; 30 subcarriers requires subcarrier interpolation to match the
61- existing 56-subcarrier config.
81+ **Why secondary:** 3×3 antenna array matches target ESP32 mesh hardware exactly;
82+ fully public; adds 12 different subjects and environments not in MM-Fi.
83+ **Note:** 30 subcarriers require zero-padding or interpolation to 56; 18→17
84+ keypoint mapping drops one neck keypoint (index 1), compatible with COCO-17.
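
The 18→17 keypoint reduction mentioned in the note can be sketched as follows. This is a minimal illustration, not the crate's actual code: the `Keypoint` type, the `NECK_INDEX` constant, and the `drop_neck` function are hypothetical names, and the sketch assumes that the 17 keypoints left after removing index 1 are already in the order the model expects. If the Wi-Pose layout differs from COCO-17 ordering, an extra permutation step would be needed.

```rust
/// One 2D keypoint as (x, y). Hypothetical type for illustration.
type Keypoint = [f32; 2];

/// Index of the neck joint in the 18-keypoint layout, per the note above.
const NECK_INDEX: usize = 1;

/// Remove the neck keypoint from an 18-keypoint skeleton and keep the
/// remaining 17 keypoints in their original relative order.
fn drop_neck(kp18: &[Keypoint; 18]) -> [Keypoint; 17] {
    let mut kp17 = [[0.0f32; 2]; 17];
    let mut out = 0;
    for (idx, kp) in kp18.iter().enumerate() {
        if idx == NECK_INDEX {
            continue; // skip the neck joint
        }
        kp17[out] = *kp;
        out += 1;
    }
    kp17
}
```

Any per-keypoint visibility flags would need the same index removed so that they stay aligned with the coordinates.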
85+
86+ ### Excluded / Deprioritized Datasets
87+
88+ | Dataset | Reason |
89+ | ---------| --------|
90+ | RF-Pose / RF-Pose3D (MIT) | Custom FMCW radio, not 802.11n CSI; incompatible signal physics |
91+ | Person-in-WiFi (CMU 2019) | Not publicly released (IRB restriction) |
92+ | Person-in-WiFi 3D (CVPR 2024) | 30 subcarriers, Intel 5300; semi-public access |
93+ | DensePose From WiFi (CMU) | Dataset not released; only paper + architecture |
94+ | Widar 3.0 | Gesture labels only, no full-body pose keypoints |
95+ | XRF55 | Activity labels primarily; Kinect pose requires email request; lower priority |
96+ | UT-HAR, WiAR, SignFi | Activity/gesture labels only, no pose keypoints |
6297
63- ### Excluded Datasets and Reasons
98+ ## Implementation Plan
6499
65- | Dataset | Reason for exclusion |
66- | ---------| ---------------------|
67- | RF-Pose / RF-Pose3D (MIT) | Uses 60 GHz mmWave, not 2.4/5 GHz WiFi CSI; incompatible signal physics |
68- | Person-in-WiFi (CMU 2019) | Amplitude only, no phase; not publicly released |
69- | Widar 3.0 | Gesture recognition only, no full-body pose |
70- | NTU-Fi | Activity labels only, no pose keypoints |
71- | WiPose | Limited release; superseded by MM-Fi |
100+ ### Phase 1: MM-Fi Loader (Rust ` wifi-densepose-train ` crate)
72101
73- ## Implementation Plan
102+ Implement `MmFiDataset` in Rust (`crates/wifi-densepose-train/src/dataset.rs`):
103+ - Reads MM-Fi numpy .npy files: amplitude [N, 3, 3, 114] (antenna-pairs laid flat), phase [N, 3, 3, 114]
104+ - Resamples from 114 → 56 subcarriers (linear interpolation via `subcarrier.rs`)
105+ - Applies phase sanitization using SOTA algorithms from `wifi-densepose-signal` crate
106+ - Returns typed `CsiSample` structs with amplitude, phase, keypoints, visibility
107+ - Validation split: subjects 33–40 held out
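
The subject-based hold-out in the last bullet can be sketched like this. The names `VAL_FIRST`, `VAL_LAST`, `is_validation_subject`, and `split_by_subject` are hypothetical, and the sketch assumes each sample carries its 1-based MM-Fi subject ID. Splitting by subject rather than by frame keeps every frame of a held-out person out of the training set.

```rust
/// Held-out subject range from the plan above (subjects 33 to 40 inclusive).
const VAL_FIRST: u32 = 33;
const VAL_LAST: u32 = 40;

/// True when a subject belongs to the validation split.
fn is_validation_subject(subject_id: u32) -> bool {
    (VAL_FIRST..=VAL_LAST).contains(&subject_id)
}

/// Split `(subject_id, sample)` pairs into `(train, validation)` by subject.
fn split_by_subject<T>(samples: Vec<(u32, T)>) -> (Vec<(u32, T)>, Vec<(u32, T)>) {
    samples
        .into_iter()
        .partition(|(subject_id, _)| !is_validation_subject(*subject_id))
}
```

With 40 subjects, holding out 8 of them gives a 20% validation share by subject count.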
74108
75- ### Phase 1: MM-Fi Loader
109+ ### Phase 2: Wi-Pose Loader
76110
77- Implement a ` PyTorch Dataset ` class that:
78- - Reads MM-Fi's HDF5/numpy CSI files
79- - Resamples from 114 subcarriers → 56 subcarriers (linear interpolation along
80- frequency axis) to match the existing ` CSIProcessor ` config
81- - Normalizes amplitude and unwraps phase using the existing ` PhaseSanitizer `
82- - Returns ` (amplitude, phase, keypoints_17) ` tuples
111+ Implement ` WiPoseDataset ` reading .mat files (via ndarray-based MATLAB reader or
112+ pre-converted .npy). Subcarrier interpolation: 30 → 56 (zero-pad high frequencies
113+ rather than interpolate, since 30-sub Intel data has different spectral occupancy
114+ than 56-sub Atheros data).
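
A minimal sketch of the zero-padding choice described above. The function name `zero_pad_high` is hypothetical, and the sketch assumes subcarriers are stored in ascending frequency order, so that "high frequencies" correspond to the tail of the vector.

```rust
/// Extend a per-antenna-pair subcarrier vector to `target_len` entries by
/// appending zeros at the high-frequency end. The measured values are kept
/// unchanged; no values are interpolated.
fn zero_pad_high(input: &[f32], target_len: usize) -> Vec<f32> {
    assert!(input.len() <= target_len, "input is longer than the target");
    let mut out = Vec::with_capacity(target_len);
    out.extend_from_slice(input);
    out.resize(target_len, 0.0);
    out
}
```

For Wi-Pose this would be called with a 30-element slice and `target_len = 56`, leaving 26 trailing zeros per antenna pair.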
83115
84- ### Phase 2: Teacher-Student Labels
116+ ### Phase 3: Teacher-Student DensePose Labels
85117
86- For samples where only skeleton keypoints are available (not full DensePose UV maps):
87- - Run Detectron2 DensePose on the paired RGB frames to generate `(part_labels,
88- u_coords, v_coords)` pseudo-labels
89- - Cache generated labels to avoid recomputation during training epochs
90- - This matches the training procedure in the original CMU paper
118+ For MM-Fi samples that provide 3D keypoints but not full DensePose UV maps:
119+ - Run Detectron2 DensePose on paired RGB frames to generate ` (part_labels, u_coords, v_coords) `
120+ - Cache generated labels as .npy alongside original data
121+ - This matches the training procedure in the CMU paper exactly
91122
92- ### Phase 3: Training Pipeline
123+ ### Phase 4: Training Pipeline (Rust)
93124
94- - **Loss:** Combined keypoint heatmap loss (MSE) + DensePose part classification
95- (cross-entropy) + UV regression (Smooth L1) + transfer loss against teacher
96- RGB backbone features
97- - **Optimizer:** Adam, lr=1e-3, milestones at 48k and 96k steps (paper schedule)
125+ - **Model:** `WiFiDensePoseModel` (tch-rs, `crates/wifi-densepose-train/src/model.rs`)
126+ - **Loss:** Keypoint heatmap (MSE) + DensePose part (cross-entropy) + UV (Smooth L1) + transfer (MSE)
127+ - **Metrics:** PCK@0.2 + OKS with Hungarian min-cost assignment (`crates/wifi-densepose-train/src/metrics.rs`)
128+ - **Optimizer:** Adam, lr=1e-3, step decay at epochs 40 and 80
98129- **Hardware:** Single GPU (RTX 3090 or A100); MM-Fi fits in ~50 GB disk
99130- **Checkpointing:** Save every epoch; keep best-by-validation-PCK
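
The PCK metric at a 0.2 threshold, which drives the best-checkpoint selection above, can be sketched for a single person as follows. This is an illustration under stated assumptions, not the contents of `metrics.rs`: the names `Point`, `dist`, and `pck` are hypothetical, keypoints are 2D, the torso size is supplied by the caller, and invisible keypoints are excluded from the denominator.

```rust
/// One 2D keypoint as (x, y). Hypothetical type for illustration.
type Point = [f32; 2];

/// Euclidean distance between two 2D points.
fn dist(a: Point, b: Point) -> f32 {
    ((a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)).sqrt()
}

/// Percentage of Correct Keypoints: the fraction of visible keypoints whose
/// prediction lies within `alpha * torso_size` of the ground truth.
/// Returns `None` when no keypoint is visible.
fn pck(pred: &[Point], gt: &[Point], visible: &[bool], torso_size: f32, alpha: f32) -> Option<f32> {
    assert_eq!(pred.len(), gt.len());
    assert_eq!(gt.len(), visible.len());
    let threshold = alpha * torso_size;
    let mut correct = 0usize;
    let mut total = 0usize;
    for i in 0..gt.len() {
        if !visible[i] {
            continue; // unlabeled or occluded keypoints are not scored
        }
        total += 1;
        if dist(pred[i], gt[i]) <= threshold {
            correct += 1;
        }
    }
    if total == 0 {
        None
    } else {
        Some(correct as f32 / total as f32)
    }
}
```

Calling `pck(pred, gt, visible, torso_size, 0.2)` gives the per-person score; a dataset-level figure would average this over all validation samples.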
100131
101- ### Phase 4: Evaluation
132+ ### Phase 5: Proof Verification
102133
103- - **Keypoints:** PCK@0.2 (Percentage of Correct Keypoints within 20% of torso size)
104- - **DensePose:** GPS (Geodesic Point Similarity) and GPSM with segmentation mask
105- - **Held-out split:** MM-Fi subjects 33-40 (20%) for validation; no test-set leakage
134+ ` verify-training ` binary provides the "trust kill switch" for training:
135+ - Fixed seed (MODEL_SEED=0, PROOF_SEED=42)
136+ - 50 training steps on deterministic SyntheticDataset
137+ - Verifies: loss decreases + SHA-256 of final weights matches stored hash
138+ - EXIT 0 = PASS, EXIT 1 = FAIL, EXIT 2 = SKIP (no stored hash)
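
The pass/fail/skip decision listed above can be sketched as a small pure function. This is not the `verify-training` source: `Verdict` and `verify` are hypothetical names, the SHA-256 digest is assumed to be computed elsewhere and passed in as a hex string, "loss decreases" is interpreted as the final loss being lower than the first, and a missing stored hash is assumed to yield SKIP regardless of the loss curve.

```rust
/// Outcome of the proof run, mapped to the exit codes listed above.
#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Pass,
    Fail,
    Skip,
}

impl Verdict {
    /// Process exit code: 0 = PASS, 1 = FAIL, 2 = SKIP.
    fn exit_code(&self) -> i32 {
        match self {
            Verdict::Pass => 0,
            Verdict::Fail => 1,
            Verdict::Skip => 2,
        }
    }
}

/// Decide the verdict from the recorded losses and the weight hashes.
/// `weights_hash` is the hex digest of the final weights (computed elsewhere).
fn verify(losses: &[f32], weights_hash: &str, stored_hash: Option<&str>) -> Verdict {
    let stored = match stored_hash {
        Some(h) => h,
        None => return Verdict::Skip, // assumption: nothing to compare against
    };
    let loss_decreased = match (losses.first(), losses.last()) {
        (Some(first), Some(last)) => losses.len() > 1 && *last < *first,
        _ => false,
    };
    if loss_decreased && weights_hash.eq_ignore_ascii_case(stored) {
        Verdict::Pass
    } else {
        Verdict::Fail
    }
}
```

A binary would end with `std::process::exit(verdict.exit_code())` so that CI can distinguish a skipped proof from a failed one.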
106139
107140## Subcarrier Mismatch: MM-Fi (114) vs System (56)
108141
109- MM-Fi captures 114 subcarriers at 5 GHz with 40 MHz bandwidth. The existing system
110- is configured for 56 subcarriers. Resolution options in order of preference:
142+ MM-Fi captures 114 subcarriers at 5 GHz with 40 MHz bandwidth (Atheros CSI Tool).
143+ The system is configured for 56 subcarriers (Atheros, 20 MHz). Resolution options:
111144
112- 1. **Interpolate MM-Fi → 56** (recommended for initial training): linear interpolation
113- preserves spectral envelope, fast, no architecture change needed
114- 2. **Reconfigure system → 114**: change `CSIProcessor` config; requires re-running
115- `verify.py --generate-hash` to update proof hash
116- 3. **Train at native 114, serve at 56**: separate train/inference configs; adds
117- complexity
145+ 1. **Interpolate MM-Fi → 56** (chosen for Phase 1): linear interpolation preserves
146+ spectral envelope, fast, no architecture change needed
147+ 2. **Train at native 114**: change `CSIProcessor` config; requires re-running
148+ `verify.py --generate-hash` to update proof hash; future option
149+ 3. **Collect native 56-sub data**: ESP32 mesh at 20 MHz; best for production
118150
119- Option 1 is chosen for Phase 1 to unblock training immediately.
151+ Option 1 unblocks training immediately. The Rust ` subcarrier.rs ` module handles
152+ interpolation as a first-class operation with tests proving correctness.
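
Option 1 can be illustrated with a short linear-interpolation sketch. It does not reproduce the `subcarrier.rs` API: `resample_linear` is a hypothetical name, and the sketch assumes the first and last input subcarriers map onto the first and last output subcarriers, with evenly spaced positions in between. It is meant for amplitude or already unwrapped phase; interpolating wrapped phase directly would smear the 2π jumps.

```rust
/// Resample one subcarrier vector to `target_len` points by linear
/// interpolation along the frequency axis. Input endpoints are preserved.
fn resample_linear(input: &[f32], target_len: usize) -> Vec<f32> {
    assert!(!input.is_empty() && target_len > 0);
    if input.len() == 1 || target_len == 1 {
        return vec![input[0]; target_len];
    }
    // Step between output samples, measured in input-index units.
    let scale = (input.len() - 1) as f32 / (target_len - 1) as f32;
    (0..target_len)
        .map(|i| {
            let pos = i as f32 * scale;
            // Clamp so that `lo + 1` stays in bounds at the last sample.
            let lo = (pos.floor() as usize).min(input.len() - 2);
            let frac = pos - lo as f32;
            input[lo] * (1.0 - frac) + input[lo + 1] * frac
        })
        .collect()
}
```

For MM-Fi this would be applied per antenna pair and per time frame with a 114-element input and `target_len = 56`. As a small check, `resample_linear(&[0.0, 1.0, 2.0, 3.0, 4.0], 3)` returns `[0.0, 2.0, 4.0]`.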
120153
121154## Consequences
122155
123156**Positive:**
124- - Unblocks end-to-end training without hardware collection
125- - MM-Fi's 3×3 antenna setup matches this system's target hardware (ESP32 mesh, ADR-012)
126- - 40 subjects with 27 action classes provides reasonable diversity for a first model
157+ - Unblocks end-to-end training on real public data immediately
158+ - MM-Fi's Atheros hardware family matches target system (same CSI Tool)
159+ - 40 subjects × 27 actions provides reasonable diversity for first model
160+ - Wi-Pose's 3×3 antenna setup is an exact hardware match for ESP32 mesh
127161- CC BY-NC license is compatible with research and internal use
162+ - Rust implementation integrates natively with ` wifi-densepose-signal ` pipeline
128163
129164**Negative:**
130165- CC BY-NC prohibits commercial deployment of weights trained solely on MM-Fi;
131166 custom data collection required before commercial release
132- - 114→56 subcarrier interpolation loses some frequency resolution; acceptable for
133- initial training, revisit in Phase 2
134- - MM-Fi was captured in controlled lab environments; expect accuracy drop in
135- complex real-world deployments until fine-tuned on domain-specific data
167+ - MM-Fi is 1 TX / 3 RX; system targets 3 TX / 3 RX; fine-tuning needed
168+ - 114→56 subcarrier interpolation loses frequency resolution; acceptable for v1
169+ - MM-Fi captured in controlled lab environments; real-world accuracy will be lower
170+ until fine-tuned on domain-specific data
136171
137172## References
138173
139- - He et al., "MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset" (NeurIPS 2023)
140- - Yang et al., "DensePose From WiFi" (arXiv 2301.00250, CMU 2023)
174+ - Yang et al., "MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset" (NeurIPS 2023) — arXiv:2305.10345
175+ - Geng et al., "DensePose From WiFi" (CMU, arXiv:2301.00250, 2023)
176+ - Yan et al., "Person-in-WiFi 3D" (CVPR 2024)
177+ - NjtechCVLab, "Wi-Pose Dataset" — github.com/NjtechCVLab/Wi-PoseDataset
141178- ADR-012: ESP32 CSI Sensor Mesh (hardware target)
142179- ADR-013: Feature-Level Sensing on Commodity Gear
143180- ADR-014: SOTA Signal Processing Algorithms