Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Commit f995f69

Browse files
committed
docs: update ADRs with ENOMEM crash fix proof (Issue ruvnet#127)
- ADR-018: Document rate-limiting and ENOMEM backoff safeguards in firmware - ADR-029: Add note about rate-limiting requirement for channel hopping, mark lwIP pbuf exhaustion risk as resolved - ADR-039: Add finding ruvnet#5 documenting the sendto ENOMEM crash and fix (947 KB binary, hardware-verified 200+ callbacks with zero errors) - CHANGELOG: Add entries for Issue ruvnet#127 fix and Issue ruvnet#130 provisioning fix Co-Authored-By: claude-flow <[email protected]>
1 parent ce17169 commit f995f69

4 files changed

Lines changed: 14 additions & 0 deletions

File tree

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
1414
- Training control: `GET /api/v1/train/status`, `POST /api/v1/train/start`, `POST /api/v1/train/stop`
1515
- Recording writes CSI frames to `.jsonl` files via tokio background task
1616
- Model/recording directories scanned at startup, state managed via `Arc<RwLock<AppStateInner>>`
17+
- **ADR-044: Provisioning tool enhancements** — 5-phase plan for complete NVS coverage (7 missing keys), JSON config files, mesh presets, read-back/verify, and auto-detect
1718
- **25 real mobile tests** replacing `it.todo()` placeholders — 205 assertions covering components, services, stores, hooks, screens, and utils
1819
- **Project MERIDIAN (ADR-027)** — Cross-environment domain generalization for WiFi pose estimation (1,858 lines, 72 tests)
1920
- `HardwareNormalizer` — Catmull-Rom cubic interpolation resamples any hardware CSI to canonical 56 subcarriers; z-score + phase sanitization
@@ -30,6 +31,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
3031
- ADR-025: macOS CoreWLAN WiFi Sensing (ORCA)
3132

3233
### Fixed
34+
- **sendto ENOMEM crash (Issue #127)** — CSI callbacks in promiscuous mode exhaust lwIP pbuf pool causing guru meditation crash. Fixed with 50 Hz rate limiter in `csi_collector.c` and 100 ms ENOMEM backoff in `stream_sender.c`. Hardware-verified on ESP32-S3 (200+ callbacks, zero crashes)
35+
- **Provisioning script missing TDM/edge flags (Issue #130)** — Added `--tdm-slot`, `--tdm-total`, `--edge-tier`, `--pres-thresh`, `--fall-thresh`, `--vital-win`, `--vital-int`, `--subk-count` to `provision.py`
3336
- **WebSocket "RECONNECTING" on Dashboard/Live Demo**`sensingService.start()` now called on app init in `app.js` so WebSocket connects immediately instead of waiting for Sensing tab visit
3437
- **Mobile WebSocket port**`ws.service.ts` `buildWsUrl()` uses same-origin port instead of hardcoded port 3001
3538
- **Mobile Jest config**`testPathIgnorePatterns` no longer silently ignores the entire test directory

docs/adr/ADR-018-esp32-dev-implementation.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -96,6 +96,13 @@ static void csi_data_callback(void *ctx, wifi_csi_info_t *info) {
9696
9797
**No on-device FFT** (contradicting ADR-012's optional feature extraction path): The Rust aggregator will do feature extraction using the SOTA `wifi-densepose-signal` pipeline. Raw I/Q is cheaper to stream at ESP32 sampling rates (~100 Hz at 56 subcarriers = ~35 KB/s per node).
9898
99+
**Rate-limiting and ENOMEM backoff** (Issue #127 fix):
100+
101+
CSI callbacks fire 100-500+ times/sec in promiscuous mode. Two safeguards prevent lwIP pbuf exhaustion:
102+
103+
1. **50 Hz rate limiter** (`csi_collector.c`): `sendto()` is skipped if less than 20 ms have elapsed since the last successful send. Excess CSI callbacks are dropped silently.
104+
2. **ENOMEM backoff** (`stream_sender.c`): When `sendto()` returns `ENOMEM` (errno 12), all sends are suppressed for 100 ms to let lwIP reclaim packet buffers. Without this, rapid-fire failed sends cause a guru meditation crash.
105+
99106
**`sdkconfig.defaults`** must enable:
100107
101108
```

docs/adr/ADR-029-ruvsense-multistatic-sensing-mode.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -74,6 +74,8 @@ static uint32_t s_dwell_ms = 50; // 50ms per channel
7474
7575
At 100 Hz raw CSI rate with 50 ms dwell across 3 channels, each channel yields ~33 frames/second. The existing ADR-018 binary frame format already carries `channel_freq_mhz` at offset 8, so no wire format change is needed.
7676
77+
> **Note (Issue #127 fix):** In promiscuous mode, CSI callbacks fire 100-500+ times/sec — far exceeding the channel dwell rate. The firmware now rate-limits UDP sends to 50 Hz and applies a 100 ms ENOMEM backoff if lwIP buffers are exhausted. This is essential for stable channel hopping under load.
78+
7779
**NDP frame injection:** `esp_wifi_80211_tx()` injects deterministic Null Data Packet frames (preamble-only, no payload, ~24 us airtime) at GPIO-triggered intervals. This is sensing-first: the primary RF emission purpose is CSI measurement, not data communication.
7880
7981
### 2.3 Multi-Band Frame Fusion
@@ -364,6 +366,7 @@ No new workspace dependencies. All ruvector crates are already in the workspace
364366
| Risk | Probability | Impact | Mitigation |
365367
|------|-------------|--------|------------|
366368
| ESP32 channel hop causes CSI gaps | Medium | Reduced effective rate | Measure gap duration; increase dwell if >5ms |
369+
| CSI callback rate exhausts lwIP pbufs | **Resolved** | Guru meditation crash | 50 Hz rate limiter + 100 ms ENOMEM backoff (Issue #127, PR #132) |
367370
| 5 GHz CSI unavailable on S3 | High | Lose frequency diversity | Fallback: 3-channel 2.4 GHz still provides 3x BW; ESP32-C6 for dual-band |
368371
| Model inference >40ms | Medium | Miss 20 Hz target | Run model at 10 Hz; Kalman predict at 20 Hz interpolates |
369372
| Two-person separation fails at 3 nodes | Low | Identity swaps | AETHER re-ID recovers; increase to 4-6 nodes |

docs/adr/ADR-039-esp32-edge-intelligence.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -208,3 +208,4 @@ Measured on ESP32-S3 (QFN56 rev v0.2, 8 MB flash, 160 MHz, ESP-IDF v5.2).
208208
2. **No PSRAM on test board** — WASM arena falls back to internal heap. Boards with PSRAM would support larger modules.
209209
3. **CSI rate exceeds spec** — measured 28.5 Hz vs. expected ~20 Hz. Performance headroom is better than estimated.
210210
4. **WiFi-to-Ethernet isolation** — some routers block UDP between WiFi and wired clients. Recommend same-subnet verification in deployment guide.
211+
5. **sendto ENOMEM crash (Issue #127)** — CSI callbacks in promiscuous mode fire 100-500+ times/sec, exhausting the lwIP pbuf pool and causing a guru meditation crash. Fixed with a dual approach: 50 Hz rate limiter in `csi_collector.c` (20 ms minimum send interval) and a 100 ms ENOMEM backoff in `stream_sender.c`. Binary size with fix: 947 KB. Hardware-verified stable for 200+ CSI callbacks with zero ENOMEM errors.

0 commit comments

Comments
 (0)