Surface unreachable bulbs as "Not Responding" via HAP #173

Merged
kpsuperplane merged 2 commits into kpsuperplane:master from dhananjaysathe:offline-detection
Apr 27, 2026

@dhananjaysathe
Contributor

Summary

Adds an opt-in reportOffline flag (default: false) that marks bulbs as "Not Responding" in HomeKit once they miss offlineThreshold consecutive getPilot replies, rather than silently replaying the last known cached state. Reachability is cleared on any successful getPilot reply or any registration / getSystemConfig response from the same MAC, so bulbs that come back online recover immediately.

Leaves existing behaviour unchanged when the flag is left at its default.

Background

Today, when a bulb stops responding to UDP getPilot requests, src/accessories/WizLight/pilot.ts:152-158 (and the same pattern in WizSocket/pilot.ts:72-78) returns the cached pilot:

const timeout = setTimeout(() => {
  if (device.mac in cachedPilot) {
    onDone(null, cachedPilot[device.mac]);   // replays stale state
  } else {
    onDone(new Error("No response within 1s"), undefined as any);
  }
}, 1000);

The original rationale (#48, 2021) was that HAP couldn't surface "Not Responding" at the time — but HomeKit does now honour an Error returned from a characteristic get callback by showing the tile as Not Responding, which is what users expect. The error plumbing that #41 already threaded through util/network.ts makes surfacing this trivial.

What this changes

  • New src/util/reachability.ts — tiny pure module tracking consecutive miss counts per MAC. recordHit / recordMiss / isOffline(mac, threshold).
  • WizLight/pilot.ts + WizSocket/pilot.ts: on getPilot timeout, record a miss. If reportOffline is on and the miss count hits offlineThreshold, surface the error up the existing onError callback instead of returning cached state. On any successful reply, clear the miss count.
  • util/network.ts: on any incoming registration or getSystemConfig reply, clear the miss count for that MAC (bulb is demonstrably alive).
  • wiz.ts::initRefreshInterval: when refreshInterval > 0, also re-broadcast discovery on each tick so bulbs returning to the network (or acquiring a new DHCP lease) re-announce themselves; otherwise device.ip can stay stale indefinitely. Refresh-loop failures log at debug instead of error to avoid spamming the log once per tick per offline bulb (matching the convention established by #141, "Change log level of pings to debug").
  • config.schema.json + README.md: new reportOffline: boolean (default false) and offlineThreshold: integer (default 3).
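
For illustration, the miss-counting and timeout decision described in the list above might look roughly like the sketch below. Only recordHit, recordMiss, isOffline, reportOffline and offlineThreshold are names taken from this PR; resolveTimeout, OfflineConfig and TimeoutResult are hypothetical helpers introduced here, and the real module may be structured differently.

```typescript
// Sketch only: per-MAC count of consecutive getPilot misses.
const missCounts = new Map<string, number>();

/** Any reply from this MAC proves it is alive, so reset its count. */
function recordHit(mac: string): void {
  missCounts.delete(mac);
}

/** A getPilot request timed out: bump the count and return the new value. */
function recordMiss(mac: string): number {
  const next = (missCounts.get(mac) ?? 0) + 1;
  missCounts.set(mac, next);
  return next;
}

/** True once the MAC has missed at least `threshold` replies in a row. */
function isOffline(mac: string, threshold: number): boolean {
  return (missCounts.get(mac) ?? 0) >= threshold;
}

// Hypothetical shapes, not part of the PR.
interface OfflineConfig {
  reportOffline: boolean;
  offlineThreshold: number;
}

type TimeoutResult<P> =
  | { kind: "cached"; pilot: P }
  | { kind: "error"; error: Error };

/** Decide what a getPilot timeout should report for one device. */
function resolveTimeout<P>(
  mac: string,
  cached: P | undefined,
  config: OfflineConfig,
): TimeoutResult<P> {
  const misses = recordMiss(mac);
  if (config.reportOffline && isOffline(mac, config.offlineThreshold)) {
    // Opt-in path: surface an error so HomeKit can show "Not Responding".
    return {
      kind: "error",
      error: new Error(`${mac} unreachable after ${misses} consecutive misses`),
    };
  }
  if (cached !== undefined) {
    // Pre-existing behaviour: replay the last known state.
    return { kind: "cached", pilot: cached };
  }
  return { kind: "error", error: new Error("No response within 1s") };
}
```

With reportOffline left at false, resolveTimeout never takes the opt-in branch, which mirrors the claim that default behaviour stays unchanged.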

Pairs naturally with the refreshInterval polling added in #128 — background pings detect unreachable bulbs without waiting for HomeKit to read them.

Log hygiene

Because discovery now re-broadcasts every tick, sendDiscoveryBroadcast was also downgraded to debug. Similarly, tryAddDevice's "Updating accessory:" line now logs at info only when the display name actually changed; otherwise debug. Without these, a typical 7-bulb / 30s-refresh setup would produce ~22 info-level lines per minute from our changes alone. Same rationale as #141.
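
As a rough sketch of the rename-only logging rule, assuming a minimal logger with info and debug methods (LeveledLog and logAccessoryUpdate are hypothetical names, not the plugin's actual code):

```typescript
// Hypothetical minimal logger shape; Homebridge's real logger has more methods.
interface LeveledLog {
  info(message: string): void;
  debug(message: string): void;
}

/**
 * Log "Updating accessory:" at info only when the display name changed;
 * otherwise keep the periodic rediscovery quiet by logging at debug.
 */
function logAccessoryUpdate(
  log: LeveledLog,
  previousName: string,
  currentName: string,
): void {
  const message = `Updating accessory: ${currentName}`;
  if (previousName !== currentName) {
    log.info(message);
  } else {
    log.debug(message);
  }
}
```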

Bundled defensive fixes in the same code region

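Two narrow undefined-handling fixes in pilot.ts that live in the exact lines this PR touches and address recurring issues:

  • Filter undefined fields before merging into cachedPilot, so firmware replies that omit dimming (e.g. sceneId 14 "nightlight") don't produce NaN Brightness in HomeKit.
  • pilotToColor() falls back to a neutral-white reading when passed an empty cache entry, so temperature / color set handlers invoked after a timeout wiped the cache don't crash with "Cannot read properties of undefined (reading 'temp')".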
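
The first of these fixes, dropping undefined fields before merging a reply into cachedPilot (as the commit message describes it), could be sketched as follows. The Pilot field list and the helper names dropUndefined and mergePilot are assumptions made for illustration; the actual change in pilot.ts may differ.

```typescript
// Assumed subset of pilot fields, for illustration only.
type Pilot = {
  state?: boolean;
  dimming?: number;
  temp?: number;
  sceneId?: number;
};

/** Copy only the fields whose value is not undefined. */
function dropUndefined(pilot: Pilot): Pilot {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(pilot)) {
    if (value !== undefined) {
      out[key] = value;
    }
  }
  return out as Pilot;
}

/**
 * Merge a fresh reply into the cached pilot without letting omitted
 * fields (present as undefined) overwrite previously known values.
 */
function mergePilot(cached: Pilot | undefined, incoming: Pilot): Pilot {
  return { ...cached, ...dropUndefined(incoming) };
}
```

A plain `{ ...cached, ...incoming }` would copy `dimming: undefined` over the cached value, which is the kind of gap that can later turn into NaN when converted to a brightness.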

Issues addressed

Primary (silent-stale-state pattern): #135, #140, #105, #87, #97, #160, #167

Bundled (NaN / undefined-temp crashes in the same code region): #96, #101, #143, #145, #159

Compatibility

  • Default behaviour is unchanged — the flag is opt-in.
  • No schema changes that affect existing config.
  • No new dependencies.
  • Error plumbing reuses the signature introduced by #41 ("Improve network response").
  • Diff: +139 / -10 across 8 files.

Test plan

  • npm run build clean (no TS errors on Node 25).
  • Deployed to a Homebridge container with 6 real Wiz bulbs (4 online, 2 on a cut-power wall switch). reportOffline: true, refreshInterval: 30.
  • tcpdump across a 3-minute window: online bulbs received broadcasts every 30s and replied every 30s; offline bulbs received broadcasts but never replied.
  • Online bulbs remain reachable in HomeKit with no log output; offline bulbs surface as "Not Responding" within offlineThreshold × refreshInterval (3 × 30s = 90s with the default threshold and this setup's refreshInterval); powering them back on restores state on the next refresh tick.
  • With reportOffline: false (default), behaviour is byte-for-byte the same as before — confirmed by diffing log output against pre-PR build.
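
The detection window quoted in the test plan can be checked with a small simulation of the refresh loop. This is a simplified model that assumes one getPilot attempt per tick and ignores the 1s reply timeout and any HomeKit-initiated reads; firstOfflineTick is a hypothetical helper, not part of the PR.

```typescript
/**
 * Given one entry per refresh tick (true = bulb replied), return the
 * 1-based tick at which the bulb is first reported offline, or -1 if
 * it never reaches `offlineThreshold` consecutive misses.
 */
function firstOfflineTick(replies: boolean[], offlineThreshold: number): number {
  let misses = 0;
  for (let tick = 0; tick < replies.length; tick++) {
    misses = replies[tick] ? 0 : misses + 1; // any reply resets the streak
    if (misses >= offlineThreshold) {
      return tick + 1;
    }
  }
  return -1;
}

// Bulb on a cut-power switch: three straight misses at a 30s interval.
const refreshIntervalSeconds = 30;
const offlineAtTick = firstOfflineTick([false, false, false], 3);
const secondsUntilOffline = offlineAtTick * refreshIntervalSeconds; // 3 * 30 = 90
```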

🤖 Generated with Claude Code

dhananjaysathe and others added 2 commits April 17, 2026 16:03
Adds an opt-in `reportOffline` flag (default: false) that marks bulbs as
"Not Responding" in HomeKit once they miss `offlineThreshold` consecutive
getPilot replies, rather than silently replaying the last known cached
state. Reachability is cleared on any successful getPilot reply or any
registration / getSystemConfig response from the same MAC, so bulbs that
come back online recover immediately.

Default behaviour is preserved when the flag is left off.

When refreshInterval > 0, the refresh loop now also re-broadcasts discovery
on each tick so bulbs returning to the network (or acquiring a new DHCP
lease) are re-learned automatically. Refresh-loop failures log at debug
instead of error to match the convention established by kpsuperplane#141 and avoid
spamming the log every tick for a single offline bulb.

Also bundles two narrow defensive fixes in the same code region:

- Filter `undefined` fields before merging into `cachedPilot`, so firmware
  replies that omit `dimming` (e.g. sceneId 14 "nightlight") don't produce
  NaN Brightness in HomeKit — addresses kpsuperplane#96 / kpsuperplane#101 / kpsuperplane#143 / kpsuperplane#159.

- `pilotToColor()` falls back to a neutral-white reading when passed an
  empty cache entry, so temperature / color set handlers invoked after a
  timeout wiped the cache don't crash with "Cannot read properties of
  undefined (reading 'temp')" — addresses kpsuperplane#145.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

With refreshInterval > 0 the refresh loop now re-broadcasts discovery on
each tick, which used to log one info-level line per listed device per
tick (7 lines / 30s in a typical setup). Two follow-ups to keep this
silent on the happy path:

- `sendDiscoveryBroadcast` logs its broadcasts at debug instead of info,
  matching the refresh-ping convention established by kpsuperplane#141.
- `tryAddDevice` now logs `Updating accessory:` at info only when the
  display name actually changed (rename path); otherwise it emits the
  message at debug. Before this, every already-registered device would
  log a line on every rediscovery even though nothing changed.

No behavioural change to the state-tracking itself — just log-level
hygiene so the periodic discovery doesn't become a log firehose.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

@kpsuperplane kpsuperplane merged commit c6b0f69 into kpsuperplane:master Apr 27, 2026
@kpsuperplane
Owner

Thank you!

kpsuperplane added a commit that referenced this pull request Apr 27, 2026
Documents the reportOffline feature and bundled pilot-state bug fixes
from dhananjaysathe's contribution.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>