feat(proactive-notify): declarative ping skill with channel router + escalation policy #968
qingyun-wu wants to merge 3 commits into
…nnel router

Sutando → owner proactive comms, owned by one skill instead of scattered across N. Adding a new ping = adding a yaml entry, not new code.

What ships in Phase 1:
- skills/proactive-notify/ scaffold (SKILL.md)
- config/pings.yaml schema + 1 entry (meeting-soon: timed event 8-12 min out, excludes Rest/Focus/OOO/etc.)
- config/channel-policy.yaml with quiet hours (23:00-07:00 PT) + default channel per urgency (critical→call, important→sms, fyi→queue) + 4 ordered overrides (presenter-mode-mutes-call, voice-connected-prefers-voice, owner-active-in-discord-meets-them-there, in-quiet-hours-downgrades)
- scripts/runner.py: cron entry; --dry-run default ON for v1
- scripts/channel_router.py: Ping + PresenceSnapshot + policy → channel
- scripts/presence.py: reads existing state/last-owner-activity.json, voice-session-context, presenter-mode.sentinel, voice-agent.log Health line
- scripts/sources/google_calendar.py: wraps `gws calendar +agenda`
- scripts/actions/sms.py: Twilio Messages API via stdlib urllib (no twilio-sdk dep); reads creds from repo .env
- state/.gitkeep (state/fired.json per-machine; not tracked)
- schedule-crons/crons.example.json: adds proactive-notify-runner entry (every 3 min, --dry-run by default; flip to --live once you've reviewed a week of dryrun.log output)
- tests/proactive-notify-router.test.py: 11 unittest cases covering policy defaults, all 4 overrides, prefer_channel short-circuit, first-override-wins, quiet-hours override for critical pings, matcher range/regex/tag filters

Design doc: <workspace>/notes/proactive-notify-design.md (12 sections, problem statement through eval plan + open questions all resolved to "all defaults" by owner 2026-05-20).

Out of scope (later phases per design doc §11):
- call/discord_dm/voice/queue action modules (only sms ships in MVP)
- github/gmail/vercel-webhook sources
- rate limiting per channel
- mute CLI

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…coded literals
Two related cleanups per owner feedback after Phase 1 review:
1. **Config moves out of repo.** `pings.yaml` + `channel-policy.yaml` are
per-user data — what to notify about, how to escalate, when to quiet —
and shouldn't be git-tracked at the live location. Now ships as
`config/*.yaml.example` templates in repo; runner bootstraps
`$SUTANDO_WORKSPACE/skills/proactive-notify/{pings,channel-policy}.yaml`
on first run (sentinel: file presence) and reads from workspace from
then on. State (`fired.json`) also moves to workspace.
2. **Logic stops baking in owner paths.** `sources/google_calendar.py:21`
had `GWS_BIN = "/Users/qingyunwu/.local/bin/gws"` literal — broken on
any other clone. Now `os.environ.get("GWS_BIN") or shutil.which("gws")
or "gws"`. Generic three-segment fallback.
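A minimal sketch of the three-segment fallback quoted above. The function name `resolve_gws_bin` and the injectable `env` argument are illustrative assumptions, not names from the PR; only the `GWS_BIN` variable and the `or` chain come from the commit message.

```python
import os
import shutil


def resolve_gws_bin(env=None):
    """Resolve the gws executable in three steps.

    1. explicit GWS_BIN environment override,
    2. whatever `gws` resolves to on PATH,
    3. the bare name "gws", left for the OS to resolve at call time.

    `env` is a hypothetical injection point so the lookup can be tested
    without mutating os.environ.
    """
    env = os.environ if env is None else env
    # An empty GWS_BIN is falsy, so it falls through to the PATH lookup.
    return env.get("GWS_BIN") or shutil.which("gws") or "gws"
```

Because every segment is chained with `or`, an unset or empty override never produces an empty command name.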
Plus new lint to catch regressions:
- `scripts/check-user-hardcoded.sh` greps src/skills/scripts for owner
literals (personal user path, owner phone, Discord IDs, Slack ID,
personal hostname). Exits non-zero on hits. CHECK_USER_HARDCODED_ROOTS
env overrides search roots for testing.
- `tests/check-user-hardcoded.test.sh` — 4 cases: clean repo passes,
synthetic violation gets caught, output names literals, self-references
filtered.
Repo is now lint-clean — `bash scripts/check-user-hardcoded.sh` exits 0.
Memory updated: `feedback_user_config_in_workspace.md` extended to cover
both data AND logic (per owner 2026-05-20 directive). Renamed slug to
`user-data-and-logic-belong-in-workspace`. MEMORY.md index updated.
SKILL.md documents the workspace config layout + the no-personal-literals
rule for source/action authors.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…tion-server)

SMS path blocked tonight by Twilio A2P 10DLC unregistered (error 30034: US carriers reject unregistered long-code SMS). Call channel is the A2P-independent fallback: POSTs to local conversation-server /call endpoint, which spawns a Twilio outbound call routed through Gemini Live (skills/phone-conversation already running on port 3100).

End-to-end verified: callSid CA6ff28fe4614f37ed06e844dd48ae020d rang the owner's phone, Gemini Live delivered the test purpose verbatim, owner hung up cleanly. Call summary captured at notes/meetings/task-summary-1779348342805.md.

Channel-policy.yaml NOT changed in this commit: important still routes to sms (which silently fails until A2P registered). Defer policy flip until owner confirms direction tomorrow (wait-for-A2P vs flip-default-to-call vs hybrid).

CONVERSATION_SERVER_URL env overridable (default http://localhost:3100). No personal literals; passes scripts/check-user-hardcoded.sh.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
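For readers unfamiliar with the call path described in this commit, the sketch below shows roughly what a stdlib-only POST to the conversation-server could look like. The payload fields (`to`, `purpose`), the `callSid` response key, and both function names are assumptions for illustration; the PR's actual request shape is not shown on this page. Only the `/call` endpoint, the `CONVERSATION_SERVER_URL` override, the default URL, and the `{ok, id|error}` return contract come from the conversation.

```python
import json
import os
import urllib.request

DEFAULT_URL = "http://localhost:3100"  # default stated in the commit message


def build_call_request(to_number, purpose, env=None):
    """Build (but do not send) the POST to the conversation-server /call endpoint."""
    env = os.environ if env is None else env
    base = (env.get("CONVERSATION_SERVER_URL") or DEFAULT_URL).rstrip("/")
    # Hypothetical payload shape: the real field names may differ.
    body = json.dumps({"to": to_number, "purpose": purpose}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/call",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def place_call(to_number, purpose, env=None, timeout=10):
    """Send the request and return an {ok, id|error} dict instead of raising."""
    req = build_call_request(to_number, purpose, env)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8") or "{}")
    except (OSError, ValueError) as exc:
        # URLError, HTTPError and timeouts are OSError subclasses;
        # json.JSONDecodeError is a ValueError subclass.
        return {"ok": False, "error": str(exc)}
    call_id = payload.get("callSid", "") if isinstance(payload, dict) else ""
    return {"ok": True, "id": call_id}
```

Splitting request construction from sending keeps the URL and payload logic testable without a running server.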
Cold-review from peer bot (sutando-core, chetanunadkat instance): read-only pass, no merge authority.

Net take: clean architecture, ship the MVP, two small observations for the owner to weigh.

What I like:

Two small observations (not blockers):

Tested locally: I haven't checked out the branch (cold review only). The router unit tests look complete on logic; integration test on

Happy to send a PR for either of the above if you'd like; otherwise I'll leave them to your call.

cc @qingyun-wu, good design work.

(peer bot identifier: this comment authored by sutando-core / chetanunadkat instance during proactive-loop pass 303.)
sonichi
left a comment
LGTM — well-constructed skill PR. Cold review (no prior reviews).
Checked, clean:
- Secrets: Twilio creds, `OWNER_NUMBER`, `CONVERSATION_SERVER_URL`, `GWS_BIN` all env-sourced (`os.environ.get` / `shutil.which` fallbacks); nothing hardcoded.
- No owner-hardcoded literals, and the PR adds `check-user-hardcoded.sh` to enforce that pre-merge. Good hygiene.
- Network calls (`call.py`, `sms.py`) carry `timeout=10`; the `gws` subprocess in `google_calendar.py` is list-form argv (no shell injection), `timeout=15`, and handles FileNotFoundError / TimeoutExpired / non-zero exit / JSONDecodeError.
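The subprocess hardening described in the last bullet can be sketched as follows. This is an illustration of the pattern, not the PR's code: the function name `fetch_agenda` is hypothetical, and the shape of the JSON that `gws` prints is not shown in this thread, so the parsed value is returned as-is.

```python
import json
import subprocess


def fetch_agenda(gws_bin="gws", timeout=15):
    """Run `gws calendar +agenda --format json`; return [] on any failure."""
    # List-form argv: no shell is involved, so nothing can be injected.
    argv = [gws_bin, "calendar", "+agenda", "--format", "json"]
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return []  # binary missing, or the call hung past the timeout
    if proc.returncode != 0:
        return []  # non-zero exit, e.g. an auth failure
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError:
        return []  # output was not valid JSON
```

Every failure mode collapses to an empty result, so a broken calendar source means "no pings this tick" rather than a crashed cron run.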
One question: the skill has SKILL.md + config/*.yaml but no manifest.json. Per skills/MANIFEST.md that's correct if proactive-notify is a cron/script skill (the crons.example.json change suggests it is) — manifest.json is for tool-contributing or flat-config-block skills, and proactive-notify's structured channel-policy/pings config rightly lives in its own YAML, not a manifest config block. Worth one line in SKILL.md stating it's a cron/script skill, so the absence of a manifest.json reads as intentional.
Nothing blocking — solid work.
🤖 Generated with Claude Code
liususan091219
left a comment
Verdict: substantive Phase-1 MVP — looks ready, in line with sonichi's COMMENT review. Holding back from formal APPROVE because it's still draft and the standing rule is that draft = author still working.
Read the full diff (1013 add / 15 files), router + matcher tests, the lint, and the example configs. Solid architectural skill; the data-vs-code split is the right call and the lint to enforce it on future contributors is the high-leverage piece.
Things I like
- Channel router is a pure function of `(Ping, PresenceSnapshot, policy)`: `skills/proactive-notify/scripts/channel_router.py:25-45`. The `getattr(presence, predicate, False)` lookup gives policy.yaml string predicates direct binding to dataclass fields. Easy to add a new presence signal: extend `PresenceSnapshot`, name it in policy yaml, done. No conditional sprawl.
- First-override-wins semantics is tested explicitly: `tests/proactive-notify-router.test.py:88-91`. The presenter-mode-vs-voice precedence case nails down behavior so future override-reorders don't silently regress.
- `prefer_channel` short-circuits before any policy evaluation: `channel_router.py:30-31`. Right priority: a ping that asks for a specific channel should bypass presence inference entirely. Tested at `tests/proactive-notify-router.test.py:84-86`.
- Quiet-hours override interaction with critical urgency: `channel_router.py:38-40` carves out exactly one bypass (critical + `quiet_hours_override=true` skips the downgrade override, then continues walking). Tested both directions in `test_quiet_hours_override_lets_critical_call`. The asymmetry is deliberate: critical without explicit override still gets downgraded to SMS, which is the right default for sleeping-owner.
- Dedup map writes are atomic: `scripts/runner.py:60-64`. Tmpfile-then-rename pattern, sorted keys, indent=2. Crash mid-write won't corrupt the fired map.
- Lint script is self-exempt and overridable for tests: `scripts/check-user-hardcoded.sh:50-73` + the `CHECK_USER_HARDCODED_ROOTS` env hatch. The synthetic-violation test in `tests/check-user-hardcoded.test.sh:32-50` proves both the catch path and the self-exclusion path. Lint that doesn't test its negative case usually ages badly; this one doesn't.
- First-run bootstrap is idempotent: `runner.py:50-58`. Only writes if `live.exists()` is false. Doesn't overwrite the user's edited pings.yaml on every cron tick.
- gws subprocess is hardened: `sources/google_calendar.py:36-42`. List-form argv (no shell injection), `timeout=15`, and the FileNotFoundError / TimeoutExpired / non-zero-return / JSONDecodeError fallthrough each return empty rather than raising. The `gws calendar +agenda` call is the one external dep a fresh clone could hit hardest, so the defensive shell is well-placed.
- Action contracts return `{ok, id|error}` dicts, not exceptions: `actions/__init__.py:5-13`. Runner advances the dedup map only on `ok=true` (`runner.py:155-158`), so a transient Twilio failure replays next cron tick without a code path for "forgot to retry."
- Schema-version-equivalent baked into the cron entry: `skills/schedule-crons/crons.example.json:27-31` with `--once` and default-dry-run. A fresh user enabling the cron can't accidentally page themselves on first install; they have to consciously flip `--live`.
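To make the routing semantics in the first four bullets concrete, here is a small sketch of a router with the same properties: `prefer_channel` short-circuit, `getattr`-based predicate lookup, first-override-wins, and the single critical quiet-hours bypass. The field names, the policy dict layout, and the channel each override maps to are assumptions for illustration; the real `channel_router.py` and `channel-policy.yaml` may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ping:
    urgency: str  # "critical" | "important" | "fyi"
    prefer_channel: Optional[str] = None
    quiet_hours_override: bool = False


@dataclass
class PresenceSnapshot:
    # Hypothetical predicate fields, named to match the policy below.
    presenter_mode: bool = False
    voice_connected: bool = False
    discord_active: bool = False
    in_quiet_hours: bool = False


# Hypothetical policy layout; defaults mirror the commit message.
POLICY = {
    "default_channel": {"critical": "call", "important": "sms", "fyi": "queue"},
    "overrides": [  # order is load-bearing: first match wins
        {"when": "presenter_mode", "channel": "queue"},
        {"when": "voice_connected", "channel": "voice"},
        {"when": "discord_active", "channel": "discord_dm"},
        {"when": "in_quiet_hours", "channel": "sms"},
    ],
}


def route(ping, presence, policy):
    """Pure function: (Ping, PresenceSnapshot, policy) -> channel name."""
    if ping.prefer_channel:
        return ping.prefer_channel  # explicit request bypasses policy
    for override in policy.get("overrides", []):
        predicate = override["when"]
        if not getattr(presence, predicate, False):
            continue  # unknown or false predicate: keep walking
        if (predicate == "in_quiet_hours" and ping.urgency == "critical"
                and ping.quiet_hours_override):
            continue  # the one bypass: skip the downgrade, keep walking
        return override["channel"]
    return policy["default_channel"][ping.urgency]
```

Because `route` reads nothing but its arguments, each precedence rule can be pinned down with a one-line assertion, which is what makes reordering overrides safe to test.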
Non-blocking observations
- `presence.py:48-50`: presenter-mode sentinel string compare is correct as long as the file always carries a Z-suffixed UTC ISO string. There's no schema doc on the sentinel format though, so a future change to `presenter-mode.sh` writing local-tz iso would silently flip the predicate. Consider parsing via `datetime.fromisoformat` + a `tzinfo` fallback, or add a one-line comment in `presenter-mode.sh` pinning the format.
- `presence.py:81-91`: quiet-hours wrap-around uses lexicographic `HH:MM` comparison. Works for the common 23:00→07:00 case in the example. If anyone sets `start: "7:00"` (single-digit hour) the comparison breaks silently: `"7:00" > "23:00"` is True. Either zero-pad-validate on load, or convert both sides to minutes-int. Low risk: the example template is correct, just a footgun for editors.
- `call.py:24-28`: circular import of `actions.sms` to read OWNER_NUMBER fallback works but couples two action modules. Extract `_load_env` into a shared `_env.py` under `actions/` (or under `proactive-notify/scripts/`) so each action imports it without naming a sibling. Won't matter until someone adds a third action that also wants the env loader.
- `google_calendar.py:60-62`: all-day events are silently skipped because they don't have a precise minute. That's the right call for a meeting-reminder ping, but a future "birthday today" ping would want them. Worth either a comment naming the limit, or making `skip_all_day` configurable in `source_config`.
- `runner.py:135-141`: `body_template` rendering uses `str.format(**item)` with a try/except on KeyError. Silently produces `"<template> [render-error: missing key]"` strings that flow through to delivery. A test asserting render-error fall-through behavior (or rejecting the ping before delivery) would lock in intent. Right now a typo in `body_template` ships as the delivered message.
- `runner.py:88-90` and `_process_ping` return shape: error records use `status="error"` with a `reason`, but error vs sent are summed for exit code (`runner.py:206-212`). A persistently-erroring source thus makes the cron job report exit 1 every 3 minutes, which `schedule-crons` may surface as flapping. Not a correctness issue, but worth deciding whether "unknown source name" (a config error) should be a hard exit and "transient Twilio 5xx" should be exit 0.
- `channel-policy.yaml.example:18`: `voice_session_prefers_voice` blanket-routes critical pings to voice when the voice client is connected. The override sits before `discord_active`, so an owner with voice on AND active in Discord would still get a voice ping. That's probably right for critical, but the ordering is load-bearing and there's no comment naming the precedence intent. One-liner explaining "voice beats discord, presenter mutes both, quiet downgrades all" inline in the yaml would save future editors a read of the route() code.
- No retry/backoff on the action layer: `actions/sms.py:55-63` and `actions/call.py:34-44` both bubble urllib exceptions as `{ok:false}`. With 3-minute cron cadence and dedup-on-failure-retry, transient 429s effectively become "wait 3 min and retry," which is fine for the MVP. But Twilio rate-limit retry-after headers are ignored. Worth a follow-up issue.
- `pings.yaml.example` ships exactly one entry (`meeting-soon`). The skill's name implies many ping kinds, so a second example (e.g. a `github-pr-mention` source stub, or a `gmail-urgent` example) would help the next user understand the shape. Not a Phase-1 blocker.
- Test naming convention drift: `tests/proactive-notify-router.test.py` and `tests/check-user-hardcoded.test.sh` both work, but the repo has tests under both `tests/foo.test.py` and `tests/test_foo.py` patterns historically. Not yours to fix in this PR, but worth a note for any future convergence.
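The quiet-hours footgun in the second observation is easy to demonstrate, and the minutes-int conversion it suggests is short. The sketch below is one possible shape for that fix; `to_minutes` and `in_quiet_hours` are hypothetical names, not functions from `presence.py`.

```python
def to_minutes(hhmm):
    """Parse "HH:MM" or "H:MM" into minutes since midnight, validating range."""
    hours, minutes = hhmm.strip().split(":")
    h, m = int(hours), int(minutes)
    if not (0 <= h <= 23 and 0 <= m <= 59):
        raise ValueError(f"invalid time of day: {hhmm!r}")
    return h * 60 + m


def in_quiet_hours(now, start, end):
    """True if `now` falls in [start, end), handling windows that wrap midnight."""
    n, s, e = to_minutes(now), to_minutes(start), to_minutes(end)
    if s == e:
        return False  # assumption: an empty window means no quiet hours
    if s < e:
        return s <= n < e       # same-day window, e.g. 12:00 to 14:00
    return n >= s or n < e      # wrapping window, e.g. 23:00 to 07:00


# The string comparison misorders single-digit hours:
assert "7:00" > "23:00"
# The integer comparison does not:
assert to_minutes("7:00") < to_minutes("23:00")
```

Comparing integers makes `"7:00"` and `"07:00"` equivalent, so a hand-edited policy file behaves the same either way.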
Cross-PR linkages
- #966 (Twilio A2P 10DLC blocker): correctly tracked. The SMS action will keep returning `{ok:false, error:"..."}` until A2P registration lands; combined with the dedup-on-success semantics in `runner.py:155-158`, that means a stuck SMS ping retries every 3 minutes forever. Once the cron flips `--live`, that's an undeliverable-message storm in `state/fired.json` absence. Consider a max-retries-then-quarantine in the runner before flipping live, or accept it because A2P should resolve first.
- #964 (cross-calendar dedup) and #965 (other-people's events): both filed; the `dedup_key_suffix = f"{start_iso}|{title}"` in `google_calendar.py:74` is identical for subscribed-calendar duplicates of the same event, which is exactly what #964 needs to fix. Worth noting in #964 that the fix likely needs the source ID + organizer email folded into the suffix.
- `morning-briefing` skill: this PR's design doc explicitly carves it out as the "scheduled aggregate" companion. The `queue` channel in `default_channel` should eventually feed `morning-briefing`'s digest, but I don't see a code path that writes queued pings to a file `morning-briefing` reads. Currently `queue` is a destination name that has no action module; `runner.py:166-168` would log it as "no action module for channel: queue". Probably intentional for Phase 1 (queue == drop on the floor today), but worth either a stub `actions/queue.py` that writes to a workspace JSONL, or an explicit doc note.
- `schedule-crons` skill: the cron entry runs `python3 skills/proactive-notify/scripts/runner.py --once`. The `schedule-crons` cron-runner uses `subprocess` (per its README), not `sh -c`, so the literal string is fine. Worth one sanity-check that `python3` resolves on a fresh `cron` PATH (the rest of `crons.example.json` uses `bash scripts/...`, which inherits the user's login shell PATH).
- `pending-questions.md`: design doc carves this out too. proactive-notify pushes state; pending-questions pulls structured decisions. The boundary is clean; no overlap concern.
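As a rough idea of what the suggested `actions/queue.py` stub could look like, the sketch below appends one JSON object per line to a caller-supplied file and returns the same `{ok, id|error}` shape the other actions use. The function signature, the record fields, and the queue file location are all assumptions; nothing here is specified by the PR or by `morning-briefing`.

```python
import json
import time
import uuid
from pathlib import Path


def deliver(ping_id, body, queue_path):
    """Append a queued ping to a JSONL file; return {ok, id} or {ok, error}.

    `queue_path` is a hypothetical location, e.g. a file under the
    workspace that a digest job could read later.
    """
    record = {
        "id": uuid.uuid4().hex,
        "ping": ping_id,
        "body": body,
        "queued_at": time.time(),
    }
    try:
        path = Path(queue_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        # Append mode: one self-contained JSON object per line.
        with path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, sort_keys=True) + "\n")
    except OSError as exc:
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "id": record["id"]}
```

JSONL keeps each write independent, so a reader can consume the file line by line without parsing the whole queue.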
Tests note
Router + matcher tests at tests/proactive-notify-router.test.py cover 11 cases including all four overrides, the prefer_channel short-circuit, first-override-wins, quiet-hours-override interaction, and three matcher cases (range, regex, tag). Solid coverage of the pure-function surface. Lint tests at tests/check-user-hardcoded.test.sh cover both the catch path (synthetic violation) and the self-exclusion path. End-to-end live-call validation is documented in the PR body (callSid CA6ff28fe4614f37ed06e844dd48ae020d) — not in repo, but the surface that test covered (action → conversation-server) is the one path that can't easily be unit-tested.
What's not tested:
- `runner.py:_process_ping` end-to-end with a stub source + stub action: would catch the `prefer_channel` override path, `dedup_key_suffix` collision, and template-render fallthrough in one test.
- `presence.py` predicates against fixture state files: currently the snapshot reads live workspace state, so the unit-testable surface is small. Consider parameterizing the paths so a test can inject fixture dirs.
- `_load_fired` / `_save_fired` round-trip with a corrupted JSON: `runner.py:55-58` returns `{}` on JSONDecodeError, which means a corrupted fired.json silently restarts dedup from scratch. Test would lock that intent in.
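The third gap could be covered by a test along these lines. The two helpers below are stand-ins written from the behavior described in this review (tmpfile-then-rename, sorted keys, indent=2, `{}` on a decode error); they are not copies of `runner.py`, and treating a missing file or a non-dict payload as empty is an added assumption.

```python
import json
import os
import tempfile
from pathlib import Path


def load_fired(path):
    """Read the dedup map; a missing or corrupted file restarts from {}."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


def save_fired(path, fired):
    """Write the dedup map atomically: temp file in the same dir, then rename."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(fired, fh, sort_keys=True, indent=2)
        # os.replace is atomic when source and target share a filesystem,
        # so readers see either the old file or the new one, never a partial.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def check_round_trip(workdir):
    """Exercise the happy path and the corrupted-file path."""
    target = Path(workdir) / "state" / "fired.json"
    save_fired(target, {"meeting-soon|a": 1})
    assert load_fired(target) == {"meeting-soon|a": 1}
    target.write_text("{not json", encoding="utf-8")
    assert load_fired(target) == {}  # corruption restarts dedup
```

Creating the temp file in the target's own directory is what keeps the rename on one filesystem.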
No blocker — Phase-1 MVP scope is right-sized and the tested surface is the pure functions, which is what you'd want.
Ready to merge?
In its current shape: code-wise yes; status-wise no, because it's a draft. Once #966 resolves (A2P registration) and you flip the channel-policy default for important off SMS — or once you mark the PR ready-for-review with the --dry-run cron staying as the shipped default — this is mergeable. The queue channel stub (above, cross-PR linkage to morning-briefing) is the one thing I'd want resolved-or-explicitly-deferred before merge.
— Lucy (Mac Studio bot)
Status: DRAFT — wrapping for the night; owner continues tomorrow. End-to-end verified with call action; SMS path blocked by Twilio A2P 10DLC (tracked as #966).
Problem
Sutando reaches the owner through 5 channels (Discord/Slack/Telegram DM, Twilio voice/SMS, macOS notification), but the choice of channel is ad-hoc — each skill picks its own. Result: owner gets pinged in the wrong place, at the wrong time, with the wrong intensity. No shared abstraction for "Sutando initiated this; reach owner."
Change
A new skill `skills/proactive-notify/` that owns all Sutando → owner proactive communication:
- Declarative pings (`pings.yaml`): adding a new ping = adding a yaml entry, not new code.
- Channel routing on `(urgency, voice_natural, presence, time-of-day)`. Same ping fires through different channel depending on owner state.
- Dedup via `state/fired.json`, atomic write.

Three concepts
What ships
Skill scaffold (`skills/proactive-notify/`):
- `SKILL.md`: usage + config layout + plugin contract
- `config/pings.yaml.example` + `config/channel-policy.yaml.example`: templates (live config lives in workspace, see "User-vs-shared split")
- `scripts/runner.py`: cron entry, walks pings.yaml, applies match filters, dedups, routes, delivers
- `scripts/channel_router.py`: `Ping + PresenceSnapshot + policy → channel`
- `scripts/presence.py`: reads existing `state/last-owner-activity.json`, `voice-session-context`, `presenter-mode.sentinel`, `voice-agent.log` Health line
- `scripts/sources/google_calendar.py`: wraps `gws calendar +agenda --format json`
- `scripts/actions/sms.py`: Twilio Messages API (stdlib urllib, no twilio-sdk dep)
- `scripts/actions/call.py`: POSTs to local conversation-server `/call` (skills/phone-conversation already runs on port 3100)

Cron + tests:
- `skills/schedule-crons/crons.example.json`: `proactive-notify-runner` entry, `*/3 * * * *`, `--dry-run` default
- `tests/proactive-notify-router.test.py`: 11 unittest cases (defaults, all 4 overrides, prefer_channel short-circuit, first-override-wins, quiet-hours override for critical, matcher range/regex/tag)
- `scripts/check-user-hardcoded.sh` + `tests/check-user-hardcoded.test.sh`: lint that greps for owner-specific literals in src/skills/scripts (P-numbers, IDs, paths). 4 cases.

User-vs-shared split
Per owner directive 2026-05-20: repo is 100% shareable, any clone+install runs. Personal config + per-user code constants stream in from `.env` / `$SUTANDO_WORKSPACE` / memory only.
- `.example` templates
- `pings.yaml`, `channel-policy.yaml`, `state/fired.json` at `$SUTANDO_WORKSPACE/skills/proactive-notify/`
- `.example` → workspace if absent

Repo passes `bash scripts/check-user-hardcoded.sh` clean.

Validation tonight
- `discord_dm` (owner was active in Discord ≤5min override)
- `CA6ff28fe4614f37ed06e844dd48ae020d`: owner picked up, Gemini Live delivered purpose, owner hung up. End-to-end call path verified.

Known follow-ups (filed as issues)
Why draft
- `important: sms` (which fails until A2P) and 4 known bugs are open. Not ready for merge.
- `--dry-run` flag, so even when bugs fire, nothing reaches owner; log to `$SUTANDO_WORKSPACE/logs/proactive-notify-dryrun.log` only.

Design doc
Full design (problem statement, ping abstraction, schema details, failure modes, eval plan, 5 open questions all resolved 2026-05-20) is at `<workspace>/notes/proactive-notify-design.md` (not committed; workspace note).

Memory updated

`feedback_user_config_in_workspace.md` extended to cover both data AND logic separation (data → workspace; per-user code constants → env / shutil.which / workspace, never hardcoded). MEMORY.md index updated. Lint enforces.

🤖 Generated with Claude Code