fix(recovery): skip recovery for issues with scheduled future monitors #6558

Open
jelsco wants to merge 1 commit into paperclipai:master from jelsco:feat/rr-1768-monitor-skip-recovery

@jelsco jelsco commented May 22, 2026

Summary

Fixes false-positive recovery loops on deliberately-parked issues that have a future monitorNextCheckAt (e.g., LinkedIn content calendar posts waiting for a publish-runner routine).

  • Add hasScheduledMonitor boolean to decideSuccessfulRunHandoff() — skips handoff when the issue has a scheduled future monitor
  • Select monitorNextCheckAt in handleSuccessfulRunHandoff() issue query and pass it to the decision function
  • Add monitorNextCheckAt > now guard in reconcileStrandedAssignedIssues() after the hasActiveExecutionPath check
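
As an illustration of the second bullet, the boolean passed to the decision function can be derived from a nullable timestamp roughly as follows. This is a minimal sketch, not the code in heartbeat.ts: the `MonitoredIssue` shape and the `hasFutureMonitor` helper are hypothetical names, and only `monitorNextCheckAt` comes from the PR.

```typescript
// Hypothetical shape: only the field relevant to this guard.
interface MonitoredIssue {
  monitorNextCheckAt: Date | null;
}

// True only when a next check is scheduled strictly after `now`.
// A missing issue, a null timestamp, or an already-elapsed timestamp
// all leave the issue eligible for recovery.
function hasFutureMonitor(issue: MonitoredIssue | undefined, now: Date): boolean {
  const nextCheck = issue?.monitorNextCheckAt ?? null;
  return nextCheck !== null && nextCheck.getTime() > now.getTime();
}

const sampleNow = new Date("2026-05-22T12:00:00Z");
const parkedIssue: MonitoredIssue = { monitorNextCheckAt: new Date("2026-05-22T13:00:00Z") };
const elapsedIssue: MonitoredIssue = { monitorNextCheckAt: new Date("2026-05-22T11:00:00Z") };

console.log(hasFutureMonitor(parkedIssue, sampleNow));  // true
console.log(hasFutureMonitor(elapsedIssue, sampleNow)); // false
console.log(hasFutureMonitor(undefined, sampleNow));    // false
```

Comparing with `getTime()` keeps the check strictly numeric, so a timestamp equal to `now` is treated as already due.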

Context

issue-graph-liveness.ts already has a hasScheduledMonitor() check that correctly skips such issues, but two other recovery paths did not:

  1. Post-run handoff decision (handleSuccessfulRunHandoff)
  2. Periodic stranded-issue sweep (reconcileStrandedAssignedIssues)
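
The sweep-side guard can be pictured with the sketch below. It assumes a simplified, in-memory candidate list; the `StrandedCandidate` type and `selectIssuesNeedingRecovery` function are hypothetical stand-ins, and the real reconcileStrandedAssignedIssues() works against database rows rather than an array.

```typescript
// Hypothetical candidate row; field names mirror those mentioned in the PR.
interface StrandedCandidate {
  id: string;
  hasActiveExecutionPath: boolean;
  monitorNextCheckAt: Date | null;
}

// Returns the candidates the sweep would still treat as stranded.
function selectIssuesNeedingRecovery(
  candidates: StrandedCandidate[],
  now: Date,
): StrandedCandidate[] {
  return candidates.filter((candidate) => {
    // Existing guard: something is already working on the issue.
    if (candidate.hasActiveExecutionPath) return false;
    // New guard: the issue is deliberately parked until a future check.
    const nextCheck = candidate.monitorNextCheckAt;
    if (nextCheck !== null && nextCheck.getTime() > now.getTime()) return false;
    return true;
  });
}

const sweepNow = new Date("2026-05-22T12:00:00Z");
const inOneHour = new Date(sweepNow.getTime() + 60 * 60 * 1000);
const oneHourAgo = new Date(sweepNow.getTime() - 60 * 60 * 1000);

const sweepResult = selectIssuesNeedingRecovery(
  [
    { id: "parked", hasActiveExecutionPath: false, monitorNextCheckAt: inOneHour },
    { id: "running", hasActiveExecutionPath: true, monitorNextCheckAt: null },
    { id: "stranded", hasActiveExecutionPath: false, monitorNextCheckAt: null },
    { id: "elapsed", hasActiveExecutionPath: false, monitorNextCheckAt: oneHourAgo },
  ],
  sweepNow,
);

console.log(sweepResult.map((candidate) => candidate.id)); // ["stranded", "elapsed"]
```

Note that an issue whose monitor time has already passed is still picked up, so the guard only defers recovery while the scheduled check lies in the future.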

Fixes RR-1768 (parent: RR-1763). Unblocks RR-1547, RR-1548, RR-1776, RR-1777 (LinkedIn posts stuck in recovery loops).

Test plan

  • Unit test: "skips when issue has a scheduled future monitor" in successful-run-handoff.test.ts
  • Integration test: "skips successful-run handoff when issue has a future monitorNextCheckAt" in heartbeat-process-recovery.test.ts
  • Integration test: "skips stranded sweep when issue has a future monitorNextCheckAt" in heartbeat-process-recovery.test.ts
  • All 15 unit tests pass
  • All 46 integration tests pass
  • TypeScript compiles clean (no new errors)

🤖 Generated with Claude Code

Both the post-run handoff decision (decideSuccessfulRunHandoff) and the
periodic stranded-issue sweep (reconcileStrandedAssignedIssues) were
classifying in_progress issues with succeeded runs as "missing disposition"
even when the issue had a future monitorNextCheckAt. This caused false-positive
recovery loops on deliberately-parked issues (e.g., LinkedIn content calendar
posts waiting for a publish-runner routine).

The issue-graph-liveness system already had hasScheduledMonitor() that
correctly skipped such issues, but these two other recovery paths did not.

Changes:
- Add hasScheduledMonitor boolean to decideSuccessfulRunHandoff input type
  and skip when true
- Select monitorNextCheckAt in handleSuccessfulRunHandoff issue query and
  pass it to the decision function
- Add monitorNextCheckAt > now check in reconcileStrandedAssignedIssues
  after the hasActiveExecutionPath guard
- Unit test for the new skip in successful-run-handoff.test.ts
- Two integration tests in heartbeat-process-recovery.test.ts covering
  both the handoff and sweep paths

Co-Authored-By: Claude Opus 4.6 <[email protected]>

greptile-apps Bot commented May 22, 2026

Greptile Summary

This PR fixes false-positive recovery loops on "parked" issues that have a future monitorNextCheckAt timestamp (e.g., LinkedIn content-calendar posts waiting for a publish-runner routine) by adding a scheduled-monitor guard to two recovery paths that the existing issue-graph-liveness.ts check already covered.

  • Adds hasScheduledMonitor to decideSuccessfulRunHandoff() and selects monitorNextCheckAt in the associated issue query in heartbeat.ts, so post-run handoff correctly skips scheduled issues.
  • Inserts a monitorNextCheckAt > now early-exit in reconcileStrandedAssignedIssues() in service.ts, mirroring the guard already present in the liveness system.
  • Covers both paths with a new unit test and two new integration tests; monitorNextCheckAt was already present in the stranded-issue candidate query's select list, so no schema change is required.

Confidence Score: 4/5

Safe to merge once the PR description is updated to include the required Thinking Path, Risks, and Model Used sections per the project template.

The logic change is small and well-targeted: two new early-exit guards backed by unit and integration tests, with monitorNextCheckAt already present in the relevant query's select list. The only gap is the PR description, which omits the Thinking Path, Risks, and Model Used sections required by the project template.

No files require special attention for correctness; the PR description itself needs the missing template sections filled in before merge.

Important Files Changed

| Filename | Overview |
|---|---|
| `server/src/services/recovery/successful-run-handoff.ts` | Adds hasScheduledMonitor boolean to decideSuccessfulRunHandoff input type and inserts a correctly-placed skip guard after the hasQueuedWake check. |
| `server/src/services/heartbeat.ts` | Selects monitorNextCheckAt in the issue query for handleSuccessfulRunHandoff and derives hasScheduledMonitor inline using a safe optional-chaining pattern before passing it to the decision function. |
| `server/src/services/recovery/service.ts` | Inserts a monitorNextCheckAt > now early-exit guard in reconcileStrandedAssignedIssues after hasActiveExecutionPath; monitorNextCheckAt was already included in the candidate query's select list. |
| `server/src/services/recovery/successful-run-handoff.test.ts` | Unit test updated with hasScheduledMonitor: false default and a new case verifying that hasScheduledMonitor: true returns a skip result with the expected reason string. |
| `server/src/tests/heartbeat-process-recovery.test.ts` | Two new integration tests cover the skip-on-future-monitor paths for both post-run handoff and the periodic stranded-issue sweep; fixtures set monitorNextCheckAt to 1 hour in the future before exercising the heartbeat service. |

Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
server/src/services/recovery/successful-run-handoff.ts:376
**PR template sections missing**

The PR description does not follow the required template from `.github/PULL_REQUEST_TEMPLATE.md`. The following required sections are absent or incomplete:

- **Thinking Path** — must trace reasoning from project context down to this specific change (5–8 blockquote steps, per `CONTRIBUTING.md`)
- **Risks** — no explicit risk assessment (e.g., what happens if a monitor timestamp is stale or the guard fires during a legitimate recovery window)
- **Model Used** — "Generated with Claude Code" in the footer is not the required section; the template asks for provider, exact model ID/version, context window, and reasoning mode

Per `CONTRIBUTING.md`, all PRs must use the template and fill out all sections before requesting merge.

Reviews (1): Last reviewed commit: "fix(recovery): skip recovery for issues ..."

}
if (input.hasActiveExecutionPath) return { kind: "skip", reason: "issue already has an active execution path" };
if (input.hasQueuedWake) return { kind: "skip", reason: "issue already has a queued or deferred wake" };
if (input.hasScheduledMonitor) return { kind: "skip", reason: "issue has a scheduled future monitor" };
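
For readers without the full file, the guard ordering in the hunk above can be sketched as a self-contained function. The reason strings are taken from the diff; everything else here is an assumption for illustration, including the `HandoffInput` and `HandoffDecision` types, the `"handoff"` result kind, and the name `decideHandoffSketch`. The real decideSuccessfulRunHandoff() has additional inputs and checks not shown.

```typescript
// Hypothetical result type; only the "skip" shape appears in the diff.
type HandoffDecision =
  | { kind: "skip"; reason: string }
  | { kind: "handoff" };

// Hypothetical, reduced input: just the three flags visible in the hunk.
interface HandoffInput {
  hasActiveExecutionPath: boolean;
  hasQueuedWake: boolean;
  hasScheduledMonitor: boolean;
}

function decideHandoffSketch(input: HandoffInput): HandoffDecision {
  // Guards run in order, so the first matching reason is the one reported.
  if (input.hasActiveExecutionPath) return { kind: "skip", reason: "issue already has an active execution path" };
  if (input.hasQueuedWake) return { kind: "skip", reason: "issue already has a queued or deferred wake" };
  if (input.hasScheduledMonitor) return { kind: "skip", reason: "issue has a scheduled future monitor" };
  return { kind: "handoff" };
}

const noFlags: HandoffInput = {
  hasActiveExecutionPath: false,
  hasQueuedWake: false,
  hasScheduledMonitor: false,
};

const monitorOnly = decideHandoffSketch({ ...noFlags, hasScheduledMonitor: true });
const activeAndMonitor = decideHandoffSketch({
  ...noFlags,
  hasActiveExecutionPath: true,
  hasScheduledMonitor: true,
});
const nothingBlocking = decideHandoffSketch(noFlags);

console.log(JSON.stringify(monitorOnly));
console.log(JSON.stringify(activeAndMonitor));
console.log(JSON.stringify(nothingBlocking));
```

Because the monitor guard comes last, it only changes the outcome for issues that previously fell through to a handoff; issues already skipped for an earlier reason keep that reason.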

P2 PR template sections missing

The PR description does not follow the required template from .github/PULL_REQUEST_TEMPLATE.md. The following required sections are absent or incomplete:

  • Thinking Path — must trace reasoning from project context down to this specific change (5–8 blockquote steps, per CONTRIBUTING.md)
  • Risks — no explicit risk assessment (e.g., what happens if a monitor timestamp is stale or the guard fires during a legitimate recovery window)
  • Model Used — "Generated with Claude Code" in the footer is not the required section; the template asks for provider, exact model ID/version, context window, and reasoning mode

Per CONTRIBUTING.md, all PRs must use the template and fill out all sections before requesting merge.

Context Used: CONTRIBUTING.md has a guide for a good PR message ... (source)
