Today, controller sessions start <project> --worktree <id> --message <text> (#190) is a fire-and-forget primitive. The CLI kicks off the agent run, prints { sessionId, url }, and exits. There is no symmetric way for the same agent (or any other agent) to:
Observe what another session is doing — is it running? which provider? what events has it emitted?
Wait for a long-running run to finish without holding a turn open for the entire duration.
Wake a session with a follow-up message at a later point, including arbitrarily in the future.
The pieces are already wired on the server:
The runtime map in server/lib/session-runtime.ts already tracks active, provider, projectId, worktreeId, plus the child process and pending approvals. It is exposed via GET /api/runtimes (bulk) and GET /api/projects/:id/sessions/:sessionId/runtime (per-session). The React UI already consumes both.
Events persist to <orchestratorHome>/projects/<name>-<hash>/events/<sessionId>.jsonl (server/lib/sessions.ts) and are read back via GET /api/projects/:id/sessions/:sessionId/events. The headless advanceSessionQueue flow (Message enqueuing + steering for Claude and Ada (unify with Codex) #113) already replays queued messages on a clean run completion without any client attached.
The per-session message queue (session-queue.ts) is already CRUD-exposed at /api/projects/:id/sessions/:sessionId/queue[/messageId].
What is missing is the CLI surface that lets an agent (or another script) reach all of this. Today the only session-aware CLI surface is sessions start, so an agent can spawn work but cannot supervise it.
Concrete motivation — "run a half-hour script, then keep going"
A common pattern we would like to support:
# Kick off a long-running build.
controller sessions start coding-orchestrator \
  --worktree <w> --message "Run ./big-build.sh and summarize the failures"

# Time passes. The agent's own turn has long since ended, or it is now
# doing other work in a different session.
controller sessions watch coding-orchestrator <sessionId> --until terminal
# blocks until run.completed / run.failed / run.cancelled, then prints
# a one-line summary + exit code.

# Or wake it later (from the same or a different agent) with the next step.
controller sessions wake coding-orchestrator <sessionId> --message "Build is done; now deploy."
The wake is the same primitive as the existing in-UI "send while running" path — POST /api/projects/:id/sessions/:sessionId/queue + advanceSessionQueue — so this does not introduce a new execution model. It just makes the existing one reachable from the CLI.
Proposed surfaces
All under the existing controller CLI (cli/controller) so they live next to sessions start, mirror its argument style, and inherit the existing CONTROLLER_SERVER_URL resolution + project/worktree resolvers from #190.
1. controller sessions list <project> [--worktree <id>] [--include-archived]
Wraps GET /api/projects/:id/sessions. The server already returns SessionSummary[] (metadata without message history, see getSessionSummaries). Print id, title, status, provider, lastActiveAt. Lets an agent enumerate its own past work or check what is running.
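A minimal sketch of how the list output could be rendered. It assumes SessionSummary exposes the five printed fields as strings and that lastActiveAt is a UTC ISO-8601 timestamp; the real type returned by getSessionSummaries may differ, and formatSessionRows is a hypothetical helper, not existing code.

```typescript
// Assumed shape: only the five fields the proposal says to print.
// The real SessionSummary may carry more, or differently typed, fields.
interface SessionSummary {
  id: string;
  title: string;
  status: string;
  provider: string;
  lastActiveAt: string; // assumption: UTC ISO-8601, so string order is time order
}

// Hypothetical helper: one tab-separated row per session, most recent first.
function formatSessionRows(sessions: SessionSummary[]): string[] {
  return [...sessions]
    .sort((a, b) =>
      a.lastActiveAt < b.lastActiveAt ? 1 : a.lastActiveAt > b.lastActiveAt ? -1 : 0,
    )
    .map((s) => [s.id, s.title, s.status, s.provider, s.lastActiveAt].join("\t"));
}
```

Tab-separated rows keep the output easy for another agent or script to split; a JSON output mode would be an equally reasonable choice.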
2. controller sessions status <project> <sessionId> (optional, can fold into runtime)
Wraps GET /api/projects/:id/sessions/:sessionId/runtime. Prints the runtime snapshot: active, provider, projectId, worktreeId. Quick "is it still running?" probe.
3. controller sessions watch <project> <sessionId> (the headline new surface)
Two modes:
--until terminal (default): long-poll/SSE on a new GET /api/projects/:id/sessions/:sessionId/wait route. The server watches the runtime map (the same data advanceSessionQueue already uses) and resolves when run.completed / run.failed / run.cancelled lands for that session, or when the child process exits. Prints a one-line summary plus the exit code, and exits with that same code (so agents can branch with if ! controller sessions watch ...).
--tail [N]: prints the last N events (default 20) and exits. Lets an agent quickly catch up after re-entering a session, or after coming back from a delay. Wraps GET /api/projects/:id/sessions/:sessionId/events and uses the existing dedupeUserMessageEvents from routes/sessions.ts.
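The two modes reduce to a few small pure functions on the CLI side. The sketch below assumes each event record has a type string and that a terminal event may carry an exitCode; both the SessionEvent shape and the fallback mapping (0 for run.completed, 1 otherwise) are assumptions for illustration, not the existing event schema.

```typescript
// Assumed event shape; real JSONL records likely carry more fields.
interface SessionEvent {
  type: string;
  exitCode?: number; // assumption: terminal events may report the child's exit code
}

// The three terminal event types named in the proposal.
const TERMINAL_TYPES = new Set(["run.completed", "run.failed", "run.cancelled"]);

function isTerminalEvent(event: SessionEvent): boolean {
  return TERMINAL_TYPES.has(event.type);
}

// Exit code for `watch --until terminal`.
// Assumed fallback when the event carries no exitCode: 0 on completion, 1 otherwise.
function exitCodeFor(event: SessionEvent): number {
  if (typeof event.exitCode === "number") return event.exitCode;
  return event.type === "run.completed" ? 0 : 1;
}

// `watch --tail [N]`: the last N events, defaulting to 20.
// Guard n <= 0 explicitly because slice(-0) would return the whole array.
function tailEvents<T>(events: T[], n: number = 20): T[] {
  return n <= 0 ? [] : events.slice(-n);
}
```

Keeping these free of I/O means the CLI unit tests under cli/__tests__/ could cover them without a running server.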
4. controller sessions wake <project> <sessionId> --message <text>
Wraps POST /api/projects/:id/sessions/:sessionId/queue. Writes a QueuedMessage (session-queue.ts) using the same { text, provider, model, mode, ... } shape that the UI queue uses, so the existing advanceSessionQueue picks it up unchanged on clean completion. Resolves to the new messageId and exits.
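As a rough illustration, the CLI side of wake could be little more than building one POST request. In the sketch below the route and the text, provider, model and mode field names come from the description above; treating the last three as optional strings, the WakeRequest type, and the buildWakeRequest helper are assumptions.

```typescript
// Field names follow the { text, provider, model, mode, ... } shape above;
// the optional-string typing is an assumption.
interface QueuedMessageInput {
  text: string;
  provider?: string;
  model?: string;
  mode?: string;
}

// Hypothetical transport-agnostic request description.
interface WakeRequest {
  method: "POST";
  path: string;
  body: string;
}

// Hypothetical helper: validate the message and target the existing queue route.
function buildWakeRequest(
  projectId: string,
  sessionId: string,
  message: QueuedMessageInput,
): WakeRequest {
  if (message.text.trim() === "") {
    throw new Error("--message must not be empty");
  }
  const path =
    `/api/projects/${encodeURIComponent(projectId)}` +
    `/sessions/${encodeURIComponent(sessionId)}/queue`;
  return { method: "POST", path, body: JSON.stringify(message) };
}
```

The CLI would then send this against the resolved CONTROLLER_SERVER_URL and print the messageId from the response.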
5. controller sessions wake <project> <sessionId> --message <text> --delay <duration> (follow-up; depends on 4)
Writes a { runAt: <ISO>, ... } envelope instead of a ready-to-replay item. The server queue-advance loop checks for due items when it wakes for any reason (e.g. an existing run finishing, or a lightweight setTimeout set at insert time). This is the literal "wake me in 30 min" primitive. Should be filed separately from 1–4 if we want to keep the first PR reviewable.
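A sketch of the deferred-enqueue mechanics, under stated assumptions: the proposal does not define the <duration> syntax, so the <number><s|m|h> form below is hypothetical, as are the DelayedItem type and the helper names. Only the runAt ISO field comes from the description above.

```typescript
// Assumption: durations look like "45s", "30m" or "2h".
const UNIT_MS: Record<string, number> = { s: 1000, m: 60000, h: 3600000 };

function parseDuration(input: string): number {
  const match = /^(\d+)([smh])$/.exec(input.trim());
  if (!match) throw new Error(`invalid duration: ${input}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Hypothetical envelope: a queued message plus the time it becomes due.
interface DelayedItem {
  text: string;
  runAt: string; // ISO-8601 timestamp
}

// Build the { runAt, ... } envelope at insert time.
function makeDelayedItem(text: string, delay: string, now: Date): DelayedItem {
  const runAt = new Date(now.getTime() + parseDuration(delay)).toISOString();
  return { text, runAt };
}

// What the queue-advance loop would check whenever it wakes: items whose runAt has passed.
function dueItems(queue: DelayedItem[], now: Date): DelayedItem[] {
  return queue.filter((item) => Date.parse(item.runAt) <= now.getTime());
}
```

Passing now in explicitly keeps the due-check deterministic in tests; whether the server also arms a timer at insert time is the open question discussed further down.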
Why this fits the current architecture
The runtime map and event log are already the source of truth for the React UI sidebar and SessionView. The CLI surfaces only add a read/write path; they do not add new state.
advanceSessionQueue is already server-driven and client-independent — a headless wake (no SSE client attached) drains the queue to completion. So sessions wake from another shell, cron, or agent works without any UI open.
The CLI scaffolding is already in place in cli/controller (controllerCliInstalledPath, resolveProjectId from #190). Adding a surface is parseX(argv) + runX(argv, serverUrl) + a server route or wrapper.
The agent preamble (server/lib/agent-preamble.ts) and the agent system prompt already document the absolute CLI install path, so agents will discover these new subcommands via the same channel.
Non-goals
Not adding a generic scheduler / cron system. --delay is just a deferred-enqueue, not a recurring job.
Not exposing the child process handle or letting the agent kill a sibling session. controller sessions stop can be a separate surface (and POST /sessions/:id/stop already exists on the server).
Not changing the runtime map itself. The map is server-internal; the CLI surfaces read snapshots, not state.
Open questions
Should sessions watch --until terminal use SSE on the server, or short-poll GET /runtime + GET /events? SSE keeps the cost on the server (push when terminal lands); polling is simpler but chattier. SSE matches the existing pattern (every other live surface uses SSE), but the server route does not exist yet — would need a new endpoint, or reuse the existing /events SSE if we add a ?wait=terminal mode.
Should sessions wake deduplicate identical follow-ups? The existing queue is just a list, so two identical --message values will both replay.
For --delay, do we need the server to actively poll due items even when no run is active (i.e. across a full idle period), or is "deliver on the next natural run" acceptable? An idle server with no session running is the case where the agent is most likely to want a wake.
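To make the first question concrete, here is roughly what the short-poll variant of watch --until terminal could look like on the CLI side. This is a sketch of one option, not a recommendation: fetchEvents stands in for a GET on the session's events route, sleep and the default interval are arbitrary, and startIndex (the number of events already in the log when the watch began) is a hypothetical parameter used to skip terminal events from earlier runs.

```typescript
type RunEvent = { type: string };

const TERMINAL = new Set(["run.completed", "run.failed", "run.cancelled"]);

// Poll the event log until a terminal event is appended, or give up after maxPolls.
// fetchEvents and sleep are injected so the loop can be tested without a server.
async function pollUntilTerminal(
  fetchEvents: () => Promise<RunEvent[]>,
  sleep: (ms: number) => Promise<void>,
  startIndex: number = 0,
  intervalMs: number = 2000,
  maxPolls: number = 1000,
): Promise<RunEvent | undefined> {
  let cursor = startIndex;
  for (let poll = 0; poll < maxPolls; poll++) {
    const events = await fetchEvents();
    // Only inspect events that have not been checked on a previous poll.
    for (; cursor < events.length; cursor++) {
      const event = events[cursor];
      if (event && TERMINAL.has(event.type)) return event;
    }
    await sleep(intervalMs);
  }
  return undefined; // still running after maxPolls
}
```

Each poll re-fetches the whole log, which is the "chattier" cost mentioned above; an SSE or ?wait=terminal route would replace the loop with a single held-open request.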
Acceptance criteria
controller sessions list <project> prints the session list with status + provider.
controller sessions watch <project> <sessionId> --until terminal blocks until the run terminates and exits with the run's exit code.
controller sessions watch <project> <sessionId> --tail prints the last N events.
controller sessions wake <project> <sessionId> --message <text> enqueues a follow-up that runs on the existing advanceSessionQueue path; verified by starting session A, kicking off a long tool call, waking with a follow-up from another shell, observing that the follow-up replays headlessly when the first run completes.
All surfaces work with the absolute install path (~/coding-orchestrator/bin/controller) and inherit the existing CONTROLLER_SERVER_URL resolution.
New server routes (if any) get tests; CLI parsing gets unit tests under cli/__tests__/.