This file translates future.md into an execution plan.
The plan is intentionally staged so Fluxtty can keep shipping useful terminal features while gradually becoming an AI-native workspace.
- Keep the terminal and workspace strong even before advanced AI arrives.
- Avoid a premature multi-agent architecture.
- Add AI capabilities in layers instead of rewriting the product later.
- Preserve optionality for a future detached runtime / daemon model.
- Do not block terminal quality on AI work.
- Do not make the UI the source of truth.
- Do not route all future automation through xterm directly.
- Do not build the heavy orchestration layer before the lower layers are ready.
- Do not rebuild what already exists — extract and formalize it instead.
Before planning phases, be honest about what already exists:
- `ai-handler.ts` already has a unified `executeAction()` function that both the LLM and regex paths use. The workspace action layer is not missing; it is just embedded in the wrong module.
- `pty.rs` already injects shell hooks into zsh/bash/fish and tracks CWD via OSC 7. Adding OSC 133 command lifecycle markers is a small extension of existing code, not a new project.
- `session.rs` already has a clean `PaneInfo` model and `SessionManager`, but `PaneInfo` carries `pty_pid`, which couples session identity to process lifetime and will block future persistence work.
- `plan-executor.ts` silently overwrites any pending plan when a new one arrives. This is a latent bug that needs fixing before AI mode becomes more capable.
- Every frontend-to-backend call uses Tauri `invoke()` directly with no abstraction layer. There are already call sites in `ai-handler.ts`, `SessionManager.ts`, `InputBar.ts`, and `WaterfallArea.ts`. Each new feature adds more. The daemon split requires replacing this transport; without an abstraction, that means touching every call site across the entire codebase.
Remove the architectural problems that will block every later phase. No new user-visible features. Current behavior preserved exactly.
- extract the workspace action layer from `ai-handler.ts`
- decouple `ai-handler` from `WaterfallArea`
- clean `PaneInfo` of PTY process state
- fix the single-pending-plan limitation in `plan-executor`
- move auto-rename logic out of `WaterfallArea`
Introduce a transport abstraction (src/transport.ts)
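Create a thin module that wraps Tauri `invoke` and `listen` behind a transport-agnostic interface:

```typescript
// src/transport.ts
// The module exports a single `transport` object implementing this interface.
export interface Transport {
  // Send a command to the backend and resolve with its typed result.
  send<T>(cmd: string, args?: unknown): Promise<T>;
  // Subscribe to a backend event; resolves to an unsubscribe function.
  listen<T>(event: string, handler: (payload: T) => void): Promise<() => void>;
}
```

Today, both methods delegate to Tauri. Every existing `invoke()` and `listen()` call across the codebase is migrated to `transport.send()` and `transport.listen()`. When the daemon split happens, only this file changes.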
This is the first item in Phase 1 because every other piece of work touches IPC call sites — doing this first means those migrations land clean rather than needing a second pass.
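One way to keep the transport swappable is to build it from an injected backend, so the Tauri binding and a future daemon client are interchangeable. The sketch below is illustrative only: `Backend`, `createTransport`, and `createMemoryBackend` are hypothetical names, and the in-memory backend exists purely so the wrapper can be exercised without Tauri.

```typescript
// Hypothetical backend shape. In the app, Tauri's invoke() and listen() would
// sit behind it; Tauri delivers events as objects carrying a `payload` field.
type BackendEvent = { payload: unknown };
type Unlisten = () => void;

interface Backend {
  invoke(cmd: string, args?: unknown): Promise<unknown>;
  listen(event: string, cb: (e: BackendEvent) => void): Promise<Unlisten>;
}

interface Transport {
  send<T>(cmd: string, args?: unknown): Promise<T>;
  listen<T>(event: string, handler: (payload: T) => void): Promise<Unlisten>;
}

// The only place that knows which backend is in use.
function createTransport(backend: Backend): Transport {
  return {
    send<T>(cmd: string, args?: unknown): Promise<T> {
      return backend.invoke(cmd, args) as Promise<T>;
    },
    listen<T>(event: string, handler: (payload: T) => void): Promise<Unlisten> {
      // Unwrap the event so callers only ever see the payload.
      return backend.listen(event, (e) => handler(e.payload as T));
    },
  };
}

// In-memory backend for tests: commands are plain functions and events are
// delivered synchronously through emit().
function createMemoryBackend(commands: Record<string, (args: unknown) => unknown>) {
  const listeners = new Map<string, Set<(e: BackendEvent) => void>>();
  const backend: Backend = {
    async invoke(cmd, args) {
      const run = commands[cmd];
      if (!run) throw new Error(`unknown command: ${cmd}`);
      return run(args);
    },
    async listen(event, cb) {
      const set = listeners.get(event) ?? new Set<(e: BackendEvent) => void>();
      set.add(cb);
      listeners.set(event, set);
      return () => {
        set.delete(cb);
      };
    },
  };
  const emit = (event: string, payload: unknown): void => {
    listeners.get(event)?.forEach((cb) => cb({ payload }));
  };
  return { backend, emit };
}
```

With this shape, call sites depend only on `send` and `listen`, and the daemon split would mean supplying a different `Backend` rather than editing callers.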
Extract WorkspaceActions module (src/workspace/WorkspaceActions.ts)
Move executeAction() and its helpers (findPane, actionDescription) out of
ai-handler.ts into a standalone module. This module is the single path for
all workspace mutations — AI, keyboard shortcuts, and future automation all call
it. ai-handler.ts becomes a thin layer that parses intent and calls actions.
Remove the waterfallArea reference from ai-handler.ts
ai-handler.ts currently holds a WaterfallArea reference and calls
tp.writeCommand(), waterfallArea.spawnPane(), waterfallArea.splitCurrentRow(),
and tp.destroy() directly. After extraction, WorkspaceActions owns these
calls. ai-handler no longer imports or knows about WaterfallArea.
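The dependency direction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Action` variants, the `PaneHost` interface, and `createWorkspaceActions` are hypothetical stand-ins, not the shapes `executeAction()` uses today. The point is that the actions module talks to a narrow host interface, and the AI handler talks only to the actions module.

```typescript
// Hypothetical action shapes, standing in for whatever executeAction() handles today.
type Action =
  | { type: "run"; paneId: number; command: string }
  | { type: "spawn"; name: string }
  | { type: "close"; paneId: number };

type ActionResult = { ok: boolean; message: string };

// The narrow surface WorkspaceActions needs from the UI layer. WaterfallArea
// (or an adapter around it) would implement this; the method names are assumed.
interface PaneHost {
  hasPane(paneId: number): boolean;
  writeCommand(paneId: number, command: string): void;
  spawnPane(name: string): void;
  destroyPane(paneId: number): void;
}

function createWorkspaceActions(host: PaneHost) {
  return {
    execute(action: Action): ActionResult {
      if (action.type === "spawn") {
        host.spawnPane(action.name);
        return { ok: true, message: `spawned ${action.name}` };
      }
      // Both remaining variants target an existing pane.
      if (!host.hasPane(action.paneId)) {
        return { ok: false, message: `no such pane: ${action.paneId}` };
      }
      if (action.type === "run") {
        host.writeCommand(action.paneId, action.command);
        return { ok: true, message: `ran ${action.command} in pane ${action.paneId}` };
      }
      host.destroyPane(action.paneId);
      return { ok: true, message: `closed pane ${action.paneId}` };
    },
  };
}
```

Because the host is an interface, the same `execute` path can be driven by AI output, a keyboard shortcut, or a test double that only records calls.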
Move pty_pid out of PaneInfo (Rust)
In session.rs, remove pty_pid: u32 from PaneInfo. Move the pane-id to
PTY-pid mapping into PtyManager's own internal HashMap<u32, PtyHandle>.
SessionManager and PtyManager communicate through pane IDs only — no
process handles cross the boundary.
Replace single-pending-plan with a queue (plan-executor.ts)
Replace setPending / setPlan with a proper queue. New plans are appended;
confirmation resolves the head of the queue. The input bar shows the current
head and its position in the queue if more than one is pending.
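The queue behaviour described above might look like the following sketch. The `Plan` shape, `createPlanQueue`, and the label format are assumptions made for illustration; only the append-and-resolve-head semantics come from the plan itself.

```typescript
// Hypothetical plan shape; the real one lives in plan-executor.ts.
interface Plan {
  id: string;
  description: string;
}

function createPlanQueue() {
  const pending: Plan[] = [];
  return {
    // New plans are appended; nothing already pending is overwritten.
    enqueue(plan: Plan): number {
      pending.push(plan);
      return pending.length;
    },
    head(): Plan | undefined {
      return pending[0];
    },
    // Confirmation removes the head and hands it back for execution.
    confirm(): Plan | undefined {
      return pending.shift();
    },
    // Rejection drops the head without executing it.
    reject(): void {
      pending.shift();
    },
    // Text for the input bar: the head, plus its position when more are waiting.
    label(): string {
      const head = pending[0];
      if (!head) return "";
      return pending.length > 1
        ? `${head.description} (1 of ${pending.length})`
        : head.description;
    },
  };
}
```

For example, enqueuing two plans yields a label ending in `(1 of 2)`, and confirming the first leaves the second as the new head instead of losing it.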
Move auto-rename logic out of WaterfallArea
The CWD-change detection and auto-rename logic currently lives inside
WaterfallArea's sessionManager.onChange() callback (lines 39–54). Move
this into a dedicated SessionObserver that listens to SessionManager events
directly. WaterfallArea should only react to state changes for rendering —
it should not write back to SessionManager.
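A dedicated observer could be as small as the sketch below. The naming rule used here (rename an auto-named pane to the last segment of its CWD) and the `autoNamed` flag are assumptions for illustration; the existing rule in `WaterfallArea` may differ. `SessionEvents` is a hypothetical slice of the `SessionManager` API.

```typescript
// Hypothetical pane and session-manager shapes.
interface Pane {
  id: number;
  cwd: string;
  name: string;
  autoNamed: boolean; // false once the user has named the pane by hand
}

interface SessionEvents {
  onChange(listener: (panes: Pane[]) => void): () => void;
  renamePane(id: number, name: string): void;
}

// Last path segment, ignoring trailing slashes; "/" stays "/".
function basename(path: string): string {
  return path.replace(/\/+$/, "").split("/").pop() || "/";
}

// Subscribes to session changes and renames panes whose CWD changed.
// Returns the unsubscribe function from onChange().
function createSessionObserver(session: SessionEvents): () => void {
  const lastCwd = new Map<number, string>();
  return session.onChange((panes) => {
    for (const pane of panes) {
      const previous = lastCwd.get(pane.id);
      // Record the CWD before renaming so a re-entrant onChange is a no-op.
      lastCwd.set(pane.id, pane.cwd);
      if (previous === pane.cwd || !pane.autoNamed) continue;
      const name = basename(pane.cwd);
      if (name !== pane.name) session.renamePane(pane.id, name);
    }
  });
}
```

The rendering component then only reads state; the one place that writes names back is this observer.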
- no file outside
transport.tscallsinvoke()orlisten()from Tauri directly ai-handler.tsdoes not importWaterfallAreaorTerminalPane- all workspace mutations go through
WorkspaceActions PaneInfoin Rust contains no PTY process fields- a second pending plan does not silently erase the first
WaterfallAreadoes not callsessionManager.renamePane()
Make terminal and workspace state machine-readable. This is what allows AI to reason over the workspace without depending on raw ANSI text.
- extend the existing shell integration with command lifecycle markers
- surface structured terminal metadata through the session state model
- give AI a proper context API instead of manually assembled text
Add OSC 133 markers to the existing shell hooks (pty.rs)
setup_zsh(), setup_bash(), and setup_fish() already inject hooks for OSC
7 CWD tracking. Add OSC 133 A/B/C/D sequences to the same hooks:
- `133;A`: prompt start
- `133;B`: command start (marks where the user's typed command begins)
- `133;C`: command output start
- `133;D;exitcode`: command end with exit code
Parse these sequences in the PTY output reader and update PaneInfo fields:
last_command, last_exit_code, foreground_process_state.
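The parsing step can be illustrated with a small state update function. The real parser would live in the Rust PTY reader; TypeScript is used here only to keep the examples in one language. This sketch makes simplifying assumptions that a production parser could not: markers never straddle chunk boundaries, the `B` and `C` markers for a command arrive in the same chunk, and the text between them is taken as the command without stripping line-editor escape sequences. `CommandState` and `applyOsc133` are hypothetical names.

```typescript
interface CommandState {
  lastCommand: string | null;
  lastExitCode: number | null;
  running: boolean;
}

function applyOsc133(state: CommandState, chunk: string): CommandState {
  // OSC 133 marker: ESC ] 133 ; <A|B|C|D> [; args], terminated by BEL or ST (ESC \).
  const marker = /\x1b\]133;([ABCD])(?:;([^\x07\x1b]*))?(?:\x07|\x1b\\)/g;
  const next: CommandState = { ...state };
  let inputStart: number | null = null;
  let m: RegExpExecArray | null;
  while ((m = marker.exec(chunk)) !== null) {
    const kind = m[1];
    if (kind === "B") {
      // Typed input begins right after the B marker.
      inputStart = m.index + m[0].length;
    } else if (kind === "C") {
      // Everything between B and C is treated as the command text.
      if (inputStart !== null) {
        next.lastCommand = chunk.slice(inputStart, m.index).trim();
      }
      inputStart = null;
      next.running = true;
    } else if (kind === "D") {
      const code = Number.parseInt(m[2] ?? "", 10);
      next.lastExitCode = Number.isNaN(code) ? null : code;
      next.running = false;
    }
    // "A" (prompt start) needs no state change in this sketch.
  }
  return next;
}
```

Feeding it a chunk containing `A`, `B`, the text `ls -la`, `C`, some output, and `D;2` yields `lastCommand` of `ls -la`, `lastExitCode` of `2`, and `running` set to `false`.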
Extend PaneInfo with command metadata
Add to PaneInfo in session.rs:
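```rust
pub last_command: Option<String>,
pub last_exit_code: Option<i32>,
pub alternate_screen: bool, // true when a TUI (vim, htop) is active
```

These are populated by the backend PTY output parser, not by the frontend. `last_command` and `last_exit_code` come from the OSC 133 markers; `alternate_screen` is not part of OSC 133 and would instead be tracked from the alternate-screen mode sequences (for example `CSI ? 1049 h` and `CSI ? 1049 l`).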
Replace manually assembled text in buildSystemPrompt()
ai-handler.ts currently builds the LLM system prompt by formatting
getAllPanes() into a text string. Replace this with a proper
WorkspaceState.serialize() call that produces a structured representation.
The AI handler should not know how workspace state is formatted.
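As a rough sketch of that separation, the serializer below owns the format and the prompt builder only embeds its output. The `PaneSnapshot` fields, the JSON layout with a `version` key, and the prompt wording are all assumptions for illustration; the plan only fixes the name `WorkspaceState.serialize()`.

```typescript
// Hypothetical per-pane snapshot; field names are assumed.
interface PaneSnapshot {
  id: number;
  name: string;
  cwd: string;
  status: "idle" | "running";
  lastCommand: string | null;
  lastExitCode: number | null;
  alternateScreen: boolean;
}

const WorkspaceState = {
  // Stable, machine-readable form: panes sorted by id, explicit schema version.
  serialize(panes: PaneSnapshot[]): string {
    const ordered = panes.slice().sort((a, b) => a.id - b.id);
    return JSON.stringify({ version: 1, panes: ordered }, null, 2);
  },
};

// The prompt builder no longer formats panes itself.
function buildSystemPrompt(panes: PaneSnapshot[]): string {
  return [
    "You control a terminal workspace.",
    "Current workspace state (JSON):",
    WorkspaceState.serialize(panes),
  ].join("\n");
}
```

Changing the representation later (for example, adding role and group) then touches the serializer only, not the AI handler.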
Expose AI-friendly pane context API
Add an IPC command get_pane_context(pane_id) that returns:
- cwd
- status (idle/running)
- last command and exit code
- alternate-screen state
- role and group
- `buildSystemPrompt()` calls a serializer, not `getAllPanes()` directly
- command exit codes appear in `PaneInfo` after a command completes
- AI can distinguish a pane running vim (alternate screen) from an idle shell
- no screen scraping required to get basic terminal context
Keyboard shortcuts, UI buttons, and AI mode all use the same action path. This is mostly already done if Phase 1 is complete — this phase hardens it.
- formalize the action schema
- make actions loggable and replayable
- ensure no direct component-to-component mutations remain
- define a formal `WorkspaceAction` TypeScript discriminated union type (currently `ParsedAction` is an untyped object with a string `type` field)
- add a simple action log: each dispatched action is appended to an in-memory ring buffer with a timestamp and result
- audit `InputBar` and `WaterfallArea` for any remaining direct mutations to session state that bypass `WorkspaceActions`
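The first two items can be sketched together. The action variants below are hypothetical (the real set depends on what `ParsedAction` carries today), and `createActionLog` is an illustrative name. The `never` check in `describeAction` is the standard TypeScript technique for making the compiler flag any variant that is added but not handled.

```typescript
// Hypothetical variants; the discriminant is the literal `type` field.
type WorkspaceAction =
  | { type: "run"; paneId: number; command: string }
  | { type: "rename"; paneId: number; name: string }
  | { type: "spawn"; name: string }
  | { type: "close"; paneId: number };

function describeAction(action: WorkspaceAction): string {
  switch (action.type) {
    case "run":
      return `run "${action.command}" in pane ${action.paneId}`;
    case "rename":
      return `rename pane ${action.paneId} to ${action.name}`;
    case "spawn":
      return `spawn pane ${action.name}`;
    case "close":
      return `close pane ${action.paneId}`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a variant is missed.
      const unhandled: never = action;
      return unhandled;
    }
  }
}

interface LogEntry {
  at: number; // timestamp in milliseconds
  action: WorkspaceAction;
  ok: boolean;
}

// Fixed-capacity ring buffer: once full, each new entry overwrites the oldest.
function createActionLog(capacity: number) {
  if (capacity < 1) throw new Error("capacity must be at least 1");
  const slots: LogEntry[] = [];
  let oldest = 0; // index of the oldest entry once the buffer is full
  return {
    record(action: WorkspaceAction, ok: boolean, at: number = Date.now()): void {
      const entry: LogEntry = { at, action, ok };
      if (slots.length < capacity) {
        slots.push(entry);
        return;
      }
      slots[oldest] = entry;
      oldest = (oldest + 1) % capacity;
    },
    // Entries in chronological order, oldest first.
    entries(): LogEntry[] {
      return slots.slice(oldest).concat(slots.slice(0, oldest));
    },
  };
}
```

Keeping the log bounded means it can stay enabled permanently; replay would iterate `entries()` and dispatch each action through the same action path.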
- `WorkspaceAction` is a typed discriminated union, not `{ type: string; [k: string]: unknown }`
- every workspace mutation appears in the action log
- no component directly calls `invoke('session_rename', ...)` or equivalent outside of `WorkspaceActions`
Ship an orchestrator-style AI mode that uses the infrastructure from phases 1–3. Keep scope narrow — one AI, no child workers yet.
- AI reads structured workspace state (from Phase 2)
- AI calls workspace actions (through the module from Phase 1)
- AI uses a proper confirmation queue (from Phase 1)
- AI summarizes results in workspace terms
- update the LLM system prompt to use `WorkspaceState.serialize()`
- update response handling to dispatch through `WorkspaceActions`
- update the confirmation flow to use the action queue
- add result summarization: after a plan executes, show exit codes and output summaries from `PaneInfo`, not just "Done."
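Result summarization from pane metadata could start as simply as the sketch below. `PaneResult` and `summarizePlan` are hypothetical names, the wording of the summary is an assumption, and output summaries are left out because the plan does not yet say where captured output would be stored.

```typescript
// Hypothetical slice of PaneInfo relevant to reporting.
interface PaneResult {
  name: string;
  lastCommand: string | null;
  lastExitCode: number | null;
}

function summarizePlan(results: PaneResult[]): string {
  let succeeded = 0;
  let failed = 0;
  let unknown = 0;
  const lines = results.map((r) => {
    // Without both a command and an exit code there is nothing to report on.
    if (r.lastCommand === null || r.lastExitCode === null) {
      unknown += 1;
      return `${r.name}: no completed command recorded`;
    }
    if (r.lastExitCode === 0) {
      succeeded += 1;
      return `${r.name}: "${r.lastCommand}" succeeded`;
    }
    failed += 1;
    return `${r.name}: "${r.lastCommand}" failed (exit ${r.lastExitCode})`;
  });
  const header = `${succeeded} succeeded, ${failed} failed, ${unknown} unknown`;
  return [header, ...lines].join("\n");
}
```

Counting panes with no recorded result separately avoids reporting success for commands that are still running or were never observed.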
- AI mode is clearly useful for multi-session coordination
- AI mode does not require a heavy orchestration backend
- AI responses reference actual command results, not just confirmations
Expand from AI-assisted workspace control into AI-managed workspace execution.
- child AI workers
- bounded worker roles
- task routing
- future workspace templates/specs
- define worker roles and ownership rules
- introduce child AI execution paths
- add task decomposition and aggregation
- support workspace templates/specs
- AI mode becomes a real control plane
- worker behavior is bounded and inspectable
- the system is still layered, not entangled
- Phase 1 fixes `PaneInfo` to not carry `pty_pid`; this is the prerequisite for all persistence work
- do not build persistence before the session/PTY boundary is clean
- prepare the Rust backend for process separation: `SessionManager` should be serializable and loadable independently of `PtyManager`
- avoid new code that assumes the UI window is the permanent owner of all live PTYs
- introduce a detached runtime / daemon model: `SessionManager` and `PtyManager` run in a background process; the UI connects and reconnects without killing the PTYs
In order:
- Introduce `src/transport.ts` and migrate all `invoke`/`listen` calls to it. Small file, high leverage: every subsequent step lands cleaner because of it.
- Extract `WorkspaceActions` from `ai-handler.ts`. Unblocks AI/keyboard convergence and removes the `WaterfallArea` dependency from AI code.
- Move `pty_pid` out of `PaneInfo` in `session.rs`. Required before any persistence work.
- Replace the single-pending-plan model in `plan-executor` with a queue.
- Add OSC 133 markers to the existing shell hook injection in `pty.rs`. The infrastructure is already there; this is a small addition.
- Move auto-rename logic out of `WaterfallArea`.
- adding new `invoke()` call sites before `transport.ts` exists; each one is another file to update during the daemon split
- rebuilding what already exists (`executeAction` is the action layer seed: extract it, do not rewrite it)
- starting Phase 2 before `PaneInfo` is clean of process state
- treating OSC 133 as a large project (it is a small extension of existing code)
- adding new AI features while `plan-executor` can still silently drop plans
- mixing runtime design, AI design, and UI polish into one large refactor
- Is the terminal/workspace foundation still improving independently of AI?
- Are we adding state and actions, or just adding more UI wiring?
- Does each phase unlock the next one cleanly?
- Is `PaneInfo` clean of PTY runtime fields?
- Are we preserving the option to move toward a detached runtime later?
- Is AI mode becoming a true control plane, not just a command relay?
- Can the action queue handle concurrent plans without silent data loss?
- Is all IPC going through `transport.ts`, or are new `invoke()` calls creeping in?