epic-2-process.md

Epic 2: Process Management — "It Runs Programs"

Goal: Create, schedule, and manage processes. Load and run ELF binaries.
Milestone: Can clone(), exec() a static ELF binary, and wait4() for it. The scheduler time-slices.
Estimated effort: 6-10 weeks (~2,100 lines, the sum of the per-task estimates in the summary table)
Prerequisites: Epic 1 complete (heap works, page mapping works, VMAs tracked)

Dependency Graph

T2.1 (process struct + PID + wait queue)
 ├── T2.2 (kernel stack + context switch asm)
 │    └── T2.3 (kernel threads + scheduler)
 │         └── T2.4 (timer preemption)
 ├── T2.5 (SYSCALL/SYSRET MSRs + dispatch table) [parallel with T2.2-T2.4]
 └── T2.6 (ELF parser + loader) [parallel with T2.2-T2.4]
      └── T2.7 (user-mode entry + user pointer helpers)
           └── T2.8 (clone + fork)
                └── T2.9 (execve)
                     └── T2.10 (exit + wait4 + process groups)

T2.1: Process Struct + PID Allocator + Wait Queue

Objective

Define the core process data structure, a PID allocator, and a generic wait queue primitive used throughout the kernel.

Context Files

INTERFACES.md Process trait, Pid type
DECISIONS.md D12 (WaitQueue design), D13 (lock ordering)

Output Files

kernel/src/sched/mod.rs
kernel/src/sched/process.rs
kernel/src/sched/waitqueue.rs

Requirements

ProcessControlBlock struct with fields:
- pid: Pid
- parent_pid: Option<Pid>
- state: ProcessState (Ready, Running, Sleeping, Zombie)
- address_space: AddressSpace (from mm)
- fd_table: FdTable (placeholder — just the field, impl in Epic 3)
- kernel_stack: VirtAddr
- context: CpuContext (saved registers for context switch)
- exit_code: Option<i32>
- cwd: String (current working directory, default "/")
- pgid: Pid (process group ID)
- sid: Pid (session ID)
- children: Vec<Pid>
- wait_queue: WaitQueue (this process sleeps here in wait4; an exiting child wakes it, see T2.10)
PID allocator: simple counter with AtomicU32, starting at 1
- allocate_pid() -> Pid
- PIDs are not recycled when a process exits; for simplicity the counter only wraps at MAX_PID
Global process table: BTreeMap<Pid, Arc<Mutex<ProcessControlBlock>>>
- get_process(pid) -> Option<Arc<Mutex<ProcessControlBlock>>>
- current_process() -> Arc<Mutex<ProcessControlBlock>>
WaitQueue — generic sleep/wake primitive (DECISIONS.md D12):
- Used by: pipes (T3.9), wait4 (T2.10), accept (T5.7), nanosleep (T4.7)
- sleep() — add current PID to waiters, set state to Sleeping, call schedule()
- wake_one() — pop first waiter, set Ready, enqueue in scheduler
- wake_all() — wake every waiter
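
The allocator and wait-queue bookkeeping above can be sketched in a few lines. This is a host-side model under stated assumptions, not the kernel implementation: it uses std instead of alloc/spin, `Pid` is taken to be `u32`, the value of `MAX_PID` is invented for illustration, and `enqueue_waiter` is a hypothetical name covering only the list-manipulation half of sleep() (the real sleep() would also set Sleeping and call schedule()). It uses a VecDeque rather than the Vec in the contract so that popping the first waiter is O(1).

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;

/// Assumption: INTERFACES.md may define Pid differently.
pub type Pid = u32;

/// Hypothetical bound; the spec names MAX_PID but gives no value.
pub const MAX_PID: u32 = 32_768;

static NEXT_PID: AtomicU32 = AtomicU32::new(1);

/// Hands out 1, 2, 3, ... and wraps back to 1 after MAX_PID.
pub fn allocate_pid() -> Pid {
    let raw = NEXT_PID.fetch_add(1, Ordering::Relaxed);
    // Map the ever-growing counter onto the range 1..=MAX_PID.
    raw.wrapping_sub(1) % MAX_PID + 1
}

/// Bookkeeping-only model of the wait queue (no scheduler interaction).
pub struct WaitQueue {
    waiters: Mutex<VecDeque<Pid>>,
}

impl WaitQueue {
    pub fn new() -> Self {
        WaitQueue { waiters: Mutex::new(VecDeque::new()) }
    }

    /// The list half of sleep(): remember who is waiting.
    pub fn enqueue_waiter(&self, pid: Pid) {
        self.waiters.lock().unwrap().push_back(pid);
    }

    /// Remove and return the longest-waiting PID, if any.
    pub fn wake_one(&self) -> Option<Pid> {
        self.waiters.lock().unwrap().pop_front()
    }

    /// Remove every waiter and report how many were woken.
    pub fn wake_all(&self) -> usize {
        let mut waiters = self.waiters.lock().unwrap();
        let woken = waiters.len();
        waiters.clear();
        woken
    }

    pub fn len(&self) -> usize {
        self.waiters.lock().unwrap().len()
    }
}
```

In the kernel, wake_one() and wake_all() would additionally set each PID to Ready and enqueue it in the scheduler, which is why D13's lock ordering matters there.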

Interface Contract

// sched/process.rs
use alloc::sync::Arc;
use spin::Mutex;
use alloc::collections::BTreeMap;

#[repr(C)]
pub struct CpuContext {
    pub rsp: u64,
    pub rbp: u64,
    pub rbx: u64,
    pub r12: u64,
    pub r13: u64,
    pub r14: u64,
    pub r15: u64,
    pub rip: u64,
    pub rflags: u64,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ProcessState {
    Ready,
    Running,
    Sleeping,
    Zombie,
}

pub struct ProcessControlBlock {
    pub pid: Pid,
    pub parent_pid: Option<Pid>,
    pub state: ProcessState,
    pub children: Vec<Pid>,
    pub wait_queue: WaitQueue,
    // ... other fields
    pub context: CpuContext,
}

pub fn allocate_pid() -> Pid;
pub fn current_pid() -> Pid;
pub fn get_process(pid: Pid) -> Option<Arc<Mutex<ProcessControlBlock>>>;
pub fn current_process() -> Arc<Mutex<ProcessControlBlock>>;

// sched/waitqueue.rs
use alloc::vec::Vec;
use spin::Mutex;

pub struct WaitQueue {
    waiters: Mutex<Vec<Pid>>,
}

impl WaitQueue {
    pub fn new() -> Self;

    /// Add current PID to waiters, set Sleeping, call schedule().
    /// Caller must NOT hold the scheduler lock (lock ordering: D13).
    pub fn sleep(&self);

    /// Wake first waiter: pop PID, set Ready, enqueue in scheduler.
    pub fn wake_one(&self) -> Option<Pid>;

    /// Wake all waiters.
    pub fn wake_all(&self) -> usize;

    /// Number of waiters currently sleeping.
    pub fn len(&self) -> usize;
}

Acceptance Criteria

Unit tests (Tier 1):
- Allocate 100 PIDs, all unique
- PCB can be created with all fields
- Process table insert/lookup works
- WaitQueue: sleep adds PID to list, wake_one removes it, wake_all clears all
- WaitQueue: wake_one on empty returns None
CpuContext is repr(C) (will be accessed from assembly)
ProcessControlBlock is Send

Agent Skills

Process data structures, PID management, Arc<Mutex<>> patterns, wait queue primitives

T2.2: Kernel Stack Allocation + Context Switch Assembly

Objective

Allocate per-thread kernel stacks and implement the low-level register save/restore for switching between kernel threads.

Context Files

kernel/src/mm/mapper.rs (from T1.4)
kernel/src/mm/frame_allocator.rs (from T1.2)
kernel/src/sched/process.rs (CpuContext struct from T2.1)

Output Files

kernel/src/sched/stack.rs
kernel/src/arch/x86_64/context.rs

Requirements

Kernel Stack Allocation:

Each kernel thread needs a stack (default 16 KiB = 4 pages)
Stack region: 0xFFFF_A000_0000_0000 + (stack_id * 0x10000) for guard page spacing
Guard page: one unmapped page below each stack to catch overflow
allocate_kernel_stack(stack_id) -> VirtAddr — maps 4 pages, returns top of stack (highest address)
deallocate_kernel_stack(stack_id) — unmaps pages, frees frames
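
The address arithmetic behind the stack slots can be sketched as follows. The slot layout is an assumption (guard page at the slot base, the four stack pages directly above it, the rest of the 64 KiB slot unused), and `KERNEL_STACK_STRIDE`, `guard_page`, `stack_bottom`, and `stack_top` are illustrative names rather than part of the contract.

```rust
pub const PAGE_SIZE: u64 = 4096;
pub const KERNEL_STACK_SIZE: u64 = 4 * PAGE_SIZE; // 16 KiB
pub const KERNEL_STACK_REGION_START: u64 = 0xFFFF_A000_0000_0000;
/// Distance between consecutive slots (0x10000 = 64 KiB), per the spec.
pub const KERNEL_STACK_STRIDE: u64 = 0x10000;

/// Address of the unmapped guard page at the base of the slot (assumed layout).
pub fn guard_page(stack_id: u64) -> u64 {
    KERNEL_STACK_REGION_START + stack_id * KERNEL_STACK_STRIDE
}

/// Lowest mapped address of the stack, directly above the guard page.
pub fn stack_bottom(stack_id: u64) -> u64 {
    guard_page(stack_id) + PAGE_SIZE
}

/// One past the highest mapped byte; this is the initial rsp.
pub fn stack_top(stack_id: u64) -> u64 {
    stack_bottom(stack_id) + KERNEL_STACK_SIZE
}
```

With this layout a stack that grows past its lowest page lands in its own guard page, and the 64 KiB stride leaves unmapped space between one stack's top and the next slot's guard page.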

Context Switch:

switch_context(old: *mut CpuContext, new: *const CpuContext): naked function with inline asm
Saves rsp, rbp, rbx, r12, r13, r14, r15, and rflags to old, plus the return address as rip
Restores the same from new
Changes rsp to new thread's stack, jumps to new thread's rip
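Uses a #[naked] function with core::arch::asm! (recent toolchains spell this #[unsafe(naked)] with core::arch::naked_asm!; use whichever form the pinned nightly accepts)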
Does NOT save/restore: rax (return value), rcx/rdx/rsi/rdi/r8-r11 (caller-saved)

Interface Contract

// sched/stack.rs
pub const KERNEL_STACK_SIZE: usize = 4 * PAGE_SIZE; // 16 KiB
pub const KERNEL_STACK_REGION_START: u64 = 0xFFFF_A000_0000_0000;

pub fn allocate_kernel_stack(
    stack_id: usize,
    mapper: &mut impl PageMapper,
    allocator: &mut impl FrameAllocator,
) -> KernelResult<VirtAddr>;

pub fn deallocate_kernel_stack(
    stack_id: usize,
    mapper: &mut impl PageMapper,
    allocator: &mut impl FrameAllocator,
) -> KernelResult<()>;

// arch/x86_64/context.rs
use crate::sched::process::CpuContext;

/// Switch from the current kernel thread to another.
/// SAFETY: Both contexts must be valid. The new context's rsp must point to a valid stack.
#[naked]
pub unsafe extern "C" fn switch_context(
    old: *mut CpuContext,
    new: *const CpuContext,
);

Acceptance Criteria

Integration test (Tier 3): allocate a stack, write to it, verify that overflowing it (touching the guard page below the stack) raises a page fault
Stack addresses are correctly aligned (16-byte aligned for x86_64 ABI)
deallocate frees the frames (free-frame count increases)
Assembly compiles with #[naked] and asm! on nightly (Intel syntax chosen, documented)
Integration test (Tier 3): create two contexts, switch between them, verify both run
No register corruption (each thread sees its own register state)
CpuContext field offsets match the assembly offsets (verified by offset_of! in tests)
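
The offset check in the last criterion could look roughly like this. It is a sketch that assumes the assembly addresses fields by the byte offsets 0, 8, ..., 64 in declaration order; `context_offsets` is a hypothetical helper, and `offset_of!` needs Rust 1.77 or later.

```rust
use std::mem::offset_of;

/// Same field order as the CpuContext in the T2.1 interface contract.
#[repr(C)]
pub struct CpuContext {
    pub rsp: u64,
    pub rbp: u64,
    pub rbx: u64,
    pub r12: u64,
    pub r13: u64,
    pub r14: u64,
    pub r15: u64,
    pub rip: u64,
    pub rflags: u64,
}

/// Byte offset of every field, in declaration order.
/// A test compares these against the constants hard-coded in the assembly.
pub fn context_offsets() -> [usize; 9] {
    [
        offset_of!(CpuContext, rsp),
        offset_of!(CpuContext, rbp),
        offset_of!(CpuContext, rbx),
        offset_of!(CpuContext, r12),
        offset_of!(CpuContext, r13),
        offset_of!(CpuContext, r14),
        offset_of!(CpuContext, r15),
        offset_of!(CpuContext, rip),
        offset_of!(CpuContext, rflags),
    ]
}
```

Because the struct is repr(C) and every field is a u64, the offsets are fixed multiples of 8; the test exists to catch someone reordering or inserting a field without updating the assembly.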

Agent Skills

Kernel stack layout, guard pages, virtual address space management
x86_64 inline assembly (core::arch::asm!), calling conventions, naked functions

Agent Prompt Context

Reference: System V AMD64 ABI (callee-saved registers: rbx, rbp, r12-r15, rsp)
Reference: #[naked] function requirements (no prologue/epilogue)

T2.3: Kernel Threads + Round-Robin Scheduler

Objective

Create kernel threads that execute a function and then exit. Implement a round-robin scheduler to select the next runnable process.

Context Files

INTERFACES.md Scheduler trait
kernel/src/sched/process.rs (from T2.1)
kernel/src/sched/stack.rs (from T2.2)
kernel/src/arch/x86_64/context.rs (from T2.2)
DECISIONS.md D13 (lock ordering)

Output Files

kernel/src/sched/thread.rs
kernel/src/sched/scheduler.rs

Requirements

Kernel Threads:

spawn_kernel_thread(name, function) creates a new PCB with:
- Allocated kernel stack
- CpuContext.rip pointing to a trampoline function
- CpuContext.rsp pointing to top of kernel stack
- Trampoline: calls the function, then calls thread_exit() when it returns
thread_exit() — marks process as Zombie, triggers reschedule
Kernel thread runs in ring 0, shares kernel address space (no separate page tables)
The first kernel thread is "idle" — loops on hlt

Scheduler:

Implement the Scheduler trait from INTERFACES.md
Simple FIFO run queue using VecDeque<Pid>
enqueue(pid) — add to back of queue
dequeue() — remove from front
tick() — decrement time slice counter; return true when it hits 0 (default: 10 ticks = ~100ms at 100Hz PIT)
schedule() — top-level function: save current context, pick next, restore context, switch
Skip zombie/sleeping processes in the queue
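
A minimal host-side sketch of the run queue and tick logic described above. It is not the Scheduler trait from INTERFACES.md (whose exact shape is not shown here): `state_of` is a hypothetical stand-in for a process-table lookup, and `Pid` is assumed to be `u32`.

```rust
use std::collections::VecDeque;

pub type Pid = u32; // assumption

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProcessState {
    Ready,
    Running,
    Sleeping,
    Zombie,
}

pub struct RoundRobinScheduler {
    run_queue: VecDeque<Pid>,
    ticks_remaining: u32,
    time_slice: u32,
}

impl RoundRobinScheduler {
    pub fn new(time_slice: u32) -> Self {
        RoundRobinScheduler {
            run_queue: VecDeque::new(),
            ticks_remaining: time_slice,
            time_slice,
        }
    }

    /// Add a PID to the back of the queue.
    pub fn enqueue(&mut self, pid: Pid) {
        self.run_queue.push_back(pid);
    }

    /// Pop PIDs from the front until one is Ready; zombie, sleeping,
    /// and unknown PIDs are dropped from the queue along the way.
    pub fn dequeue(&mut self, state_of: impl Fn(Pid) -> Option<ProcessState>) -> Option<Pid> {
        while let Some(pid) = self.run_queue.pop_front() {
            if state_of(pid) == Some(ProcessState::Ready) {
                return Some(pid);
            }
        }
        None
    }

    /// Count down one timer tick; returns true when the slice is used up
    /// and resets the counter for the next slice.
    pub fn tick(&mut self) -> bool {
        self.ticks_remaining = self.ticks_remaining.saturating_sub(1);
        if self.ticks_remaining == 0 {
            self.ticks_remaining = self.time_slice;
            true
        } else {
            false
        }
    }
}
```

Dropping non-Ready PIDs at dequeue time keeps enqueue() trivial; a sleeping process gets back into the queue only through a WaitQueue wakeup.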

Lock ordering (D13): The scheduler lock is #1 in the global order. Never acquire it while holding a PCB lock or FD table lock. Code that calls schedule() must not hold any lock that a wakeup path might need.

Interface Contract

// sched/thread.rs
pub fn spawn_kernel_thread(
    name: &str,
    entry: fn() -> (),
) -> KernelResult<Pid>;

pub fn thread_exit(exit_code: i32) -> !;

// sched/scheduler.rs
use alloc::collections::VecDeque;

pub struct RoundRobinScheduler {
    run_queue: VecDeque<Pid>,
    current: Option<Pid>,
    ticks_remaining: u32,
    time_slice: u32,  // ticks per slice, default 10
}

impl Scheduler for RoundRobinScheduler { /* ... */ }

/// Main scheduling entry point. Called from timer interrupt or voluntary yield.
pub fn schedule();

/// Voluntarily yield the CPU to the next runnable process.
pub fn yield_now();

Acceptance Criteria

Integration test (Tier 3): spawn 3 kernel threads, each prints its PID to serial, all 3 run
Threads exit cleanly (no crash after function returns)
Thread names appear in process table
Unit tests (Tier 1):
- Enqueue PIDs 1,2,3; dequeue returns 1,2,3 in order
- Empty queue dequeue returns None
- tick() returns true after time_slice ticks
- Zombie PIDs are skipped
Integration test (Tier 3): 3 kernel threads run round-robin, each gets CPU time

Agent Skills

Kernel thread lifecycle, stack setup for new threads, trampoline functions
Scheduler design, round-robin algorithm, VecDeque, lock ordering

T2.4: Timer-Driven Preemption

Objective

Wire the PIT timer interrupt to the scheduler for preemptive multitasking.

Context Files

kernel/src/arch/x86_64/interrupts.rs (timer handler from T0.11)
kernel/src/sched/scheduler.rs (from T2.3)

Output Files

Updated kernel/src/arch/x86_64/interrupts.rs — timer calls scheduler
Updated kernel/src/sched/scheduler.rs — handle timer-driven reschedule

Requirements

Timer interrupt handler calls scheduler::tick()
If tick() returns true (time slice expired), call schedule()
Context switch happens within the interrupt handler (save state to current PCB, restore next)
Interrupts must be disabled during context switch (re-enabled after)
PIT frequency: ~100Hz (default)

Interface Contract

// Updated timer handler
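extern "x86-interrupt" fn timer_interrupt_handler(_stack_frame: InterruptStackFrame) {
    TICK_COUNT.fetch_add(1, Ordering::Relaxed);

    // Decide first, then acknowledge. The EOI must be sent before schedule():
    // schedule() may switch to another thread and only return here when this
    // thread is next scheduled, which would block further timer interrupts.
    let slice_expired = scheduler::tick();

    // The PIC lock is released at the end of this statement, before schedule().
    unsafe { PICS.lock().notify_end_of_interrupt(InterruptIndex::Timer as u8); }

    if slice_expired {
        scheduler::schedule();
    }
}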

Acceptance Criteria

Integration test (Tier 3): two CPU-bound kernel threads both make progress (neither starves)
Timer ticks continue during context switch
No deadlocks from nested locking (PIC lock vs scheduler lock)
Each thread runs for approximately time_slice ticks before preemption

Agent Skills

Preemptive scheduling, interrupt-driven context switches, lock ordering

T2.5: SYSCALL/SYSRET MSR Setup + Dispatch Table

Objective

Configure the x86_64 SYSCALL/SYSRET instructions via MSRs and build the dispatch table routing syscall numbers to handler functions.

Context Files

kernel/src/arch/x86_64/gdt.rs (from T0.7) — need kernel CS/SS selectors
INTERFACES.md syscall number constants
DECISIONS.md D5 (interrupts re-enabled inside syscalls)

Output Files

kernel/src/arch/x86_64/syscall.rs
kernel/src/syscall/mod.rs
kernel/src/syscall/misc.rs

Requirements

MSR Setup:

Write to MSRs:
- STAR (0xC0000081): set kernel CS/SS and user CS/SS segment selectors
- LSTAR (0xC0000082): set syscall entry point address
- SFMASK (0xC0000084): set the IF bit (bit 9) in the mask so that IF is cleared (interrupts disabled) on syscall entry
- EFER (0xC0000080): set SCE bit to enable SYSCALL instruction
Syscall entry point: a naked function that:
- swapgs to access per-CPU data
- Save user RSP, switch to kernel stack
- sti to re-enable interrupts (DECISIONS.md D5 — blocking syscalls must be interruptible)
- Save all caller-saved registers (including RCX and R11, which hold the user RIP and RFLAGS that sysretq restores)
- Call the Rust syscall dispatcher
- cli before restoring user state
- Restore registers, swapgs, execute sysretq
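
The values written to the MSRs can be computed as below. This is a sketch of the bit layout only (no wrmsr); the function names are illustrative, and the selector values in the usage note are examples rather than the project's actual GDT.

```rust
pub const MSR_EFER: u32 = 0xC000_0080;
pub const MSR_STAR: u32 = 0xC000_0081;
pub const MSR_LSTAR: u32 = 0xC000_0082;
pub const MSR_SFMASK: u32 = 0xC000_0084;

/// EFER.SCE (bit 0) enables SYSCALL/SYSRET.
pub const EFER_SCE: u64 = 1 << 0;
/// RFLAGS.IF (bit 9); setting it in SFMASK clears IF on entry.
pub const RFLAGS_IF: u64 = 1 << 9;

/// STAR[47:32] is the kernel CS loaded by SYSCALL (kernel SS = that + 8).
/// STAR[63:48] is the SYSRET base: 64-bit user CS = base + 16, user SS = base + 8.
pub fn star_value(kernel_cs: u16, sysret_base: u16) -> u64 {
    (u64::from(sysret_base) << 48) | (u64::from(kernel_cs) << 32)
}

/// (user CS, user SS) that sysretq will load for a given base, with RPL 3.
pub fn sysret_selectors(sysret_base: u16) -> (u16, u16) {
    ((sysret_base + 16) | 3, (sysret_base + 8) | 3)
}
```

For example, with kernel code at selector 0x08 and a SYSRET base of 0x10, user data must sit at 0x18 and 64-bit user code at 0x20. This fixed spacing constrains the GDT order from T0.7: user data has to come directly before 64-bit user code.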

Dispatch Table:

Dispatch table: array of Option<SyscallHandler> indexed by syscall number
dispatch(num, arg1..arg6) -> isize:
- Look up handler in table
- If found, call it, return result
- If not found, return -ENOSYS (-38)
Implement first stub syscalls (for testing):
- sys_getpid() — return current PID
- sys_uname() — fill in hardcoded utsname struct
- sys_write(fd, buf, count) — if fd==1 or fd==2, write to serial (temporary)
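
The lookup-or-ENOSYS rule can be sketched as a standalone model. The `SyscallHandler` alias is inferred from the six-argument handler signatures used elsewhere in this epic, the getpid stub returns a fixed value purely for illustration, and 39 is the Linux x86_64 number for getpid.

```rust
/// Assumed handler shape: six raw argument registers in, value or negative errno out.
pub type SyscallHandler = fn(usize, usize, usize, usize, usize, usize) -> isize;

pub const ENOSYS: isize = 38;
pub const MAX_SYSCALL: usize = 512;

pub mod nr {
    /// Linux x86_64 syscall number for getpid.
    pub const GETPID: usize = 39;
}

/// Illustrative stub: a real handler would return the current PID.
fn sys_getpid(_a1: usize, _a2: usize, _a3: usize, _a4: usize, _a5: usize, _a6: usize) -> isize {
    1
}

/// Table built at compile time; unset slots stay None.
static SYSCALL_TABLE: [Option<SyscallHandler>; MAX_SYSCALL] = {
    let mut table: [Option<SyscallHandler>; MAX_SYSCALL] = [None; MAX_SYSCALL];
    table[nr::GETPID] = Some(sys_getpid as SyscallHandler);
    table
};

/// Look up the handler; out-of-range and unset numbers both yield -ENOSYS.
pub fn dispatch(
    num: usize,
    a1: usize,
    a2: usize,
    a3: usize,
    a4: usize,
    a5: usize,
    a6: usize,
) -> isize {
    match SYSCALL_TABLE.get(num).copied().flatten() {
        Some(handler) => handler(a1, a2, a3, a4, a5, a6),
        None => -ENOSYS,
    }
}
```

Using `get` instead of indexing means a syscall number of 512 or above from user space returns -ENOSYS rather than panicking the kernel.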

Interface Contract

// arch/x86_64/syscall.rs

/// Initialize SYSCALL/SYSRET MSRs. Call once during boot.
pub fn init();

/// Naked syscall entry point.
/// On entry: RAX = syscall number, RDI/RSI/RDX/R10/R8/R9 = args
/// On exit: RAX = return value
///
/// IMPORTANT: After swapgs + kernel stack switch, execute `sti` to
/// re-enable interrupts before calling dispatch. Execute `cli` before
/// restoring user state and sysretq.
#[naked]
unsafe extern "C" fn syscall_entry();

// syscall/mod.rs
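const MAX_SYSCALL: usize = 512;

/// Handler shape used by every sys_* function (define here unless INTERFACES.md already does).
pub type SyscallHandler = fn(usize, usize, usize, usize, usize, usize) -> isize;

static SYSCALL_TABLE: [Option<SyscallHandler>; MAX_SYSCALL] = {
    let mut table: [Option<SyscallHandler>; MAX_SYSCALL] = [None; MAX_SYSCALL];
    table[nr::GETPID] = Some(misc::sys_getpid);
    table[nr::UNAME] = Some(misc::sys_uname);
    // Temporary serial-backed stub; it lives in misc.rs until the file layer exists.
    table[nr::WRITE] = Some(misc::sys_write);
    // ... filled in by later tasks
    table
};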

pub fn dispatch(num: usize, a1: usize, a2: usize, a3: usize,
                a4: usize, a5: usize, a6: usize) -> isize;

Acceptance Criteria

MSR writes don't triple fault (verified by serial print after init)
Assembly preserves all callee-saved registers
User RSP is saved before switching to kernel stack
Interrupts are disabled on entry (SFMASK clears IF), re-enabled after kernel stack switch (sti), disabled again before sysretq (cli)
SYSRET correctly restores user RIP and RFLAGS
Unknown syscall returns -38 (ENOSYS)
sys_getpid() returns a positive integer
sys_write(1, "hello", 5) outputs "hello" to serial
Dispatch table has room for 512 entries
Unit tests (Tier 1): dispatch known number calls handler, unknown returns ENOSYS

Agent Skills

x86_64 MSRs, SYSCALL/SYSRET mechanism, naked functions, inline assembly
Syscall dispatch, function pointer tables, Linux syscall numbers

Agent Prompt Context

Reference: AMD64 Architecture Manual Vol 2, Section 6.1 (SYSCALL/SYSRET)
Reference: Intel SDM Vol 2B, SYSCALL instruction
DECISIONS.md D5 for the sti/cli placement rationale

T2.6: ELF Parser + Loader

Objective

Parse ELF64 headers and program headers, then load PT_LOAD segments into a process's address space.

Context Files

kernel/src/mm/mmap.rs (from T1.10)
kernel/src/mm/address_space.rs (from T1.9)

Output Files

kernel/src/sched/elf.rs
kernel/src/sched/loader.rs

Requirements

ELF Parser:

Parse ELF64 header: verify magic, class (64-bit), data (little-endian), machine (x86_64)
Parse program headers (PHDR): extract PT_LOAD segments with vaddr, memsz, filesz, offset, flags
Parse entry point address
Return structured data, not raw bytes
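
A sketch of the header checks listed above, using the fixed ELF64 field offsets (e_machine at byte 18, e_entry at 24, e_phoff at 32, e_phentsize at 54, e_phnum at 56). `ElfError` and `parse_elf_header` are illustrative names; the real parser returns KernelResult and also walks the program headers.

```rust
#[derive(Debug, PartialEq)]
pub enum ElfError {
    Truncated,
    BadMagic,
    NotElf64,
    NotLittleEndian,
    WrongMachine,
}

#[derive(Debug, PartialEq)]
pub struct ElfHeader {
    pub entry_point: u64,
    pub phdr_offset: u64,
    pub phdr_count: u16,
    pub phdr_size: u16,
}

/// Bounds-checked little-endian reads; None means the input is too short.
fn read_u16(data: &[u8], off: usize) -> Option<u16> {
    let b = data.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

fn read_u64(data: &[u8], off: usize) -> Option<u64> {
    let b = data.get(off..off.checked_add(8)?)?;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(b);
    Some(u64::from_le_bytes(buf))
}

pub fn parse_elf_header(data: &[u8]) -> Result<ElfHeader, ElfError> {
    // The ELF64 file header is 64 bytes long.
    if data.len() < 64 {
        return Err(ElfError::Truncated);
    }
    if !data.starts_with(b"\x7fELF") {
        return Err(ElfError::BadMagic);
    }
    // e_ident[4] = class (2 = 64-bit), e_ident[5] = data encoding (1 = little-endian).
    if data[4] != 2 {
        return Err(ElfError::NotElf64);
    }
    if data[5] != 1 {
        return Err(ElfError::NotLittleEndian);
    }
    // e_machine: 62 = EM_X86_64.
    if read_u16(data, 18) != Some(62) {
        return Err(ElfError::WrongMachine);
    }
    Ok(ElfHeader {
        entry_point: read_u64(data, 24).ok_or(ElfError::Truncated)?,
        phdr_offset: read_u64(data, 32).ok_or(ElfError::Truncated)?,
        phdr_size: read_u16(data, 54).ok_or(ElfError::Truncated)?,
        phdr_count: read_u16(data, 56).ok_or(ElfError::Truncated)?,
    })
}
```

Every read goes through `get`, so truncated or hostile input produces an error instead of a panic, which is what the "handle truncated input gracefully" criterion asks for.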

ELF Loader:

load_elf(elf_data, address_space) -> KernelResult<LoadedElf>:
- Parse ELF headers (using parser above)
- For each PT_LOAD segment:
  - Calculate page-aligned vaddr and size
  - Map pages with appropriate flags (R/W/X -> PageTableFlags)
  - Copy file data (filesz bytes from elf_data at offset)
  - Zero-fill remainder (memsz - filesz) for BSS
- Set up user stack (8 MiB, at top of user address space, e.g., 0x7FFF_FFFF_F000 downward)
- Return entry point address
Stack setup includes argc, argv, envp, auxv on the stack (Linux ABI)
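
The "page-aligned vaddr and size" and BSS steps come down to a little arithmetic, sketched here. `SegmentLayout` and `segment_layout` are hypothetical names; the sketch assumes 4 KiB pages and rejects segments whose filesz exceeds memsz or whose range overflows.

```rust
pub const PAGE_SIZE: u64 = 4096;

#[derive(Debug, PartialEq)]
pub struct SegmentLayout {
    /// vaddr rounded down to a page boundary: first page to map.
    pub map_start: u64,
    /// Number of pages covering [vaddr, vaddr + memsz).
    pub page_count: u64,
    /// First byte after the file-backed data: start of the zero-filled region.
    pub bss_start: u64,
    /// memsz - filesz: bytes to zero.
    pub bss_len: u64,
}

pub fn segment_layout(vaddr: u64, filesz: u64, memsz: u64) -> Option<SegmentLayout> {
    // A segment cannot carry more file bytes than it occupies in memory.
    if filesz > memsz {
        return None;
    }
    let end = vaddr.checked_add(memsz)?;
    let map_start = vaddr & !(PAGE_SIZE - 1);
    // Round the end up to the next page boundary.
    let map_end = end.checked_add(PAGE_SIZE - 1)? & !(PAGE_SIZE - 1);
    Some(SegmentLayout {
        map_start,
        page_count: (map_end - map_start) / PAGE_SIZE,
        bss_start: vaddr + filesz,
        bss_len: memsz - filesz,
    })
}
```

For a segment at vaddr 0x401234 with filesz 0x100 and memsz 0x2000, this maps three pages starting at 0x401000 and zero-fills 0x1F00 bytes from 0x401334.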

Interface Contract

// sched/elf.rs
#[derive(Debug)]
pub struct ElfHeader {
    pub entry_point: u64,
    pub phdr_offset: u64,
    pub phdr_count: u16,
    pub phdr_size: u16,
}

#[derive(Debug)]
pub struct ProgramHeader {
    pub segment_type: SegmentType,
    pub offset: u64,      // offset in file
    pub vaddr: u64,       // virtual address
    pub filesz: u64,      // size in file
    pub memsz: u64,       // size in memory (may be > filesz for BSS)
    pub flags: SegmentFlags, // read/write/execute
    pub align: u64,
}

#[derive(Debug, PartialEq)]
pub enum SegmentType { Null, Load, Dynamic, Interp, Note, Other(u32) }

bitflags::bitflags! {
    pub struct SegmentFlags: u32 {
        const EXECUTE = 0x1;
        const WRITE   = 0x2;
        const READ    = 0x4;
    }
}

pub fn parse_elf(data: &[u8]) -> KernelResult<(ElfHeader, Vec<ProgramHeader>)>;

// sched/loader.rs
pub struct LoadedElf {
    pub entry_point: VirtAddr,
    pub stack_top: VirtAddr,
    pub brk_start: VirtAddr,  // end of loaded segments (for brk syscall)
}

pub fn load_elf(
    elf_data: &[u8],
    address_space: &mut AddressSpace,
    argv: &[&str],
    envp: &[&str],
    allocator: &mut impl FrameAllocator,
) -> KernelResult<LoadedElf>;

Acceptance Criteria

Unit tests (Tier 1) — critical, agent must write these:
- Parse a minimal valid ELF64 header (construct bytes manually or embed a small binary)
- Reject ELF32 (wrong class)
- Reject non-x86_64 machine
- Parse 3 program headers, verify vaddr/filesz/memsz/flags
- Handle truncated input gracefully (error, not panic)
Entry point extracted correctly
BSS detection: memsz > filesz -> zero-fill gap
A minimal static ELF binary (write syscall + exit) loads and the entry point is correct
Text segment is mapped as read + execute (not writable)
Data/BSS segment is mapped as read + write
Stack is mapped as read + write with guard page below
argv, envp, auxv are on the stack in Linux ABI format
Unit test (Tier 1): mock address space, verify correct mapping calls

Agent Skills

ELF64 format, binary parsing, bitflags
ELF loading, Linux process startup (auxv), page flag mapping

Agent Prompt Context

Reference: ELF64 spec (Tool Interface Standard, Chapter 4)

T2.7: User-Mode Entry + User Pointer Helpers

Objective

Drop from kernel mode (ring 0) to user mode (ring 3) to execute a loaded binary. Provide safe helpers for copying data between kernel and user address spaces.

Context Files

kernel/src/arch/x86_64/gdt.rs (user code/data selectors)
kernel/src/sched/loader.rs (from T2.6)
DECISIONS.md D8 (user pointer validation via page table walk)

Output Files

kernel/src/arch/x86_64/usermode.rs
kernel/src/mm/user_access.rs

Requirements

User-Mode Entry:

Add user code segment and user data segment to GDT (if not already present)
enter_usermode(entry_point, user_stack_top) — never returns:
- Push SS (user data selector | RPL 3)
- Push RSP (user stack top)
- Push RFLAGS (with IF set — interrupts enabled in user mode)
- Push CS (user code selector | RPL 3)
- Push RIP (entry point)
- Execute iretq
This is a one-way transition — kernel re-enters via syscall or interrupt
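
The five values pushed before iretq can be modeled as plain data, which makes the frame easy to unit-test on the host. `iretq_frame` is a hypothetical helper, and the selector values in the usage note are examples, not the project's GDT.

```rust
/// RFLAGS.IF (bit 9): interrupts enabled once in user mode.
pub const RFLAGS_IF: u64 = 1 << 9;
/// RFLAGS bit 1 is reserved and always set.
pub const RFLAGS_RESERVED: u64 = 1 << 1;

/// Values in push order: SS, RSP, RFLAGS, CS, RIP.
/// iretq pops them in the reverse order (RIP first).
pub fn iretq_frame(entry_point: u64, stack_top: u64, user_cs: u16, user_ss: u16) -> [u64; 5] {
    [
        u64::from(user_ss | 3), // data selector with RPL 3
        stack_top,
        RFLAGS_IF | RFLAGS_RESERVED,
        u64::from(user_cs | 3), // code selector with RPL 3
        entry_point,
    ]
}
```

For instance, with user code at selector 0x20 and user data at 0x18, the frame carries CS = 0x23, SS = 0x1B, and RFLAGS = 0x202.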

User Pointer Helpers (DECISIONS.md D8):

copy_from_user(dst, user_src, len) — copy bytes from user space to kernel buffer
copy_to_user(user_dst, src) — copy bytes from kernel buffer to user space
copy_string_from_user(user_ptr, max_len) — copy a null-terminated string from user space
Validation: walk page tables to check that every page in the range is mapped with the USER flag; copy_to_user additionally requires the WRITABLE flag
Return Err(EFAULT) if any page is unmapped or lacks a required flag
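
The per-page validation loop can be sketched against a mock lookup, which is also how the Tier 1 tests could exercise it. `PageInfo`, `validate_user_range`, and the `lookup` closure are assumptions standing in for the real page table walk; the write check applies only to the copy_to_user direction.

```rust
pub const PAGE_SIZE: usize = 4096;
pub const EFAULT: isize = 14;

/// What the (mock) page table walk reports for one mapped page.
#[derive(Clone, Copy)]
pub struct PageInfo {
    pub user: bool,
    pub writable: bool,
}

/// Check every page touched by [addr, addr + len).
/// `lookup` takes a page-aligned address and returns None if it is unmapped.
pub fn validate_user_range(
    addr: usize,
    len: usize,
    need_write: bool,
    lookup: impl Fn(usize) -> Option<PageInfo>,
) -> Result<(), isize> {
    if len == 0 {
        return Ok(());
    }
    // Reject ranges that wrap around the top of the address space.
    let last = addr.checked_add(len - 1).ok_or(EFAULT)?;
    let mut page = addr & !(PAGE_SIZE - 1);
    let last_page = last & !(PAGE_SIZE - 1);
    loop {
        match lookup(page) {
            Some(info) if info.user && (!need_write || info.writable) => {}
            _ => return Err(EFAULT),
        }
        if page == last_page {
            return Ok(());
        }
        page += PAGE_SIZE;
    }
}
```

Checking the whole range before copying keeps a failed copy from leaving a half-written buffer. It does not protect against another thread unmapping a page between the check and the copy, which only matters once address spaces are shared.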

Interface Contract

// arch/x86_64/usermode.rs

/// Transition to user mode. Never returns.
/// SAFETY: entry_point must be a valid user-space instruction address.
/// stack_top must be a valid user-space stack address.
pub unsafe fn enter_usermode(entry_point: VirtAddr, stack_top: VirtAddr) -> !;

// mm/user_access.rs

/// Copy `len` bytes from user-space `user_src` into kernel `dst`.
/// Returns EFAULT if any page in the range is not mapped with USER flag.
pub fn copy_from_user(dst: &mut [u8], user_src: usize, len: usize) -> KernelResult<()>;

/// Copy `src` bytes to user-space `user_dst`.
/// Returns EFAULT if any page in the range is not mapped with USER flag.
pub fn copy_to_user(user_dst: usize, src: &[u8]) -> KernelResult<()>;

/// Copy a null-terminated string from user-space, up to `max_len` bytes.
/// Returns EFAULT if any page is not mapped with USER flag.
/// Returns ENAMETOOLONG if no null terminator found within max_len.
pub fn copy_string_from_user(user_ptr: usize, max_len: usize) -> KernelResult<String>;

Acceptance Criteria

After enter_usermode, code runs in ring 3 (verified by attempting a privileged instruction -> GPF)
SYSCALL from user mode enters kernel (via T2.5 MSR setup)
Interrupts work in user mode (timer still fires)
Integration test (Tier 3): enter user mode, execute syscall for sys_write, verify output on serial
copy_from_user succeeds for valid mapped user pages
copy_from_user returns EFAULT for unmapped or kernel-only pages
copy_to_user succeeds for valid writable user pages
copy_string_from_user stops at null terminator, returns ENAMETOOLONG if none found
Unit tests (Tier 1): mock page tables, verify validation logic

Agent Skills

Ring transitions, iretq, user/kernel mode, segment selectors
Page table walking, user pointer validation

T2.8: clone() + fork()

Objective

Implement sys_clone() as the primary process-creation syscall, with sys_fork() as a thin wrapper. Deep copy address space (no CoW).

Context Files

kernel/src/sched/process.rs (from T2.1)
kernel/src/mm/address_space.rs (from T1.9)
kernel/src/mm/page_table.rs (from T1.3)
DECISIONS.md D1 (deep copy, no CoW), D2 (clone is the primitive)

Output Files

kernel/src/syscall/process.rs

Requirements

sys_clone(flags, child_stack, ptid, ctid, tls) is the real primitive:
- Allocate new PID
- Create new PCB, copy parent's fields
- Deep copy parent's address space — allocate new frames, copy page contents (DECISIONS.md D1)
- If child_stack != 0, use it as child's user RSP; otherwise clone parent's RSP
- Copy file descriptor table (all open FDs shared — Arc clone)
- Set child's return value to 0 (RAX = 0)
- Set parent's return value to child PID
- Add child to scheduler run queue
sys_fork() = sys_clone(SIGCHLD, 0, 0, 0, 0) — a simple wrapper
Child is Ready, parent continues Running
Child inherits parent's pgid and sid

Interface Contract

// syscall/process.rs

/// Primary process creation. musl libc calls this, not fork.
pub fn sys_clone(
    flags: usize,        // SIGCHLD | CLONE_* flags
    child_stack: usize,  // 0 = clone parent stack pointer
    ptid: usize,         // parent TID pointer (unused for now)
    ctid: usize,         // child TID pointer (unused for now)
    tls: usize,          // TLS base (unused for now)
    _a6: usize,
) -> isize;

/// Convenience wrapper: sys_clone(SIGCHLD, 0, 0, 0, 0).
pub fn sys_fork(
    _a1: usize, _a2: usize, _a3: usize,
    _a4: usize, _a5: usize, _a6: usize,
) -> isize;

Acceptance Criteria

After clone, child gets PID > parent PID
Child and parent have separate address spaces (write in child doesn't affect parent)
Child's RAX = 0, parent's RAX = child PID
Both parent and child are scheduled
sys_fork() produces identical behavior to sys_clone(SIGCHLD, 0, 0, 0, 0)
sys_clone with non-zero child_stack sets child RSP correctly
Integration test (Tier 3): clone, child writes to serial, parent waits

Agent Skills

Process creation, address space duplication, register manipulation for child return value
clone() flags, Linux clone vs fork semantics

T2.9: execve() — Load New Binary

Objective

Replace the current process's address space with a new ELF binary.

Context Files

kernel/src/sched/loader.rs (from T2.6)
kernel/src/syscall/process.rs (from T2.8)
kernel/src/mm/user_access.rs (from T2.7)

Output Files

Updated kernel/src/syscall/process.rs — add sys_execve

Requirements

sys_execve(path, argv, envp):
- Copy path from user space using copy_string_from_user (from T2.7)
- Copy argv/envp arrays from user space using copy_from_user
- Read ELF data from the filesystem (or initramfs for now)
- Destroy current address space (unmap all user pages, free frames)
- Load new ELF into fresh address space (using T2.6 loader)
- Set up new stack with argv, envp, auxv
- Reset signal handlers to defaults
- Close FDs with CLOEXEC flag
- Set instruction pointer to new entry point
- Does NOT return to caller — jumps to new program
Path resolution deferred to Epic 3 — for now, look up filename in a hardcoded initramfs table

Interface Contract

pub fn sys_execve(
    path_ptr: usize,   // user pointer to null-terminated path
    argv_ptr: usize,   // user pointer to null-terminated array of string pointers
    envp_ptr: usize,   // user pointer to null-terminated array of string pointers
    _a4: usize, _a5: usize, _a6: usize,
) -> isize;  // only returns on error (negative errno)

Acceptance Criteria

After execve, new program runs (different code at new entry point)
Old address space is fully freed (free-frame count increases)
Stack has correct argv/envp layout
Invalid path returns -ENOENT
Invalid ELF returns -ENOEXEC
User pointers are validated via copy_from_user/copy_string_from_user (no raw dereference)

Agent Skills

execve semantics, address space teardown/rebuild, argv marshaling

T2.10: exit() + wait4() + Process Groups

Objective

Implement process termination, parent notification via wait queues, and minimal process group/session management for job control.

Context Files

kernel/src/sched/process.rs (from T2.1)
kernel/src/sched/scheduler.rs (from T2.3)
kernel/src/sched/waitqueue.rs (from T2.1)
DECISIONS.md D13 (lock ordering)

Output Files

Updated kernel/src/syscall/process.rs — add sys_exit, sys_exit_group, sys_wait4
Updated kernel/src/syscall/process.rs — add setpgid, getpgid, setsid, getpid, getppid

Requirements

exit:

sys_exit(status):
- Set process state to Zombie
- Store exit status
- Release address space (unmap all, free frames)
- Close all file descriptors
- Reparent children to PID 1 (init)
- Wake parent via parent.wait_queue.wake_one() (using WaitQueue from T2.1)
- Call scheduler to pick next process

wait4:

sys_wait4(pid, status_ptr, options):
- If pid == -1: wait for any child
- If pid > 0: wait for specific child
- If child is Zombie: reap it (remove from process table), store status via copy_to_user, return child PID
- If child is still running: self.wait_queue.sleep() until child exits
- WNOHANG option: return 0 immediately if no zombie child
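
The decision logic of one wait4 pass can be sketched as a pure function over the caller's children, separate from the sleeping and reaping side effects. `WaitOutcome` and `wait4_step` are illustrative names; WNOHANG = 1 matches Linux. Treating a pid that matches no child as `NoChild` is an assumption (Linux reports ECHILD there), and pid values of 0 or below -1 (process-group waits) are outside the spec above and simply fall into that case.

```rust
pub type Pid = u32; // assumption
pub const WNOHANG: usize = 1;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ChildState {
    Alive,
    Zombie(i32), // carries the exit code
}

#[derive(Debug, PartialEq, Eq)]
pub enum WaitOutcome {
    /// Reap this child and return its PID.
    Reaped { pid: Pid, exit_code: i32 },
    /// WNOHANG was set and no matching child has exited: return 0.
    WouldBlock,
    /// Sleep on the caller's wait queue, then run this step again.
    Sleep,
    /// No child matches the requested pid.
    NoChild,
}

pub fn wait4_step(children: &[(Pid, ChildState)], pid: isize, options: usize) -> WaitOutcome {
    let mut matched = false;
    for &(child, state) in children {
        // pid == -1 matches any child; otherwise require an exact match.
        if pid != -1 && pid != child as isize {
            continue;
        }
        matched = true;
        if let ChildState::Zombie(exit_code) = state {
            return WaitOutcome::Reaped { pid: child, exit_code };
        }
    }
    if !matched {
        WaitOutcome::NoChild
    } else if options & WNOHANG != 0 {
        WaitOutcome::WouldBlock
    } else {
        WaitOutcome::Sleep
    }
}
```

On `Sleep`, the caller drops any PCB lock, calls wait_queue.sleep(), and repeats the step after waking, since a wakeup only means some child changed state.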

Process Groups:

sys_setpgid(pid, pgid) — set process group ID
sys_getpgid(pid) — get process group ID
sys_setsid() — create new session, process becomes session leader + process group leader
sys_getpid(), sys_getppid() — return PID and parent PID
Default: child inherits parent's pgid and sid on fork/clone
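
The setsid rule (sid = pgid = pid) is small enough to model directly. `Ids` is a hypothetical slice of the PCB, and the EPERM check for a caller that is already a process group leader follows POSIX rather than anything stated above, so treat it as an assumption.

```rust
pub type Pid = u32; // assumption
pub const EPERM: isize = 1;

/// The three identifiers setsid touches.
pub struct Ids {
    pub pid: Pid,
    pub pgid: Pid,
    pub sid: Pid,
}

/// Make the caller leader of a new session and a new process group.
/// Returns the new session ID, or EPERM if the caller already leads a group.
pub fn setsid(p: &mut Ids) -> Result<Pid, isize> {
    if p.pgid == p.pid {
        return Err(EPERM);
    }
    p.sid = p.pid;
    p.pgid = p.pid;
    Ok(p.sid)
}
```

A shell typically forks first and calls setsid in the child, which is never a group leader right after fork because it inherited the parent's pgid.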

Lock ordering (D13): In exit path, acquire scheduler lock first, then process table lock, then individual PCB lock. In wait4 path, the WaitQueue.sleep() call invokes schedule() — caller must not hold any PCB locks when calling sleep(). Acquire PCB lock only to check state, release it before sleeping.

Interface Contract

pub fn sys_exit(status: usize, _a2: usize, _a3: usize,
                _a4: usize, _a5: usize, _a6: usize) -> isize;

pub fn sys_exit_group(status: usize, _a2: usize, _a3: usize,
                      _a4: usize, _a5: usize, _a6: usize) -> isize;

pub fn sys_wait4(pid: usize, status_ptr: usize, options: usize,
                 _a4: usize, _a5: usize, _a6: usize) -> isize;

pub fn sys_setpgid(pid: usize, pgid: usize,
                   _a3: usize, _a4: usize, _a5: usize, _a6: usize) -> isize;

pub fn sys_getpgid(pid: usize, _a2: usize, _a3: usize,
                   _a4: usize, _a5: usize, _a6: usize) -> isize;

pub fn sys_setsid(_a1: usize, _a2: usize, _a3: usize,
                  _a4: usize, _a5: usize, _a6: usize) -> isize;

pub fn sys_getpid(_a1: usize, _a2: usize, _a3: usize,
                  _a4: usize, _a5: usize, _a6: usize) -> isize;

pub fn sys_getppid(_a1: usize, _a2: usize, _a3: usize,
                   _a4: usize, _a5: usize, _a6: usize) -> isize;

Acceptance Criteria

Process exits and becomes zombie
Parent wait4 reaps zombie and gets exit status
wait4(-1, ...) returns first available zombie child
WNOHANG returns 0 when no zombie children
Orphaned children are reparented to PID 1
wait4 uses WaitQueue to sleep (not busy-poll)
getpid returns current PID
getppid returns parent PID (or 0 for init)
setsid creates new session (sid = pid = pgid)
setpgid changes process group of self or child
Integration test (Tier 3): fork, child exits with status 42, parent waits and reads 42
Unit tests (Tier 1) for pgid/sid logic

Agent Skills

Process lifecycle, zombie reaping, wait queues, parent-child relationships
POSIX process groups, sessions, job control basics, lock ordering

Epic 2 Summary

Task	Name	~Lines	Deps	Parallelizable With
T2.1	Process struct + PID + WaitQueue	250	Epic 1	—
T2.2	Kernel stack + context switch	180	T2.1	T2.5, T2.6
T2.3	Kernel threads + scheduler	270	T2.2	T2.5, T2.6
T2.4	Timer preemption	60	T2.3	T2.5, T2.6
T2.5	SYSCALL/SYSRET + dispatch	240	Epic 1, T0.7	T2.2-T2.4
T2.6	ELF parser + loader	380	Epic 1 (alloc)	T2.2-T2.5
T2.7	User-mode entry + user pointers	150	T2.6, T2.5	—
T2.8	clone() + fork()	200	T2.7, T2.3	—
T2.9	execve()	150	T2.8, T2.6	—
T2.10	exit + wait4 + process groups	260	T2.8	—
Total		~2,140
