gradion-ai
diff --git a/AGENTS.md b/AGENTS.md
Lines changed: 1 addition & 1 deletion
diff --git a/README.md b/README.md
Lines changed: 16 additions & 18 deletions
diff --git a/docs/architecture.md b/docs/architecture.md
Lines changed: 4 additions & 4 deletions
diff --git a/docs/codeexec.md b/docs/codeexec.md
Lines changed: 56 additions & 15 deletions
diff --git a/docs/images/architecture-dark.png b/docs/images/architecture-dark.png
-10.6 KB
diff --git a/docs/images/architecture-light-annotated.jpg b/docs/images/architecture-light-annotated.jpg
-266 KB
diff --git a/docs/images/architecture-light.png b/docs/images/architecture-light.png
-207 KB
diff --git a/docs/index.md b/docs/index.md
Lines changed: 16 additions & 18 deletions
@@ -9,7 +9,7 @@
 - Source modules:
   - `ipybox/code_exec.py`: CodeExecutor, main API
   - `ipybox/kernel_mgr/`: KernelGateway and KernelClient
-  - `ipybox/mcp_server.py`: IpyboxMCPServer
+  - `ipybox/mcp_server.py`: MCPServer
   - `ipybox/utils.py`: shared utilities
 - Tests:
   - `tests/unit/`: unit tests
 
@@ -14,9 +14,13 @@ mcp-name: io.github.gradion-ai/ipybox
     <a href="https://github.com/gradion-ai/ipybox/blob/main/LICENSE"><img alt="GitHub License" src="https://img.shields.io/github/license/gradion-ai/ipybox?color=blueviolet"></a>
 </p>
 
-[ipybox](https://gradion-ai.github.io/ipybox/) is a Python code execution sandbox with first-class support for programmatic MCP tool calling. It generates typed Python tool APIs from MCP server tool schemas, supporting both local stdio and remote HTTP servers.
+[ipybox](https://gradion-ai.github.io/ipybox/) is a unified execution environment for Python code, shell commands, and programmatic MCP tool calls.
 
-Code that calls the generated API executes in a sandboxed IPython kernel. The API delegates MCP tool execution to a separate environment that enforces tool call approval, requiring applications to accept or reject each tool call.
+## Overview
+
+ipybox executes code blocks in a stateful IPython kernel. A code block can contain any combination of Python code, shell commands, and programmatic MCP tool calls. Kernels can be sandboxed with [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime), enforcing filesystem and network restrictions at the OS level.
+
+It generates Python APIs for MCP server tools via [mcpygen](https://gradion-ai.github.io/mcpygen/), and supports application-level approval of individual tool calls and shell commands during code execution. ipybox runs locally on your computer, enabling protected access to your local data and tools.
 
 > [!NOTE]
 > **Next generation ipybox**
@@ -34,14 +38,14 @@ Code that calls the generated API executes in a sandboxed IPython kernel. The AP
 
 | Capability | Description |
 | --- | --- |
-| **Stateful code execution** | State persists across executions in IPython kernels |
-| **Lightweight sandboxing** | Kernel isolation via Anthropic's [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime) |
-| **Programmatic MCP tool calling** | MCP tools called via Python code, not JSON directly |
-| **MCP tool call approval** | Every MCP tool call requires application-level approval |
-| **Python tool API generation** | Functions and models generated from MCP tool schemas |
-| **Any MCP server** | Supports stdio, Streamable HTTP, and SSE transports |
-| **Any Python package** | Install and use any Python package in IPython kernels |
-| **Local code execution** | No cloud dependencies, everything runs on your machine |
+| **Stateful execution** | State persists across executions in IPython kernels |
+| **Unified execution** | Combine Python code, shell commands, and programmatic MCP tool calls in a code block |
+| **Shell command execution** | Run shell commands via `!cmd` syntax, capture output into Python variables |
+| **Programmatic MCP tool calls** | MCP tools called via generated Python APIs ("code mode"), not JSON directly |
+| **Python tool API generation** | Typed functions and Pydantic models generated from MCP tool schemas via [mcpygen](https://gradion-ai.github.io/mcpygen/) |
+| **Application-level approval** | Individual approval of tool calls and shell commands during code execution |
+| **Lightweight sandboxing** | Optional kernel isolation via Anthropic's [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime) |
+| **Local execution** | No cloud dependencies, everything runs locally on your machine |
 
 ## Usage
 
@@ -51,13 +55,7 @@ Code that calls the generated API executes in a sandboxed IPython kernel. The AP
 | **[MCP server](https://gradion-ai.github.io/ipybox/mcpserver/)** | ipybox as MCP server for code actions and programmatic tool calling |
 | **[Claude Code plugin](https://gradion-ai.github.io/ipybox/ccplugin/)** | Plugin that bundles the ipybox MCP server and a code action skill |
 
-## Agent integration
-
-ipybox is designed for agents that act by executing Python code rather than issuing JSON tool calls. This [code action](https://arxiv.org/abs/2402.01030) approach enables tool composition and intermediate result processing in a single inference pass, keeping intermediate results out of the agent's context window.
-
-Code actions are also key for agents to improve themselves and their tool libraries by capturing successful experience as executable knowledge. Agent-generated code cannot be trusted and requires sandboxed execution with application-level approval for every MCP tool call.
-
 > [!TIP]
-> **freeact**
+> **Freeact agent**
 >
-> A code action agent built on ipybox is [freeact](https://github.com/gradion-ai/freeact). In addition to inheriting the [capabilities](#capabilities) of ipybox, it supports progressive loading of tools and [agent skills](https://agentskills.io), and can save successful code actions as tools, evolving its own tool library over time.
+> [Freeact](https://github.com/gradion-ai/freeact) is a general-purpose agent built on ipybox.
@@ -1,16 +1,16 @@
 # Architecture
 
-`CodeExecutor` coordinates three components: a Jupyter kernel for stateful Python execution, a tool server for MCP tool dispatch, and an approval channel for application-level tool call control. 
+[`CodeExecutor`][ipybox.CodeExecutor] coordinates three components: an IPython kernel for stateful execution of Python code and shell commands, a tool server for MCP tool dispatch, and an approval channel for application-level approval of tool calls and shell commands.
 
-The application submits code to `CodeExecutor`, which forwards it to an IPython kernel running inside an optional sandbox. When that code calls a generated Python tool API function, the request routes to the tool server, which manages local (stdio) and remote (HTTP) MCP servers.
+The application submits code to `CodeExecutor`, which forwards it to an IPython kernel running inside an optional sandbox. Shell commands use IPython's `!` syntax and mix freely with Python code in a single block. When code calls a [generated](apigen.md) Python tool API function, the request routes to the tool server, which manages local (stdio) MCP servers and connections to remote (HTTP) MCP servers.
 
-Before executing any tool call, the tool server sends an approval request back through `CodeExecutor` to the application, blocking until it accepts or rejects. This separates code execution from tool execution, enforcing independent security boundaries: the kernel is network-isolated from MCP servers, and every tool call passes through the approval layer.
+Before executing any tool call, the tool server sends an approval request back through `CodeExecutor` to the application, blocking until it accepts or rejects. Shell commands go through the same approval channel when shell command approval is enabled. MCP tool execution runs outside the kernel sandbox in the tool server. Shell commands execute as kernel subprocesses inside the sandbox when enabled.
 
 !!! info "mcpygen"
 
     The code generation and tool execution infrastructure is provided by [mcpygen](https://gradion-ai.github.io/mcpygen/) and re-exported by ipybox.
 
 <figure markdown>
   ![Architecture](images/architecture-dark.png){ width="100%" }
-  <figcaption><code>CodeExecutor</code> coordinates sandboxed code execution, tool execution, and tool call approval.</figcaption>
+  <figcaption><code>CodeExecutor</code> coordinates sandboxed execution of Python code and shell commands, MCP tool execution, and approval of tool calls and shell commands.</figcaption>
 </figure>
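The blocking approval flow described above can be sketched in plain `asyncio`. This is a minimal illustration, not ipybox's implementation: `ApprovalRequest` here is a hypothetical stand-in with a `decision` future, and `call_tool`, `application`, and `demo` are names invented for the sketch.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ApprovalRequest:
    """Hypothetical stand-in for an approval request (not ipybox's class)."""

    tool_name: str
    tool_args: dict
    decision: asyncio.Future


async def call_tool(channel: asyncio.Queue, tool_name: str, tool_args: dict) -> str:
    """Tool-server side: send a request and block until the application decides."""
    decision = asyncio.get_running_loop().create_future()
    await channel.put(ApprovalRequest(tool_name, tool_args, decision))
    if not await decision:
        raise PermissionError(f"{tool_name} rejected")
    return f"ran {tool_name}"


async def application(channel: asyncio.Queue, allowed: set) -> None:
    """Application side: accept or reject each request it receives."""
    while True:
        request = await channel.get()
        request.decision.set_result(request.tool_name in allowed)


async def demo() -> list:
    channel: asyncio.Queue = asyncio.Queue()
    app = asyncio.create_task(application(channel, allowed={"search"}))
    results = []
    for name in ("search", "shell"):
        try:
            results.append(await call_tool(channel, name, {}))
        except PermissionError as exc:
            results.append(str(exc))
    app.cancel()  # stop the application loop once the demo is done
    return results


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Routing shell commands through the same channel, as the text describes, only requires a different `tool_name`; the application-side logic stays the same.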
@@ -1,30 +1,71 @@
 # Code execution
 
+[`CodeExecutor`][ipybox.CodeExecutor] runs Python code, shell commands, and programmatic MCP tool calls in a stateful IPython kernel through a unified execution interface: all three can be combined in a single code block. Both tool calls and shell commands support application-level approval before execution.
+
 ```python
 --8<-- "examples/codexec.py:imports"
 ```
 
-[`CodeExecutor`][ipybox.CodeExecutor] runs Python code in an IPython kernel where variables and definitions persist across executions.
-
 ## Basic execution
 
-Use `execute()` for non-interactive execution where MCP tool calls, if any, are auto-approved:
+Use `execute()` for non-interactive execution where MCP tool calls and shell commands, if any, are auto-approved:
 
 ```python
 --8<-- "examples/codexec.py:basic_execution"
 ```
 
-For application-level approval control, use `stream()` instead.
+For streaming output or application-level approval control, use `stream()` instead.
 
-## Tool call approval
+## Shell commands
 
-When code calls the [generated Python tool API](apigen.md), ipybox suspends execution and yields an `ApprovalRequest`. You must call `accept()` to continue execution:
+Shell commands use IPython's `!` syntax:
+
+```python
+--8<-- "examples/codexec.py:shell_commands"
+```
+
+`!cmd` runs a shell command and prints its output. `result = !cmd` captures the output as a list of lines. Python variables are interpolated into shell commands via `{variable}` syntax. Shell commands and Python code mix freely in a single code block, for example to install packages with `!pip install` and use them immediately.
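For readers unfamiliar with IPython's shell syntax, the capture and interpolation behaviour can be approximated in plain Python. This is a rough analogy only, not how IPython implements `!cmd`; `run_lines` is a hypothetical helper, and an f-string stands in for `{variable}` interpolation.

```python
import subprocess


def run_lines(command: str) -> list[str]:
    """Run a shell command and return its stdout as a list of lines."""
    completed = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return completed.stdout.splitlines()


message = "hello"
# Roughly analogous to `lines = !echo {message}` in an IPython cell.
lines = run_lines(f"echo {message}")
print(lines)
```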
+
+## Tool calls
+
+ipybox can [generate typed Python APIs](apigen.md) from MCP server tool schemas. The generated code executes within the kernel, while MCP servers run on a separate [tool server](architecture.md).
+
+## Approval
+
+### Tool calls
+
+When code calls a generated tool API, ipybox suspends execution and yields an `ApprovalRequest`. Call `accept()` to continue:
 
 ```python
 --8<-- "examples/codexec.py:basic_approval"
 ```
 
-The approval request includes `tool_name` and `tool_args` so you can inspect what's being called. Calling `reject()` raises a [`CodeExecutionError`][ipybox.CodeExecutionError].
+`ApprovalRequest` includes `tool_name` and `tool_args` for inspection. Calling `reject()` raises a [`CodeExecutionError`][ipybox.CodeExecutionError] containing an `ApprovalRejectedError` traceback from the kernel.
+
+`approve_tool_calls` (default `True`) is set explicitly in the example above. Set it to `False` to skip approval and execute tool calls directly when using `stream()`. The `execute()` method always auto-approves tool calls regardless of this setting.
+
+### Shell commands
+
+Enable `approve_shell_cmds=True` to require application-level approval for shell commands:
+
+```python
+--8<-- "examples/codexec.py:shell_approval"
+```
+
+Each `!cmd` triggers an `ApprovalRequest` with `tool_name="shell"` and `tool_args={"cmd": "..."}`, using the same approval interface as tool calls. Variable interpolation happens before the approval request, so the application sees the fully expanded command.
+
+#### Preventing bypass
+
+Code could bypass shell command approval through various process-creation APIs (`subprocess`, `os.system()`, `os.exec*()`, `os.spawn*()`, `os.posix_spawn()`, `pty.spawn()`). Set `require_shell_escape=True` to guard these, forcing all shell execution through the `!` syntax where it triggers the approval flow:
+
+```python
+--8<-- "examples/codexec.py:subprocess_blocking"
+```
+
+With `require_shell_escape=True`, direct process-creation calls raise a `RuntimeError`. Shell commands via `!cmd` still work and go through the approval channel. This option requires `approve_shell_cmds=True`.
+
+!!! note
+    These guards operate at the Python level and close the most obvious gaps. They catch accidental bypass (e.g., an LLM agent reaching for `subprocess.run`) but are not a security boundary: code running in the kernel can undo guards, call C functions via `ctypes`, or use CPython internal modules. These bypasses can be prevented at the OS level. A future version will add [sandbox](sandbox.md)-level enforcement for shell command approval.
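To make the limitation in the note concrete, here is a minimal sketch of a Python-level guard and how code in the same interpreter can undo it. It assumes a guard implemented by replacing `os.system`; the names `_guarded_system` and `_original_system` are illustrative, and ipybox's actual guard mechanism may differ.

```python
import os

# Keep a reference to the real function so the sketch can restore it.
_original_system = os.system


def _guarded_system(command):
    """Illustrative guard: refuse direct process creation."""
    raise RuntimeError("direct process creation is blocked; use !cmd instead")


# Install the guard by rebinding the module attribute.
os.system = _guarded_system

try:
    os.system("echo hi")  # never reaches the shell; the guard raises first
    blocked = False
except RuntimeError:
    blocked = True

# Any code in the same interpreter can rebind the attribute again,
# which is why such guards are not a security boundary.
os.system = _original_system
restored = os.system is _original_system

print(blocked, restored)
```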
 
 ## Stream output chunks
 
@@ -34,11 +75,11 @@ Enable `chunks=True` to receive output incrementally as it's produced:
 --8<-- "examples/codexec.py:basic_chunks"
 ```
 
-[`CodeExecutionChunk`][ipybox.CodeExecutionChunk] events contain partial output. The final [`CodeExecutionResult`][ipybox.CodeExecutionResult] still contains the complete output.
+[`CodeExecutionChunk`][ipybox.CodeExecutionChunk] events contain partial output. The final [`CodeExecutionResult`][ipybox.CodeExecutionResult] contains the complete, aggregated output.
 
 ## Capturing plots
 
-Plots are automatically captured as PNG files in the `images` directory. Use `images_dir` to customize the location:
+Plots are automatically captured as PNG files. Set `images_dir` to specify the output directory:
 
 ```python
 --8<-- "examples/codexec.py:basic_plotting"
@@ -59,7 +100,7 @@ Configure approval and execution timeouts:
 
 ## Kernel environment
 
-The IPython kernel does not inherit environment variables from the parent process. You can pass them explicitly with `kernel_env`:
+The IPython kernel does not inherit environment variables from the parent process. Pass them with `kernel_env`:
 
 ```python
 --8<-- "examples/codexec.py:kernel_environment"
@@ -71,19 +112,19 @@ The IPython kernel does not inherit environment variables from the parent proces
 
 ## Kernel reset
 
-Clear all variables and definitions by resetting the IPython kernel with `reset()`:
+`reset()` clears all variables and definitions:
 
 ```python
 --8<-- "examples/codexec.py:kernel_reset"
 ```
 
 This also stops any MCP servers started during execution. They restart lazily on their next tool call.
 
-## Working directory
+## Resetting working directory
 
-If `working_dir` is set, the kernel starts there and ipybox restores that
-directory after each execution. When a reset happens, ipybox prints a message
-in the cell output.
+If `working_dir` is set, the kernel starts in that directory, and ipybox changes
+back to it after each execution if the code switched to a different directory.
+When a reset happens, ipybox prints a message in the cell output.
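The reset behaviour can be pictured with a small standalone sketch. It assumes the check is a simple comparison of the current directory against `working_dir` after the code has run; `run_and_restore` is a hypothetical helper, not part of ipybox.

```python
import os
import tempfile


def run_and_restore(working_dir: str, code) -> bool:
    """Run code(), then change back to working_dir if code left it."""
    code()
    moved = os.path.realpath(os.getcwd()) != os.path.realpath(working_dir)
    if moved:
        os.chdir(working_dir)
        print(f"working directory reset to {working_dir}")
    return moved


original_cwd = os.getcwd()
with tempfile.TemporaryDirectory() as base:
    work = os.path.join(base, "work")
    other = os.path.join(base, "other")
    os.mkdir(work)
    os.mkdir(other)
    os.chdir(work)

    # Code that changes directory triggers a reset.
    was_reset = run_and_restore(work, lambda: os.chdir(other))
    back_in_work = os.path.realpath(os.getcwd()) == os.path.realpath(work)

    # Code that stays put does not.
    untouched = run_and_restore(work, lambda: None)

    os.chdir(original_cwd)  # leave the temp dir before it is removed
```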
 
 ```python
 --8<-- "examples/codexec.py:working_directory_reset"
 
@@ -2,22 +2,26 @@
 
 # ipybox
 
-ipybox is a Python code execution sandbox with first-class support for programmatic MCP tool calling. It generates typed Python tool APIs from MCP server tool schemas, supporting both local stdio and remote HTTP servers. 
+Unified execution environment for Python code, shell commands, and programmatic MCP tool calls.
 
-Code that calls the generated API executes in a sandboxed IPython kernel. The API delegates MCP tool execution to a separate environment that enforces tool call approval, requiring applications to accept or reject each tool call.
+## Overview
+
+ipybox executes code blocks in a stateful IPython kernel. A code block can contain any combination of Python code, shell commands, and programmatic MCP tool calls. Kernels can be sandboxed with [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime), enforcing filesystem and network restrictions at the OS level.
+
+It generates Python APIs for MCP server tools via [mcpygen](https://gradion-ai.github.io/mcpygen/), and supports application-level approval of individual tool calls and shell commands during code execution. ipybox runs locally on your computer, enabling protected access to your local data and tools.
 
 ## Capabilities
 
 | Capability | Description |
 | --- | --- |
-| **Stateful code execution** | State persists across executions in IPython kernels |
-| **Lightweight sandboxing** | Kernel isolation via Anthropic's [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime) |
-| **Programmatic MCP tool calling** | MCP tools called via Python code, not JSON directly |
-| **MCP tool call approval** | Every MCP tool call requires application-level approval |
-| **Python tool API generation** | Functions and models generated from MCP tool schemas |
-| **Any MCP server** | Supports stdio, Streamable HTTP, and SSE transports |
-| **Any Python package** | Install and use any Python package in IPython kernels |
-| **Local code execution** | No cloud dependencies, everything runs on your machine |
+| **Stateful execution** | Definitions and variables persist across executions in IPython kernels |
+| **Unified execution** | Combine Python code, shell commands, and programmatic MCP tool calls in a code block |
+| **Shell command execution** | Run shell commands via `!cmd` syntax, capture output into Python variables |
+| **Programmatic MCP tool calls** | MCP tools called via generated Python APIs ("code mode"), not JSON directly |
+| **Python tool API generation** | Typed functions and Pydantic models generated from MCP tool schemas via [mcpygen](https://gradion-ai.github.io/mcpygen/) |
+| **Application-level approval** | Individual approval of tool calls and shell commands during code execution |
+| **Lightweight sandboxing** | Optional kernel isolation via Anthropic's [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime) |
+| **Local execution** | No cloud dependencies, everything runs locally on your machine |
 
 ## Usage
 
@@ -27,12 +31,6 @@ Code that calls the generated API executes in a sandboxed IPython kernel. The AP
 | **[MCP server](mcpserver.md)** | ipybox as MCP server for code actions and programmatic tool calling |
 | **[Claude Code plugin](ccplugin.md)** | Plugin that bundles the ipybox MCP server and a code action skill |
 
-## Agent integration
-
-ipybox is designed for agents that act by executing Python code rather than issuing JSON tool calls. This [code action](https://arxiv.org/abs/2402.01030) approach enables tool composition and intermediate result processing in a single inference pass, keeping intermediate results out of the agent's context window.
-
-Code actions are also key for agents to improve themselves and their tool libraries by capturing successful experience as executable knowledge. Agent-generated code cannot be trusted and requires sandboxed execution with application-level approval for every MCP tool call.
-
-!!! tip "freeact"
+!!! tip "Freeact agent"
 
-    [Freeact](https://gradion-ai.github.io/freeact/) is an agent harness and CLI tool built on ipybox.
+    [Freeact](https://gradion-ai.github.io/freeact/) is a general-purpose agent built on ipybox.