Commit bfb3211

krasserm and claude authored
Add shell command interception and approval support (#51)
* Add shell command interception and approval support

  Add shell_cmd_handler parameter to KernelClient and CodeExecutor for intercepting IPython ! and !! shell commands. Add approve_shell_cmds flag that routes shell commands through the ToolServer approval flow. Traceback output is rewritten to show !cmd syntax instead of internal get_ipython().system() calls, and handler frames are filtered out.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Add configurable blocking of direct subprocess/os.system calls

  Add block_direct_shell parameter to KernelClient and CodeExecutor. When enabled, patches subprocess.Popen and os.system in the kernel to raise RuntimeError unless called from the ! shell handler. Uses a ContextVar to track the guard state.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Refactor kernel init into dedicated module and add tests

  - Extract kernel init code generation to ipybox/kernel_mgr/init.py with triple-quoted string constants and build_init_code() function
  - Simplify KernelClient._init_kernel to thin wrapper over build_init_code
  - Remove shell_cmd_handler param; hardcode approval handler in init module
  - Rename block_direct_shell to require_shell_escape
  - Fix docstrings to use Markdown backticks instead of rST double backticks
  - Add unit tests for build_init_code and _rewrite_traceback
  - Add integration tests for approve_shell_cmds and require_shell_escape

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Add approve_tool_calls parameter to CodeExecutor

  Exposes ToolServer's approval_required as a CodeExecutor constructor arg (default True), allowing callers to disable MCP tool call approval.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Rewrite documentation to focus on unified execution model

  Shift narrative from "Python code execution sandbox with MCP tool calling" to "unified execution environment for Python code, shell commands and programmatic MCP tool calls." Update intro, capabilities, quickstart, architecture, and code execution docs to cover shell command execution, shell command approval, and preventing approval bypass. Add example snippets for new features.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Remove unused examples/intercept.py

  Functionality is covered by snippets in codexec.py and quickstart.py.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Refine documentation wording and structure

  Replace "unified execution model" with "unified execution interface" across all files. Restructure codeexec.md approval sections under a single heading with tool calls and shell commands as subsections. Add tool calls section before approval. Update internal docs for new modules and shell command flow. Align README, mkdocs.yml, and server.json with index.md intro.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Fix documentation inconsistencies with codebase

  - CLAUDE.md: IpyboxMCPServer → MCPServer (matches actual class name)
  - mcpserver.md: execute_ipython_cell timeout default "no timeout" → 120
  - mcpserver.md: Brave Search npm package and GitHub URL to match quickstart
  - mcpserver.md: grammar fix ("use" → "uses")
  - mcp_server.py: {API_KEY} → ${API_KEY} in docstring

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Refocus documentation on unified execution model

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Upgrade mcpygen to 0.1.4 and switch from local editable to registry

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Fix format_map KeyError, add shell rejection test, extend subprocess guards

  - Use _ipybox_safe_dict (dict subclass with __missing__) so shell commands with {undefined_var} show the literal placeholder in the approval request instead of raising KeyError
  - Add integration test for shell command rejection (reject() flow)
  - Add integration test for undefined variable in shell commands
  - Extend subprocess guard to os.exec*, os.spawn*, os.posix_spawn*, pty.spawn
  - Add ValueError when require_shell_escape=True without approve_shell_cmds
  - Update docs and docstrings to reflect extended guard coverage

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
1 parent 2ea848f commit bfb3211
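
Illustrative sketch (not part of the commit): the last item in the commit message describes a `dict` subclass with `__missing__`, so that `format_map` leaves `{undefined_var}` in place instead of raising `KeyError`. The snippet below shows that general technique. `SafeDict` and `expand` are hypothetical names chosen for illustration; the actual `_ipybox_safe_dict` implementation is not shown on this page and may differ.

```python
class SafeDict(dict):
    """Hypothetical stand-in: unknown placeholders are left untouched."""

    def __missing__(self, key: str) -> str:
        # str.format_map() looks keys up via __getitem__, which falls back
        # to __missing__ for absent keys instead of raising KeyError.
        return "{" + key + "}"


def expand(cmd: str, namespace: dict) -> str:
    """Interpolate known variables into a shell command string."""
    return cmd.format_map(SafeDict(namespace))


expanded = expand("echo {greeting} {undefined_var}", {"greeting": "hello"})
print(expanded)
```

With a plain `dict`, the same `format_map` call would raise `KeyError: 'undefined_var'`.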

27 files changed

Lines changed: 979 additions & 203 deletions

AGENTS.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 - Source modules:
 - `ipybox/code_exec.py`: CodeExecutor, main API
 - `ipybox/kernel_mgr/`: KernelGateway and KernelClient
-- `ipybox/mcp_server.py`: IpyboxMCPServer
+- `ipybox/mcp_server.py`: MCPServer
 - `ipybox/utils.py`: shared utilities
 - Tests:
 - `tests/unit/`: unit tests

README.md

Lines changed: 16 additions & 18 deletions
@@ -14,9 +14,13 @@ mcp-name: io.github.gradion-ai/ipybox
 <a href="https://github.com/gradion-ai/ipybox/blob/main/LICENSE"><img alt="GitHub License" src="https://img.shields.io/github/license/gradion-ai/ipybox?color=blueviolet"></a>
 </p>

-[ipybox](https://gradion-ai.github.io/ipybox/) is a Python code execution sandbox with first-class support for programmatic MCP tool calling. It generates typed Python tool APIs from MCP server tool schemas, supporting both local stdio and remote HTTP servers.
+[ipybox](https://gradion-ai.github.io/ipybox/) is a unified execution environment for Python code, shell commands, and programmatic MCP tool calls.

-Code that calls the generated API executes in a sandboxed IPython kernel. The API delegates MCP tool execution to a separate environment that enforces tool call approval, requiring applications to accept or reject each tool call.
+## Overview
+
+ipybox executes code blocks in a stateful IPython kernel. A code block can contain any combination of Python code, shell commands, and programmatic MCP tool calls. Kernels can be sandboxed with [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime), enforcing filesystem and network restrictions at OS level.
+
+It generates Python APIs for MCP server tools via [mcpygen](https://gradion-ai.github.io/mcpygen/), and supports application-level approval of individual tool calls and shell commands during code execution. ipybox runs locally on your computer, enabling protected access to your local data and tools.

 > [!NOTE]
 > **Next generation ipybox**
@@ -34,14 +38,14 @@ Code that calls the generated API executes in a sandboxed IPython kernel. The AP

 | Capability | Description |
 | --- | --- |
-| **Stateful code execution** | State persists across executions in IPython kernels |
-| **Lightweight sandboxing** | Kernel isolation via Anthropic's [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime) |
-| **Programmatic MCP tool calling** | MCP tools called via Python code, not JSON directly |
-| **MCP tool call approval** | Every MCP tool call requires application-level approval |
-| **Python tool API generation** | Functions and models generated from MCP tool schemas |
-| **Any MCP server** | Supports stdio, Streamable HTTP, and SSE transports |
-| **Any Python package** | Install and use any Python package in IPython kernels |
-| **Local code execution** | No cloud dependencies, everything runs on your machine |
+| **Stateful execution** | State persists across executions in IPython kernels |
+| **Unified execution** | Combine Python code, shell commands, and programmatic MCP tool calls in a code block |
+| **Shell command execution** | Run shell commands via `!cmd` syntax, capture output into Python variables |
+| **Programmatic MCP tool calls** | MCP tools called via generated Python API ("code mode"), not JSON directly |
+| **Python tool API generation** | Typed functions and Pydantic models generated from MCP tool schemas via [mcpygen](https://gradion-ai.github.io/mcpygen/) |
+| **Application-level approval** | Individual approval of tool calls and shell commands during code execution |
+| **Lightweight sandboxing** | Optional kernel isolation via Anthropic's [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime) |
+| **Local execution** | No cloud dependencies, everything runs locally on your machine |

 ## Usage

@@ -51,13 +55,7 @@ Code that calls the generated API executes in a sandboxed IPython kernel. The AP
 | **[MCP server](https://gradion-ai.github.io/ipybox/mcpserver/)** | ipybox as MCP server for code actions and programmatic tool calling |
 | **[Claude Code plugin](https://gradion-ai.github.io/ipybox/ccplugin/)** | Plugin that bundles the ipybox MCP server and a code action skill |

-## Agent integration
-
-ipybox is designed for agents that act by executing Python code rather than issuing JSON tool calls. This [code action](https://arxiv.org/abs/2402.01030) approach enables tool composition and intermediate result processing in a single inference pass, keeping intermediate results out of the agent's context window.
-
-Code actions are also key for agents to improve themselves and their tool libraries by capturing successful experience as executable knowledge. Agent-generated code cannot be trusted and requires sandboxed execution with application-level approval for every MCP tool call.
-
 > [!TIP]
-> **freeact**
+> **Freeact agent**
 >
-> A code action agent built on ipybox is [freeact](https://github.com/gradion-ai/freeact). In addition to inheriting the [capabilities](#capabilities) of ipybox, it supports progressive loading of tools and [agent skills](https://agentskills.io), and can save successful code actions as tools, evolving its own tool library over time.
+> [Freeact](https://github.com/gradion-ai/freeact) is a general-purpose agent built on ipybox.

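Illustrative sketch (not part of the commit): the capabilities table mentions capturing shell output into Python variables via `!cmd`. In IPython, `lines = !cmd` yields a list-like object of output lines. A rough plain-Python approximation of that behaviour, using only the standard library, might look like the following. `capture_lines` is a hypothetical helper for illustration and is not part of ipybox or IPython.

```python
import subprocess
import sys


def capture_lines(cmd: str) -> list:
    """Run cmd through the shell and return its stdout as a list of lines."""
    completed = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return completed.stdout.splitlines()


name = "world"  # Python variable interpolated into the command string
cmd = f'"{sys.executable}" -c "print(\'hello\'); print(\'{name}\')"'
lines = capture_lines(cmd)
print(lines)
```

The interpolation here is an ordinary f-string; inside an ipybox code block the `{variable}` expansion is done by the kernel before the command runs.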
docs/architecture.md

Lines changed: 4 additions & 4 deletions
@@ -1,16 +1,16 @@
 # Architecture

-`CodeExecutor` coordinates three components: a Jupyter kernel for stateful Python execution, a tool server for MCP tool dispatch, and an approval channel for application-level tool call control.
+[`CodeExecutor`][ipybox.CodeExecutor] coordinates three components: an IPython kernel for stateful execution of Python code and shell commands, a tool server for MCP tool dispatch, and an approval channel for application-level approval of tool calls and shell commands.

-The application submits code to `CodeExecutor`, which forwards it to an IPython kernel running inside an optional sandbox. When that code calls a generated Python tool API function, the request routes to the tool server, which manages local (stdio) and remote (HTTP) MCP servers.
+The application submits code to `CodeExecutor`, which forwards it to an IPython kernel running inside an optional sandbox. Shell commands use IPython's `!` syntax and mix freely with Python code in a single block. When code calls a [generated](apigen.md) Python tool API function, the request routes to the tool server, which manages local (stdio) MCP servers and connections to remote (HTTP) MCP servers.

-Before executing any tool call, the tool server sends an approval request back through `CodeExecutor` to the application, blocking until it accepts or rejects. This separates code execution from tool execution, enforcing independent security boundaries: the kernel is network-isolated from MCP servers, and every tool call passes through the approval layer.
+Before executing any tool call, the tool server sends an approval request back through `CodeExecutor` to the application, blocking until it accepts or rejects. Shell commands go through the same approval channel when shell command approval is enabled. MCP tool execution runs outside the kernel sandbox in the tool server. Shell commands execute as kernel subprocesses inside the sandbox when enabled.

 !!! info "mcpygen"

 The code generation and tool execution infrastructure is provided by [mcpygen](https://gradion-ai.github.io/mcpygen/) and re-exported by ipybox.

 <figure markdown>
 ![Architecture](images/architecture-dark.png){ width="100%" }
-<figcaption><code>CodeExecutor</code> coordinates sandboxed code execution, tool execution, and tool call approval.</figcaption>
+<figcaption><code>CodeExecutor</code> coordinates sandboxed execution of Python code and shell commands, MCP tool execution, and approval of tool calls and shell commands.</figcaption>
 </figure>

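Illustrative sketch (not part of the commit): the architecture text describes an approval channel in which execution blocks until the application accepts or rejects a request. The toy model below mimics that control flow with `asyncio` only. `ToyApprovalRequest` and `toy_stream` are hypothetical stand-ins; they borrow the `tool_name`, `tool_args`, `accept()` and `reject()` names from the docs but say nothing about how ipybox implements them.

```python
import asyncio


class ToyApprovalRequest:
    """Hypothetical request object carrying a pending accept/reject decision."""

    def __init__(self, tool_name: str, tool_args: dict):
        self.tool_name = tool_name
        self.tool_args = tool_args
        self._decision = asyncio.get_running_loop().create_future()

    def accept(self) -> None:
        self._decision.set_result(True)

    def reject(self) -> None:
        self._decision.set_result(False)

    async def wait(self) -> bool:
        return await self._decision


async def toy_stream(calls):
    """Yield one approval request per call, then the list of executed calls."""
    executed = []
    for tool_name, tool_args in calls:
        request = ToyApprovalRequest(tool_name, tool_args)
        yield request  # hand control to the application
        if not await request.wait():  # blocks until accept() or reject()
            raise PermissionError(f"{tool_name} rejected")
        executed.append(tool_name)
    yield executed


async def main():
    calls = [("shell", {"cmd": "ls"}), ("search", {"query": "ipybox"})]
    result = None
    async for item in toy_stream(calls):
        if isinstance(item, ToyApprovalRequest):
            item.accept()  # the application decides here
        else:
            result = item
    return result


result = asyncio.run(main())
print(result)
```

In this toy version the consumer must call `accept()` or `reject()` before requesting the next item, otherwise the generator waits forever; a real implementation would typically pair this with an approval timeout.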
docs/codeexec.md

Lines changed: 56 additions & 15 deletions
@@ -1,30 +1,71 @@
 # Code execution

+[`CodeExecutor`][ipybox.CodeExecutor] runs Python code, shell commands, and programmatic MCP tool calls in a stateful IPython kernel through a unified execution interface: all three can be combined in a single code block. Both tool calls and shell commands support application-level approval before execution.
+
 ```python
 --8<-- "examples/codexec.py:imports"
 ```

-[`CodeExecutor`][ipybox.CodeExecutor] runs Python code in an IPython kernel where variables and definitions persist across executions.
-
 ## Basic execution

-Use `execute()` for non-interactive execution where MCP tool calls, if any, are auto-approved:
+Use `execute()` for non-interactive execution where MCP tool calls and shell commands, if any, are auto-approved:

 ```python
 --8<-- "examples/codexec.py:basic_execution"
 ```

-For application-level approval control, use `stream()` instead.
+For streaming output or application-level approval control, use `stream()` instead.

-## Tool call approval
+## Shell commands

-When code calls the [generated Python tool API](apigen.md), ipybox suspends execution and yields an `ApprovalRequest`. You must call `accept()` to continue execution:
+Shell commands use IPython's `!` syntax:
+
+```python
+--8<-- "examples/codexec.py:shell_commands"
+```
+
+`!cmd` runs a shell command and prints its output. `result = !cmd` captures the output as a list of lines. Python variables are interpolated into shell commands via `{variable}` syntax. Shell commands and Python code mix freely in a single code block, for example to install packages with `!pip install` and use them immediately.
+
+## Tool calls
+
+ipybox can [generate typed Python APIs](apigen.md) from MCP server tool schemas. The generated code executes within the kernel, while MCP servers run on a separate [tool server](architecture.md).
+
+## Approval
+
+### Tool calls
+
+When code calls a generated tool API, ipybox suspends execution and yields an `ApprovalRequest`. Call `accept()` to continue:

 ```python
 --8<-- "examples/codexec.py:basic_approval"
 ```

-The approval request includes `tool_name` and `tool_args` so you can inspect what's being called. Calling `reject()` raises a [`CodeExecutionError`][ipybox.CodeExecutionError].
+`ApprovalRequest` includes `tool_name` and `tool_args` for inspection. Calling `reject()` raises a [`CodeExecutionError`][ipybox.CodeExecutionError] containing an `ApprovalRejectedError` traceback from the kernel.
+
+`approve_tool_calls` (default `True`) is set explicitly in the example above. Set it to `False` to skip approval and execute tool calls directly when using `stream()`. The `execute()` method always auto-approves tool calls regardless of this setting.
+
+### Shell commands
+
+Enable `approve_shell_cmds=True` to require application-level approval for shell commands:
+
+```python
+--8<-- "examples/codexec.py:shell_approval"
+```
+
+Each `!cmd` triggers an `ApprovalRequest` with `tool_name="shell"` and `tool_args={"cmd": "..."}`, using the same approval interface as tool calls. Variable interpolation happens before the approval request, so the application sees the fully expanded command.
+
+#### Preventing bypass
+
+Code could bypass shell command approval through various process-creation APIs (`subprocess`, `os.system()`, `os.exec*()`, `os.spawn*()`, `os.posix_spawn()`, `pty.spawn()`). Set `require_shell_escape=True` to guard these, forcing all shell execution through the `!` syntax where it triggers the approval flow:
+
+```python
+--8<-- "examples/codexec.py:subprocess_blocking"
+```
+
+With `require_shell_escape=True`, direct process-creation calls raise a `RuntimeError`. Shell commands via `!cmd` still work and go through the approval channel. Requires `approve_shell_cmds=True`.
+
+!!! note
+These guards are Python-level guards that close the most obvious gaps. They catch accidental bypass (e.g., an LLM agent reaching for `subprocess.run`) but are not a security boundary: code running in the kernel can undo guards, call C functions via `ctypes`, or use CPython internal modules. These bypasses can be prevented at the OS level. A future version will add [sandbox](sandbox.md)-level enforcement for shell command approval.

 ## Stream output chunks

@@ -34,11 +75,11 @@ Enable `chunks=True` to receive output incrementally as it's produced:
 --8<-- "examples/codexec.py:basic_chunks"
 ```

-[`CodeExecutionChunk`][ipybox.CodeExecutionChunk] events contain partial output. The final [`CodeExecutionResult`][ipybox.CodeExecutionResult] still contains the complete output.
+[`CodeExecutionChunk`][ipybox.CodeExecutionChunk] events contain partial output. The final [`CodeExecutionResult`][ipybox.CodeExecutionResult] contains the complete, aggregated output.

 ## Capturing plots

-Plots are automatically captured as PNG files in the `images` directory. Use `images_dir` to customize the location:
+Plots are automatically captured as PNG files. Set `images_dir` to specify the output directory:

 ```python
 --8<-- "examples/codexec.py:basic_plotting"
@@ -59,7 +100,7 @@ Configure approval and execution timeouts:

 ## Kernel environment

-The IPython kernel does not inherit environment variables from the parent process. You can pass them explicitly with `kernel_env`:
+The IPython kernel does not inherit environment variables from the parent process. Pass them with `kernel_env`:

 ```python
 --8<-- "examples/codexec.py:kernel_environment"
@@ -71,19 +112,19 @@ The IPython kernel does not inherit environment variables from the parent proces

 ## Kernel reset

-Clear all variables and definitions by resetting the IPython kernel with `reset()`:
+`reset()` clears all variables and definitions:

 ```python
 --8<-- "examples/codexec.py:kernel_reset"
 ```

 This also stops any MCP servers started during execution. They restart lazily on their next tool call.

-## Working directory
+## Resetting working directory

-If `working_dir` is set, the kernel starts there and ipybox restores that
-directory after each execution. When a reset happens, ipybox prints a message
-in the cell output.
+If `working_dir` is set, the kernel starts in that directory and ipybox resets
+it back after each execution if code changed it. When a reset happens, ipybox
+prints a message in the cell output.

 ```python
 --8<-- "examples/codexec.py:working_directory_reset"
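
Illustrative sketch (not part of the commit): the last hunk describes resetting the working directory after an execution if code changed it, and reporting the reset. A minimal standard-library model of that behaviour is shown below. `run_and_restore_cwd` and the message wording are hypothetical; ipybox's actual message and mechanism are not shown on this page.

```python
import os
import tempfile
from typing import Callable, Optional


def run_and_restore_cwd(code_fn: Callable[[], None], working_dir: str) -> Optional[str]:
    """Run code_fn, then reset the working directory if code_fn changed it."""
    code_fn()
    if os.path.realpath(os.getcwd()) != os.path.realpath(working_dir):
        os.chdir(working_dir)
        return f"working directory reset to {working_dir}"
    return None  # nothing changed, so no message


start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as base:
    other = os.path.join(base, "other")
    os.mkdir(other)
    os.chdir(base)
    # The "code" changes directory, so a reset and a message are expected.
    message = run_and_restore_cwd(lambda: os.chdir(other), base)
    back_in_base = os.path.realpath(os.getcwd()) == os.path.realpath(base)
    # The "code" leaves the directory alone, so no message is expected.
    no_message = run_and_restore_cwd(lambda: None, base)
    os.chdir(start_dir)  # leave the temp dir before it is deleted
```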

docs/images/architecture-dark.png

-10.6 KB
-266 KB
Binary file not shown.

docs/images/architecture-light.png

-207 KB
Binary file not shown.

docs/index.md

Lines changed: 16 additions & 18 deletions
@@ -2,22 +2,26 @@

 # ipybox

-ipybox is a Python code execution sandbox with first-class support for programmatic MCP tool calling. It generates typed Python tool APIs from MCP server tool schemas, supporting both local stdio and remote HTTP servers.
+Unified execution environment for Python code, shell commands, and programmatic MCP tool calls.

-Code that calls the generated API executes in a sandboxed IPython kernel. The API delegates MCP tool execution to a separate environment that enforces tool call approval, requiring applications to accept or reject each tool call.
+## Overview
+
+ipybox executes code blocks in a stateful IPython kernel. A code block can contain any combination of Python code, shell commands, and programmatic MCP tool calls. Kernels can be sandboxed with [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime), enforcing filesystem and network restrictions at OS level.
+
+It generates Python APIs for MCP server tools via [mcpygen](https://gradion-ai.github.io/mcpygen/), and supports application-level approval of individual tool calls and shell commands during code execution. ipybox runs locally on your computer, enabling protected access to your local data and tools.

 ## Capabilities

 | Capability | Description |
 | --- | --- |
-| **Stateful code execution** | State persists across executions in IPython kernels |
-| **Lightweight sandboxing** | Kernel isolation via Anthropic's [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime) |
-| **Programmatic MCP tool calling** | MCP tools called via Python code, not JSON directly |
-| **MCP tool call approval** | Every MCP tool call requires application-level approval |
-| **Python tool API generation** | Functions and models generated from MCP tool schemas |
-| **Any MCP server** | Supports stdio, Streamable HTTP, and SSE transports |
-| **Any Python package** | Install and use any Python package in IPython kernels |
-| **Local code execution** | No cloud dependencies, everything runs on your machine |
+| **Stateful execution** | Definitions and variables persist across executions in IPython kernels |
+| **Unified execution** | Combine Python code, shell commands, and programmatic MCP tool calls in a code block |
+| **Shell command execution** | Run shell commands via `!cmd` syntax, capture output into Python variables |
+| **Programmatic MCP tool calls** | MCP tools called via generated Python APIs ("code mode"), not JSON directly |
+| **Python tool API generation** | Typed functions and Pydantic models generated from MCP tool schemas via [mcpygen](https://gradion-ai.github.io/mcpygen/) |
+| **Application-level approval** | Individual approval of tool calls and shell commands during code execution |
+| **Lightweight sandboxing** | Optional kernel isolation via Anthropic's [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime) |
+| **Local execution** | No cloud dependencies, everything runs locally on your machine |

 ## Usage

@@ -27,12 +31,6 @@ Code that calls the generated API executes in a sandboxed IPython kernel. The AP
 | **[MCP server](mcpserver.md)** | ipybox as MCP server for code actions and programmatic tool calling |
 | **[Claude Code plugin](ccplugin.md)** | Plugin that bundles the ipybox MCP server and a code action skill |

-## Agent integration
-
-ipybox is designed for agents that act by executing Python code rather than issuing JSON tool calls. This [code action](https://arxiv.org/abs/2402.01030) approach enables tool composition and intermediate result processing in a single inference pass, keeping intermediate results out of the agent's context window.
-
-Code actions are also key for agents to improve themselves and their tool libraries by capturing successful experience as executable knowledge. Agent-generated code cannot be trusted and requires sandboxed execution with application-level approval for every MCP tool call.
-
-!!! tip "freeact"
+!!! tip "Freeact agent"

-[Freeact](https://gradion-ai.github.io/freeact/) is an agent harness and CLI tool built on ipybox.
+[Freeact](https://gradion-ai.github.io/freeact/) is a general-purpose agent built on ipybox.

0 commit comments
